Analyzing textual documents
First Claim
1. A method of processing a first text document in a source language and a second text document in a target language using a computer, each of the documents being divided into segments and stored in memory means of the computer, the segments being further divided into words, the method comprising carrying out the steps of:
a) selecting a first word in the first text document and a second word in the second text document and determining a representation of the probability that the first and second words have substantially the same meaning by taking into account the result of a comparison of the distribution of the first and second words in the segments of the first text document and the second text document respectively; and
b) determining that the first and second words have substantially the same meaning when the representation of the probability that the first and second words have the same meaning is greater than a threshold.
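Steps a) and b) can be illustrated with a minimal sketch. The claim does not name a particular statistic, so the Dice coefficient used here is an assumption chosen for brevity, as are the function names, the default threshold of 0.8, and the premise that segment i of the source document corresponds to segment i of the target document.

```python
from collections import defaultdict


def segment_index(segments):
    """Map each lowercased word to the set of segment numbers containing it."""
    index = defaultdict(set)
    for number, segment in enumerate(segments):
        for word in segment.lower().split():
            index[word].add(number)
    return index


def dice_score(source_positions, target_positions):
    """Dice coefficient of two sets of segment numbers (0.0 to 1.0).

    Assumption: this stands in for the claim's unspecified
    "representation of the probability".
    """
    if not source_positions or not target_positions:
        return 0.0
    shared = len(source_positions & target_positions)
    return 2.0 * shared / (len(source_positions) + len(target_positions))


def same_meaning(source_word, target_word, source_segments, target_segments,
                 threshold=0.8):
    """Step a): score the word pair. Step b): compare the score to the threshold."""
    source_index = segment_index(source_segments)
    target_index = segment_index(target_segments)
    score = dice_score(source_index.get(source_word, set()),
                       target_index.get(target_word, set()))
    return score > threshold, score


# Hypothetical three-segment documents, aligned segment by segment.
source_segments = ["the cat sleeps", "the dog barks", "a cat eats"]
target_segments = ["le chat dort", "le chien aboie", "un chat mange"]

print(same_meaning("cat", "chat", source_segments, target_segments))
print(same_meaning("cat", "chien", source_segments, target_segments))
```

In this toy data "cat" and "chat" occur in exactly the same segments, so the pair clears the threshold, while "cat" and "chien" share no segment and do not. A real system would likely need a statistic that is more robust to frequent function words.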
Abstract
Two text documents, one of which is a (possibly erroneous) translation of the other, are stored in the memory of a text processing system. The distributions and forms of words within the documents are compared, and a statistical process determines which words are likely to have been translated correctly. A list of the words that appear to have been translated inconsistently is compiled.
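One way to picture the compiled list is the sketch below, which flags every source word whose best-matching target word still scores at or below a threshold. It rests on several assumptions not stated in the abstract: segments are aligned one-to-one, the score is a Dice coefficient over segment occurrences, word forms are ignored, and the names `occurrences` and `suspect_words` are hypothetical.

```python
def occurrences(segments):
    """Map each lowercased word to the set of segment numbers where it appears."""
    table = {}
    for number, segment in enumerate(segments):
        for word in segment.lower().split():
            table.setdefault(word, set()).add(number)
    return table


def suspect_words(source_segments, target_segments, threshold=0.9):
    """Return (source_word, best_target_word, best_score) for weakly matched words."""
    source_table = occurrences(source_segments)
    target_table = occurrences(target_segments)
    suspects = []
    for source_word, source_set in sorted(source_table.items()):
        best_word, best_score = None, 0.0
        for target_word, target_set in sorted(target_table.items()):
            overlap = len(source_set & target_set)
            # Dice coefficient: an assumed stand-in for the statistical process.
            score = 2.0 * overlap / (len(source_set) + len(target_set))
            if score > best_score:
                best_word, best_score = target_word, score
        if best_score <= threshold:
            suspects.append((source_word, best_word, best_score))
    return suspects


# Hypothetical data: "valve" is rendered as "clapet" twice and "vanne" once.
source_segments = ["the valve opens", "close the valve", "the valve leaks"]
target_segments = ["le clapet ouvre", "fermer la vanne", "le clapet fuit"]

for entry in suspect_words(source_segments, target_segments):
    print(entry)
```

Here "valve" is flagged because no single target word follows it through all three segments. The article "the" is flagged as well, which shows why a practical system would need to treat very frequent words separately.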
35 Citations
5 Claims
Specification