FUZZY MATCHING AND SCORING BASED ON DIRECT ALIGNMENT
Abstract
Various embodiments provide a translation memory system that utilizes sentence-level fuzzy matching and a scoring algorithm based on direct alignment. In one or more embodiments, a fuzzy match scoring formula includes use of an edit operation definition to define various deductions that are computed as part of an overall score, an overall scoring algorithm, and word-level scoring and partial match definitions. A direct alignment algorithm finds a computed alignment between two sentences using a pair-wise difference matrix associated with a primary sentence and a comparison sentence. An overall algorithm identifies editing operations such as replacements, position swaps and adjustments for a final score calculation. Once final scores are calculated between the primary sentence and multiple comparison sentences, a primary/comparison sentence pair can be selected, based on the score, to serve as a basis for translating the primary sentence.
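The scoring flow summarized above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the penalty weights `CHANGE_PENALTY` and `MOVE_PENALTY`, the function names `fuzzy_score` and `best_match`, whitespace tokenization, and the 0 to 100 scale are all hypothetical. Python's `difflib.SequenceMatcher` stands in for the direct alignment algorithm, which this section does not spell out.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical deduction weights; the source does not state actual values.
CHANGE_PENALTY = 1.0
MOVE_PENALTY = 0.5


def fuzzy_score(primary: str, comparison: str) -> float:
    """Score two sentences on a 0-100 scale by deducting for changes and moves."""
    p, c = primary.lower().split(), comparison.lower().split()
    # Stand-in for direct alignment: matching blocks act as anchor points.
    blocks = SequenceMatcher(None, p, c, autojunk=False).get_matching_blocks()
    p_rest, c_rest = Counter(p), Counter(c)
    for block in blocks:
        for token in p[block.a:block.a + block.size]:
            p_rest[token] -= 1  # anchored tokens are not edits
            c_rest[token] -= 1
    # Unanchored tokens present on both sides are treated as moves.
    moves = sum((p_rest & c_rest).values())
    # Remaining unanchored tokens are treated as changes.
    changes = max(sum(p_rest.values()), sum(c_rest.values())) - moves
    deduction = CHANGE_PENALTY * changes + MOVE_PENALTY * moves
    return max(0.0, 100.0 * (1 - deduction / max(len(p), len(c), 1)))


def best_match(primary: str, candidates: list[str]) -> tuple[str, float]:
    """Pick the comparison sentence with the highest score."""
    scored = [(cand, fuzzy_score(primary, cand)) for cand in candidates]
    return max(scored, key=lambda pair: pair[1])
```

For example, `best_match("a b c d", ["a b x d", "b c d a"])` prefers the reordered sentence, because under these assumed weights a move costs less than a replacement.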
20 Claims
1. A computer-implemented method comprising:
conducting a direct alignment operation to ascertain anchor points between a primary sentence for which a translation is sought and a comparison sentence for which a translation exists;
finding changes between anchor points in the primary and comparison sentences;
finding moves between the primary and comparison sentences, wherein changes and moves constitute deductions that are utilized to calculate a final score for the primary and comparison sentences; and
using the changes and the moves to calculate an overall score for the primary and comparison sentences.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. One or more computer readable storage media embodying computer readable instructions which, when executed, implement a method comprising:
building a two-dimensional array that is to serve as a basis for token-to-token comparison between a primary sentence for which a translation is sought and a comparison sentence for which a translation exists, wherein the two-dimensional array includes individual values associated with matches between tokens of the primary and comparison sentences; and
finding a path through the array with a least token-to-token comparison, wherein said path defines one or more anchor points between tokens of the primary and comparison sentences.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
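The array-and-path idea in claim 11 can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed method: the array holds 0 for identical tokens and 1 otherwise, the least-cost path is found with a standard edit-distance style dynamic program (a swapped-in technique), and the names `difference_matrix` and `anchor_points` are hypothetical.

```python
def difference_matrix(primary: list[str], comparison: list[str]) -> list[list[int]]:
    """Two-dimensional array: 0 where tokens match, 1 where they differ (assumed encoding)."""
    return [[0 if p == c else 1 for c in comparison] for p in primary]


def anchor_points(primary: list[str], comparison: list[str]) -> list[tuple[int, int]]:
    """Return (primary_index, comparison_index) pairs on a least-cost path."""
    diff = difference_matrix(primary, comparison)
    m, n = len(primary), len(comparison)
    # cost[i][j]: least cost of aligning the first i and first j tokens.
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        cost[i][0] = i
    for j in range(1, n + 1):
        cost[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + diff[i - 1][j - 1],  # pair the two tokens
                cost[i - 1][j] + 1,                        # skip a primary token
                cost[i][j - 1] + 1,                        # skip a comparison token
            )
    # Walk the path backwards; matching diagonal steps are the anchor points.
    anchors = []
    i, j = m, n
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + diff[i - 1][j - 1]:
            if diff[i - 1][j - 1] == 0:
                anchors.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    anchors.reverse()
    return anchors
```

With `"the quick brown fox".split()` and `"the slow brown fox".split()`, the anchors fall on the three shared tokens, leaving the span between the first and second anchors as the region where a change would be looked for.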
Specification