Method and apparatus for translation based on a repository of existing translations
Abstract
A method of translating an input sentence in a source language to an output sentence in a target language uses a store comprising a plurality of example sentences, each paired with its translation. A base example sentence is chosen from the store based on a comparison of the input sentence with a plurality of example sentences, and its paired translation is used as a translation basis. A portion of the input sentence differing from a corresponding portion of the base example sentence is identified. A portion of the translation basis aligned with the base example unmatched portion is located. The input unmatched portion is used to select a set of subsidiary example sentences from the store. A choice of possible translations corresponding to the input unmatched portion is determined from the set of subsidiary example sentences. A translation is selected from the choice based on a predetermined selection algorithm.
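The first two operations in the abstract (choosing a base example and identifying the differing portion) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the source does not say how the comparison is performed, so the sketch swaps in the token-level similarity ratio from Python's `difflib`, and the store contents and the names `STORE`, `choose_base_example` and `unmatched_portions` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical store: (source sentence, paired translation), whitespace-tokenized.
STORE = [
    ("the red car is fast", "la voiture rouge est rapide"),
    ("the blue house is big", "la maison bleue est grande"),
    ("a red apple", "une pomme rouge"),
]

def choose_base_example(input_tokens, store):
    """Return the store pair whose source sentence is most similar to the input.

    Assumption: similarity is difflib's ratio over tokens; the source only
    says the choice is "based on a comparison".
    """
    def similarity(pair):
        return SequenceMatcher(None, input_tokens, pair[0].split()).ratio()
    return max(store, key=similarity)

def unmatched_portions(input_tokens, base_tokens):
    """Return ((input_start, input_end), (base_start, base_end)) for each
    differing portion, using end-exclusive token indices."""
    matcher = SequenceMatcher(None, input_tokens, base_tokens)
    return [((i1, i2), (j1, j2))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

input_tokens = "the red house is fast".split()
base_source, translation_basis = choose_base_example(input_tokens, STORE)
spans = unmatched_portions(input_tokens, base_source.split())
```

With this toy store, the base example is "the red car is fast", and the only unmatched portions are "house" in the input and "car" in the base example, both at token index 2.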
72 Citations
33 Claims
1. A method performed in a Translation Memory Apparatus for translating an input sequence of data items in a first format to an output sequence of data items in a second format using a store comprising a plurality of example sequences in the first format each paired with its translation in the second format, comprising:
(a) a processor of the apparatus choosing a base example sequence from the store based on a comparison of the input sequence with each of a plurality of example sequences from the store, and using its paired translation as a translation basis;
(b) the processor identifying a portion of the input sequence differing from a corresponding portion of the base example sequence, these portions being designated input and base example unmatched portions respectively and other portions that are not the unmatched portions being designated input and base example matched portions respectively;
(c) the processor locating a portion of the translation basis corresponding to the base example unmatched portion, wherein, when the portion of the translation basis corresponds to the base example unmatched portion and an adjacent base example matched portion, extending the base example unmatched portion to include the adjacent base example matched portion, and extending the corresponding input unmatched portion to include an adjacent input matched portion corresponding to the adjacent base example matched portion;
(d) the processor using the input unmatched portion to select a set of subsidiary example sequences from the store;
(e) the processor determining from the set of subsidiary example sequences a choice of possible translations corresponding to the input unmatched portion;
(f) the processor selecting a translation from the choice based on a predetermined selection algorithm and using the selected translation to replace the portion located in step (c); and
(g) the processor using the result of step (f) as a basis for the output sequence of data items.
Dependent claims: 2-14, 16-31
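The extension rule in step (c) of claim 1 is easier to follow with a concrete case. The sketch below is illustrative only: the claim does not say how correspondence between an example and its translation is determined, so the sketch assumes a precomputed alignment given as units of token spans. The tuple layout and the names `ALIGNMENT` and `locate_and_extend` are hypothetical. It also assumes that an adjacent matched portion occupies the same relative position in the input as in the base example, so the input span grows by the same number of tokens.

```python
# Hypothetical alignment for the base example "he goes to the market" and its
# paired translation "il va au marché". Each unit is
# (src_start, src_end, tgt_start, tgt_end) with end-exclusive token indices.
ALIGNMENT = [
    (0, 1, 0, 1),  # he -> il
    (1, 2, 1, 2),  # goes -> va
    (2, 4, 2, 3),  # to the -> au
    (4, 5, 3, 4),  # market -> marché
]

def locate_and_extend(base_span, input_span, alignment):
    """Locate the translation-basis portion for a base unmatched span and
    widen the base and input spans when that portion also covers adjacent
    matched tokens.

    Returns (extended_base_span, extended_input_span, target_span).
    """
    b_start, b_end = base_span
    i_start, i_end = input_span
    # Alignment units that overlap the base example unmatched portion.
    units = [u for u in alignment if u[0] < b_end and b_start < u[1]]
    if not units:
        raise ValueError("no aligned translation portion for this span")
    # Grow the base span to the outer edges of the overlapping units.
    new_b_start = min([b_start] + [u[0] for u in units])
    new_b_end = max([b_end] + [u[1] for u in units])
    # Grow the input span by the same number of tokens on each side.
    new_i_start = i_start - (b_start - new_b_start)
    new_i_end = i_end + (new_b_end - b_end)
    target_span = (min(u[2] for u in units), max(u[3] for u in units))
    return (new_b_start, new_b_end), (new_i_start, new_i_end), target_span

# Input "he goes to a market" differs from the base example at token 3
# ("a" versus "the").
result = locate_and_extend((3, 4), (3, 4), ALIGNMENT)
```

In this example "the" is translated jointly with the matched word "to" as "au", so the base unmatched portion becomes "to the", the input unmatched portion becomes "to a", and the located portion of the translation basis is the single token "au".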
15. The method as claimed in claim 1, wherein step (e) comprises the processor identifying, for a subsidiary example in the set, a portion of the subsidiary example corresponding to the input unmatched portion, and using a corresponding portion of the translation paired to the subsidiary example to form one of the possible translations in the choice.
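Claim 15, together with steps (d) and (f) of claim 1, can be sketched as follows. This is a hedged illustration, not the claimed implementation. Subsidiary examples are assumed to be those whose source contains the input unmatched portion verbatim. The "predetermined selection algorithm" is not specified in the claim, so a simple most-frequent-candidate rule is swapped in. The store layout and the names `STORE`, `find_sublist`, `candidate_translations`, `select_translation` and `replace_portion` are hypothetical.

```python
from collections import Counter

# Hypothetical store entries: (source tokens, target tokens, alignment units),
# each unit being (src_start, src_end, tgt_start, tgt_end), end-exclusive.
WORD_FOR_WORD = [(k, k + 1, k, k + 1) for k in range(5)]
STORE = [
    ("she walks to a park".split(), "elle marche vers un parc".split(), WORD_FOR_WORD),
    ("we drive to a village".split(), "nous roulons à un village".split(), WORD_FOR_WORD),
    ("they go to a shop".split(), "ils vont à un magasin".split(), WORD_FOR_WORD),
    ("he goes to the market".split(), "il va au marché".split(),
     [(0, 1, 0, 1), (1, 2, 1, 2), (2, 4, 2, 3), (4, 5, 3, 4)]),
]

def find_sublist(tokens, portion):
    """Return the first index where portion occurs in tokens, or -1."""
    n = len(portion)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == portion:
            return start
    return -1

def candidate_translations(portion, store):
    """Collect the aligned translation of `portion` from each subsidiary example."""
    candidates = []
    for source, target, alignment in store:
        start = find_sublist(source, portion)
        if start < 0:
            continue  # not a subsidiary example for this portion
        end = start + len(portion)
        # Keep only units lying entirely inside the matched portion.
        units = [u for u in alignment if start <= u[0] and u[1] <= end]
        if not units or sum(u[1] - u[0] for u in units) != len(portion):
            continue  # portion is not cleanly aligned in this example
        t_start = min(u[2] for u in units)
        t_end = max(u[3] for u in units)
        candidates.append(" ".join(target[t_start:t_end]))
    return candidates

def select_translation(candidates):
    """Assumed selection rule: pick the most frequent candidate."""
    return Counter(candidates).most_common(1)[0][0]

def replace_portion(basis_tokens, target_span, translation):
    """Replace the located portion of the translation basis (step (f))."""
    start, end = target_span
    return basis_tokens[:start] + translation.split() + basis_tokens[end:]

choice = candidate_translations("to a".split(), STORE)
selected = select_translation(choice)
output = replace_portion("il va au marché".split(), (2, 3), selected)
```

Here the choice contains "vers un" once and "à un" twice, so "à un" is selected and substituted for "au" in the translation basis. A real system could rank candidates differently, for example by context similarity; frequency is used only to keep the sketch short.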
32. A Translation Memory system comprising apparatus for translating an input sequence of data items in a first format to an output sequence of data items in a second format using a store comprising a plurality of example sequences in the first format each paired with its translation in the second format, the apparatus comprising:
a unit which chooses a base example sequence from the store based on a comparison of the input sequence with each of a plurality of example sequences from the store, and uses its paired translation as a translation basis;
a unit which identifies a portion of the input sequence differing from a corresponding portion of the base example sequence, these portions being designated input and base example unmatched portions respectively and other portions that are not the unmatched portions being designated input and base example matched portions respectively;
a unit which locates a portion of the translation basis corresponding to the base example unmatched portion, wherein, when the portion of the translation basis corresponds to the base example unmatched portion and an adjacent base example matched portion, extending the base example unmatched portion to include the adjacent base example matched portion, and extending the corresponding input unmatched portion to include an adjacent input matched portion corresponding to the adjacent base example matched portion;
a unit which uses the input unmatched portion to select a set of subsidiary example sequences from the store;
a unit which determines from the set of subsidiary example sequences a choice of possible translations corresponding to the input unmatched portion;
a unit which selects a translation from the choice based on a predetermined selection algorithm and uses the selected translation to replace the portion located by the locating unit; and
a unit which uses the result of the selecting unit as a basis for the output sequence of data items.
33. A computer readable recording medium having stored thereon a computer executable program for translating an input sequence of data items in a first format to an output sequence of data items in a second format using a store comprising a plurality of example sequences in the first format each paired with its translation in the second format, comprising:
(a) choosing a base example sequence from the store based on a comparison of the input sequence with each of a plurality of example sequences from the store, and using its paired translation as a translation basis;
(b) identifying a portion of the input sequence differing from a corresponding portion of the base example sequence, these portions being designated input and base example unmatched portions respectively and other portions that are not the unmatched portions being designated input and base example matched portions respectively;
(c) locating a portion of the translation basis corresponding to the base example unmatched portion, wherein, when the portion of the translation basis corresponds to the base example unmatched portion and an adjacent base example matched portion, extending the base example unmatched portion to include the adjacent base example matched portion, and extending the corresponding input unmatched portion to include an adjacent input matched portion corresponding to the adjacent base example matched portion;
(d) using the input unmatched portion to select a set of subsidiary example sequences from the store;
(e) determining from the set of subsidiary example sequences a choice of possible translations corresponding to the input unmatched portion;
(f) selecting a translation from the choice based on a predetermined selection algorithm and using the selected translation to replace the portion located in step (c); and
(g) using the result of step (f) as a basis for the output sequence of data items.
Specification