MACHINE TRANSLATION-DRIVEN AUTHORING SYSTEM AND METHOD
First Claim
1. An authoring method comprising:
- generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language;
- receiving source text comprising initial text entered by a user through the authoring interface in the source language;
- with a processor, selecting a set of source phrases from a stored collection of source phrases, each of the set of source phrases including at least one token of the initial source text as a prefix and at least one other token as a suffix, the selection of the set of source phrases being based on a translatability score, wherein the translatability score, for each of a stored set of source phrases, is a function of statistics of at least one biphrase in which the source phrase occurs in combination with a corresponding target phrase in the target language, each biphrase being one of a collection of biphrases;
- proposing a set of candidate phrases for display on the interface, the candidate phrases each comprising the suffix of a respective one of the source phrases in the set of source phrases;
- providing for receiving a user's selection of one of the candidate phrases in the set of candidate phrases and, where one of the candidate phrases in the set of candidate phrases is selected by the user, appending the selected one of the candidate phrases to the source text; and
- optionally, repeating the receiving, selecting, proposing and providing to generate the text string in the source language, wherein in the repeating, the received text comprises the initial source text and previously appended text.
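The selecting and proposing steps can be illustrated with a minimal Python sketch. It is not the patented implementation: the phrase table, its scores, and the function name `propose_candidates` are hypothetical, and the translatability scores are made-up numbers standing in for values that the claim derives from biphrase statistics.

```python
from typing import Dict, List, Tuple

# Hypothetical stored collection: source phrase (as tokens) -> translatability score.
# The scores are invented for illustration only.
TRANSLATABILITY: Dict[Tuple[str, ...], float] = {
    ("the", "printer", "is", "out", "of", "paper"): 0.9,
    ("the", "printer", "is", "jammed"): 0.7,
    ("the", "printer", "has", "a", "hiccup"): 0.2,
    ("the", "cartridge", "is", "empty"): 0.8,
}


def propose_candidates(source_text: str, k: int = 3) -> List[str]:
    """Return up to k suffixes of stored phrases whose prefix matches the
    trailing tokens of the typed text, ranked by translatability score."""
    tokens = source_text.lower().split()
    best: Dict[str, float] = {}
    for phrase, score in TRANSLATABILITY.items():
        # Split the stored phrase into a non-empty prefix and a non-empty suffix.
        for cut in range(1, len(phrase)):
            prefix, suffix = phrase[:cut], phrase[cut:]
            # The prefix must coincide with the last `cut` tokens typed so far.
            if tuple(tokens[-cut:]) == prefix:
                text = " ".join(suffix)
                best[text] = max(best.get(text, 0.0), score)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [text for text, _ in ranked[:k]]


print(propose_candidates("The printer is"))
```

With the toy table above, typing "The printer is" yields the suffixes "out of paper" and "jammed", in that order, because the first belongs to the higher-scoring stored phrase.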
Abstract
An authoring method includes generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language. Initial source text entered by the user is received through the authoring interface. Source phrases are selected that each include at least one token of the initial source text as a prefix and at least one other token as a suffix. The source phrase selection is based on a translatability score and optionally on fluency and semantic relatedness scores. A set of candidate phrases is proposed for display on the authoring interface, each of the candidate phrases being the suffix of a respective one of the selected source phrases. The user may select one of the candidate phrases, which is appended to the source text following its corresponding prefix, or may enter alternative text. The process may be repeated until the user is satisfied with the source text, and the statistical machine translation (SMT) model can then be used for its translation.
23 Claims
1. An authoring method comprising:
- generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language;
- receiving source text comprising initial text entered by a user through the authoring interface in the source language;
- with a processor, selecting a set of source phrases from a stored collection of source phrases, each of the set of source phrases including at least one token of the initial source text as a prefix and at least one other token as a suffix, the selection of the set of source phrases being based on a translatability score, wherein the translatability score, for each of a stored set of source phrases, is a function of statistics of at least one biphrase in which the source phrase occurs in combination with a corresponding target phrase in the target language, each biphrase being one of a collection of biphrases;
- proposing a set of candidate phrases for display on the interface, the candidate phrases each comprising the suffix of a respective one of the source phrases in the set of source phrases;
- providing for receiving a user's selection of one of the candidate phrases in the set of candidate phrases and, where one of the candidate phrases in the set of candidate phrases is selected by the user, appending the selected one of the candidate phrases to the source text; and
- optionally, repeating the receiving, selecting, proposing and providing to generate the text string in the source language, wherein in the repeating, the received text comprises the initial source text and previously appended text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. An authoring system comprising:
- an interface generator for generating an authoring interface on an associated display device for assisting a user to author a text string in a source language for translation to a target string in a target language;
- a suggestion system for receiving source text comprising initial text entered by a user through the authoring interface in the source language and for selecting a set of source phrases from a stored collection of source phrases, each of the set of source phrases including at least one token of the initial source text as a prefix and at least one other token as a suffix, the selection of the set being based on a translatability score which, for each of a stored set of source phrases, is a function of statistics of at least one biphrase of a collection of biphrases, in which biphrase the source phrase occurs in combination with a corresponding target phrase in the target language, the suggestion system proposing a set of candidate phrases for display on the generated interface, the proposed candidate phrases each comprising the suffix of a respective one of the source phrases in the set of source phrases;
- the authoring interface being configured for receiving a user's selection of one of the candidate phrases in the set and, where one of the candidate phrases in the set is selected by the user, for appending the selected one of the candidate phrases to the source text;
- the suggestion system being configured for generating the text string in the source language from the initial source text and appended text; and
- a processor which implements the interface generator and suggestion system.
Dependent claims: 17
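How the recited components cooperate can be sketched as a small session object that holds the source text, asks a suggestion function for candidates, and appends whichever candidate the user picks. This is an illustrative assumption about structure, not the claimed system: the class `AuthoringSession`, the function `toy_suggest`, and its lookup table are all hypothetical, and the display device is omitted.

```python
from typing import Callable, List


class AuthoringSession:
    """Holds the source text and delegates candidate lookup to a suggestion function."""

    def __init__(self, suggest: Callable[[str], List[str]], initial_text: str = "") -> None:
        self.suggest = suggest
        self.source_text = initial_text.strip()

    def candidates(self) -> List[str]:
        # Candidates are recomputed from the initial text plus everything appended so far.
        return self.suggest(self.source_text)

    def select(self, index: int) -> str:
        # Append the candidate the user picked to the source text.
        chosen = self.candidates()[index]
        self.source_text = f"{self.source_text} {chosen}".strip()
        return self.source_text


def toy_suggest(text: str) -> List[str]:
    # Hypothetical stand-in for the suggestion system: a fixed lookup table.
    table = {
        "the printer": ["is out of paper", "is jammed"],
        "the printer is out of paper": ["in tray 1"],
    }
    return table.get(text.lower(), [])


session = AuthoringSession(toy_suggest, "The printer")
print(session.candidates())
print(session.select(0))
print(session.candidates())
```

Keeping the suggestion function separate from the session mirrors the split between the suggestion system and the authoring interface, so a scoring-based suggester could replace the toy table without changing the append logic.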
18. A method for training an authoring system comprising:
- acquiring a collection of source phrases in a source language from a set of biphrases derived from a parallel corpus of sentences in the source language and a target language;
- with a processor, for each of the source phrases, computing a translatability score as a function of biphrase statistics derived from the parallel corpus for at least one of the biphrases in the set of biphrases for which the source phrase is a source phrase of the biphrase and storing the translatability scores in memory; and
- storing a scoring function for scoring each of a set of candidate phrases to be presented to an author for appending to initial source text during authoring of a source string in the source language, each of the candidate phrases, in combination with a portion of the source text, forming one of the source phrases in the collection, the scoring function scoring candidate phrases based on the translatability score, enabling the author to be presented with candidate phrases ranked according to scores output by the scoring function through an authoring interface.
Dependent claims: 19, 20, 21, 22, 23
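The training steps can be sketched in Python under one loudly stated assumption: the claim only requires the translatability score to be a function of biphrase statistics, so the specific statistic used here (the relative frequency of the most common target phrase for each source phrase) is an illustrative choice, not the one the patent prescribes. The names `train_translatability` and `score_candidate` and the toy biphrase counts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (source phrase, target phrase, co-occurrence count in the parallel corpus)
Biphrase = Tuple[str, str, int]


def train_translatability(biphrases: Iterable[Biphrase]) -> Dict[str, float]:
    """Compute one score per source phrase from its biphrase counts.

    Assumption for illustration: score = max over targets of count(s, t) / count(s).
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(dict)
    for src, tgt, n in biphrases:
        counts[src][tgt] = counts[src].get(tgt, 0) + n
    scores: Dict[str, float] = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        scores[src] = max(tgt_counts.values()) / total
    return scores


def score_candidate(prefix: str, candidate: str, scores: Dict[str, float]) -> float:
    """Scoring function: look up the phrase formed by the prefix plus the candidate."""
    return scores.get(f"{prefix} {candidate}".strip(), 0.0)


toy_biphrases = [
    ("out of paper", "plus de papier", 8),
    ("out of paper", "sans papier", 2),
    ("has a hiccup", "a le hoquet", 1),
    ("has a hiccup", "a un probleme", 1),
]
stored_scores = train_translatability(toy_biphrases)
print(score_candidate("out of", "paper", stored_scores))
```

Under this choice, a source phrase whose translations concentrate on one target phrase scores higher than one whose translations are split evenly, which is one plausible proxy for how reliably the phrase translates.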
Specification