Machine translation system employing classifier
Abstract
Exemplary embodiments relate to detecting, removing, and/or replacing objectionable words and phrases in a machine-generated translation. A classifier identifies translations containing target words or phrases. The classifier may be applied to the output translation to remove target words and phrases from the translation, or to prevent target words and phrases from being automatically presented. Further, the classifier may be applied to a translation model to prevent the target words and phrases from appearing in the output translation. Still further, the classifier may be applied to training data so that the translation model is not trained using the target words or phrases. The classifier may remove target words or phrases only when the target words or phrases appear in the output translation but not in the source language input data. The classifier may be provided as a standalone service, or may be employed in the context of a machine translation system.
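The output-side behavior described in the abstract (removing target words only when they appear in the translation but not in the source input) might be sketched as follows. This is an illustrative sketch only, not the patented implementation: the function name `mask_introduced_targets`, the masking string, and the toy lexicons standing in for the classifier are all assumptions, and the source-side check is deliberately coarse (any target word in the source leaves the whole output untouched).

```python
from typing import Callable, List


def mask_introduced_targets(
    source_tokens: List[str],
    output_tokens: List[str],
    is_target_src: Callable[[str], bool],
    is_target_dst: Callable[[str], bool],
    mask: str = "***",
) -> List[str]:
    """Mask target words that the translation introduced.

    If the source itself contains a target word, the translation is
    treated as faithful and returned unchanged (coarse assumption).
    """
    if any(is_target_src(tok) for tok in source_tokens):
        return list(output_tokens)
    return [mask if is_target_dst(tok) else tok for tok in output_tokens]


# Hypothetical toy lexicons standing in for a trained classifier.
SRC_TARGETS = {"maldito"}
DST_TARGETS = {"damn"}


def is_src_target(tok: str) -> bool:
    return tok.lower() in SRC_TARGETS


def is_dst_target(tok: str) -> bool:
    return tok.lower() in DST_TARGETS


# Target word appears only in the output, so it is masked.
introduced = mask_introduced_targets(
    ["hola", "amigo"], ["hello", "damn", "friend"], is_src_target, is_dst_target
)

# Target word is already present in the source, so the output is kept.
faithful = mask_introduced_targets(
    ["maldito", "coche"], ["damn", "car"], is_src_target, is_dst_target
)
```

In a real system the two predicates would presumably be backed by the trained classifier rather than fixed word sets.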
18 Claims
1. A method comprising:
translating, by a machine translation system, an input in a source language to an output in a destination language;
analyzing, by a classifier, information associated with at least one of the machine translation system or the output to determine that the output of the machine translation system comprises one or more target words or phrases, the classifier mathematically dividing target words or phrases and non-target words or phrases; and
automatically modifying the machine translation system so that the one or more target words or phrases are not automatically presented to an output device, wherein the modifying comprises modifying, by the classifier, to filter the bilingual training data that trains the machine translation system, the training data comprising pairs of words or phrases in the source language and the destination language, the training data filtered to remove pairs in which the destination word or phrase is a target word or phrase.
Dependent claims: 2, 3, 4, 5, 6, 16
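
The training-data filtering recited in claim 1 could be sketched roughly as below. This is a hedged illustration, not the claimed implementation: the claim says only that the classifier "mathematically divid[es]" target and non-target phrases, so a linear score with a zero threshold is assumed here as one simple way to do that, and the names `make_linear_classifier` and `filter_training_pairs`, the weights, and the sample pairs are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (source phrase, destination phrase)


def make_linear_classifier(
    weights: Dict[str, float], bias: float
) -> Callable[[str], bool]:
    """Return a predicate that scores a phrase with a linear model.

    A phrase is classified as a target phrase when
    bias + sum(weights of its tokens) > 0 (assumed decision rule).
    """
    def is_target(phrase: str) -> bool:
        score = bias + sum(weights.get(tok, 0.0) for tok in phrase.lower().split())
        return score > 0.0

    return is_target


def filter_training_pairs(
    pairs: List[Pair], is_target: Callable[[str], bool]
) -> List[Pair]:
    """Drop pairs whose destination phrase is classified as a target phrase."""
    return [(src, dst) for src, dst in pairs if not is_target(dst)]


# Hypothetical weights; a real classifier would learn these from labeled data.
is_target = make_linear_classifier({"damn": 2.0}, bias=-1.0)

training_pairs = [
    ("maldito", "damn"),
    ("coche", "car"),
    ("maldito coche", "damn car"),
]
clean_pairs = filter_training_pairs(training_pairs, is_target)
```

Only the destination side of each pair is inspected, matching the claim language that pairs are removed when "the destination word or phrase is a target word or phrase".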

7. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to:
translate, by a machine translation system, an input in a source language to an output in a destination language;
analyze, by a classifier, information associated with at least one of the machine translation system or the output to determine that the output of the machine translation system comprises one or more target words or phrases, the classifier mathematically dividing target words or phrases and non-target words or phrases; and
automatically modify the machine translation system so that the one or more target words or phrases are not automatically presented to an output device, wherein the modifying comprises modifying, by the classifier, to filter a trained phrase table that the machine translation system uses to translate the source language into the destination language, the trained phrase table comprising pairs of words or phrases in the source language and the destination language, the trained phrase table filtered to remove pairs in which the destination word or phrase is a target word or phrase.
Dependent claims: 8, 9, 10, 11, 17
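
The phrase-table filtering recited in claim 7 might look something like the sketch below. The table layout (a mapping from each source phrase to candidate destination phrases with probabilities), the function name `filter_phrase_table`, and the stand-in predicate are assumptions made for illustration; the claim does not specify a data structure, and no renormalization of the remaining probabilities is attempted here.

```python
from typing import Callable, Dict, List, Tuple

# Assumed layout: source phrase -> list of (destination phrase, probability).
PhraseTable = Dict[str, List[Tuple[str, float]]]


def filter_phrase_table(
    table: PhraseTable, is_target: Callable[[str], bool]
) -> PhraseTable:
    """Remove candidates whose destination phrase is a target phrase.

    Source entries left with no candidates are dropped entirely.
    """
    filtered: PhraseTable = {}
    for src, candidates in table.items():
        kept = [(dst, prob) for dst, prob in candidates if not is_target(dst)]
        if kept:
            filtered[src] = kept
    return filtered


def contains_damn(phrase: str) -> bool:
    """Hypothetical stand-in for the trained classifier."""
    return "damn" in phrase.lower().split()


phrase_table: PhraseTable = {
    "maldito": [("damn", 0.7), ("cursed", 0.3)],
    "coche": [("car", 1.0)],
    "maldita sea": [("damn it", 1.0)],
}
clean_table = filter_phrase_table(phrase_table, contains_damn)
```

Filtering the trained table rather than the training data leaves the training corpus untouched, at the cost of possibly emptying some source entries.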

12. An apparatus comprising:
a non-transitory computer readable medium storing logic for a machine translation system configured to translate an input in a source language to an output in a destination language;
a classifier configured to analyze information associated with at least one of the machine translation system or the output to determine that the output of the machine translation system comprises one or more target words or phrases, the classifier mathematically dividing target words or phrases and non-target words or phrases; and
a processor configured to automatically modify, by the classifier, the machine translation system so that the one or more target words or phrases are not automatically presented to an output device, wherein the modifying comprises:
filtering, by the classifier, bilingual training data that trains the machine translation system or filtering, by the classifier, a trained phrase table that the machine translation system uses to translate the source language into the destination language, the filtering removing references to target words or phrases.
Dependent claims: 13, 14, 15, 18
Specification