Method and system for translating information with a higher probability of a correct translation
Abstract
A system in which a non-statistical translation component is integrated with a statistical translation engine. The same corpus may be used both for training the statistical engine and for determining when to use the statistical engine and when to use the translation component. This training may use probabilistic techniques. Both the statistical engine and the translation components may be capable of translating the same information; however, the system determines which component to use based on the training. Retraining can be carried out to add additional components, or after additional translator training.
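As a rough illustration of the selection behaviour the abstract describes, the following minimal Python sketch lets two components translate the same phrase and picks the candidate with the higher estimated probability of being correct. Everything in it is an assumption made for illustration and is not taken from the patent: the names `rule_based_numbers`, `statistical_engine`, and `select_preferred`, the toy lookup tables, and the hard-coded probabilities that stand in for values a real system would learn during training.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Candidate:
    component: str      # name of the component that produced the translation
    translation: str    # the proposed translation
    probability: float  # estimated probability that the translation is correct


def rule_based_numbers(phrase: str) -> Optional[str]:
    """Hypothetical non-statistical component: a fixed number table."""
    table = {"uno": "one", "dos": "two", "tres": "three"}
    return table.get(phrase)


def statistical_engine(phrase: str) -> Optional[str]:
    """Hypothetical stand-in for a statistical engine: a tiny phrase table."""
    table = {"uno": "a", "casa": "house"}
    return table.get(phrase)


def select_preferred(
    phrase: str,
    components: Dict[str, Callable[[str], Optional[str]]],
    confidence: Dict[str, float],
) -> Optional[Candidate]:
    """Return the candidate with the highest estimated probability, if any."""
    candidates = []
    for name, translate in components.items():
        output = translate(phrase)
        if output is not None:  # the component was able to translate the phrase
            candidates.append(Candidate(name, output, confidence[name]))
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.probability)


# Placeholder probabilities; assumed to come from a separate training step.
COMPONENTS = {"number_rules": rule_based_numbers, "statistical": statistical_engine}
CONFIDENCE = {"number_rules": 0.9, "statistical": 0.6}

best = select_preferred("uno", COMPONENTS, CONFIDENCE)
```

Here both components can translate "uno", and the choice falls to whichever has the higher learned confidence. A per-component score is the simplest possible choice; a score that also depends on the phrase and its context would fit the same interface.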
30 Claims
1. A method comprising:
executing instructions stored in memory via a processor of a language engine for;
training the language engine to train for statistical machine translation;
training the language engine when to use statistical machine translation by applying a machine learning method to a bilingual text that has been annotated with the output of a non-statistical translation component along with information identifying the type of the translation component;
translating information from a first language to a second language using at least two translation components, wherein at least one translation component is a non-statistical translation component, each of the at least two translation components capable of translating equivalent phrases, each of the at least two translation components being selected based upon evaluation of an annotated training corpus, the annotated training corpus comprising substrings in the first language that have been annotated to associate the substrings with one or more translation components that are to be utilized to translate the substrings; and
automatically selecting a preferred component from the at least two translation components, the preferred component providing a translation having a highest probability of being correct.
Dependent claims: 2-12, 25-30.
13. A system comprising:
a memory for storing executable instructions;
a processor for executing the instructions for training a language engine to train for statistical machine translation and training the language engine when to use statistical machine translation by applying a machine learning method to a bilingual text that has been annotated with the output of a non-statistical translation component along with information identifying the type of the translation component;
at least two translating parts stored in memory and executable by the processor, wherein at least one translating part is a non-statistical translation part, each of the at least two translating parts operational to translate information from a first language to a second language and each of the at least two translating parts capable of translating equivalent phrases, each of the at least two translating parts being selected based upon evaluation of an annotated training corpus, the annotated training corpus comprising substrings in the first language that have been annotated to associate the substrings with one or more translating parts that are to be utilized to translate the substrings; and
a classifier part stored in memory and executable by the processor to automatically select a preferred component from the at least two translating parts, the preferred component providing a translation having a highest probability of being correct.
Dependent claims: 14-23.
24. A non-transitory computer readable storage medium having embodied thereon a program, the program executable by a processor to perform a method, the method comprising:
training a language engine to train for statistical machine translation;
training the language engine when to use statistical machine translation by applying a machine learning method to a bilingual text that has been annotated with the output of a non-statistical translation component along with information identifying the type of the translation component;
translating information from a first language to a second language using at least two translation components, wherein at least one translation component is a non-statistical translation component, each of the at least two translation components capable of translating equivalent phrases, each of the at least two translation components being selected based upon evaluation of an annotated training corpus, the annotated training corpus comprising substrings in the first language that have been annotated to associate the substrings with one or more translation components that are to be utilized to translate the substrings; and
automatically selecting a preferred component from the at least two translation components, the preferred component providing a translation having a highest probability of being correct.
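The training step recited in the independent claims (a bilingual text annotated with the output of a non-statistical component and the type of that component) can be illustrated with a small Python sketch. The claims do not name a specific machine learning method, so the sketch substitutes a deliberately simple one: a smoothed relative-frequency estimate of how often each component type reproduces the reference translation. The tuple layout, the function name `estimate_component_accuracy`, and the toy corpus are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One annotated example (layout assumed for illustration):
# (source substring, component type, component output, reference translation)
AnnotatedExample = Tuple[str, str, str, str]


def estimate_component_accuracy(
    corpus: Iterable[AnnotatedExample],
    smoothing: float = 1.0,
) -> Dict[str, float]:
    """Estimate, per component type, the probability that its output is correct.

    Uses add-one style smoothing over two outcomes (correct / incorrect), so a
    component seen only a few times is not assigned probability 0 or 1.
    """
    correct: Dict[str, float] = defaultdict(float)
    total: Dict[str, float] = defaultdict(float)
    for _source, component_type, output, reference in corpus:
        total[component_type] += 1
        if output == reference:  # the component reproduced the reference
            correct[component_type] += 1
    return {
        component_type: (correct[component_type] + smoothing)
        / (total[component_type] + 2 * smoothing)
        for component_type in total
    }


# Toy annotated corpus; the entries are invented for illustration only.
TOY_CORPUS = [
    ("uno", "number_rules", "one", "one"),
    ("dos", "number_rules", "two", "two"),
    ("uno", "statistical", "a", "one"),
    ("casa", "statistical", "house", "house"),
]

accuracy = estimate_component_accuracy(TOY_CORPUS)
```

The resulting per-type probabilities could then feed a selection step that prefers the component whose translation is most likely to be correct. A practical system would presumably condition on the substring and its context rather than on the component type alone.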
Specification