Method for Increasing the Accuracy of Subject-Specific Statistical Machine Translation (SMT)
Abstract
A method of improving the accuracy of the translation output of Statistical Machine Translation (SMT), while increasing the effectiveness of an ongoing professional human translation effort by correlating the ongoing professional human translation effort directly with the translation errors made by the system. Once the translation errors have been corrected by professional human translators and are re-input to the system, the SMT's training process may ensure that the same, and possibly similar, translation error(s) may not occur again.
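The correction loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function name `route_sentences` and the callables `translate`, `is_correct`, and `human_translate` are hypothetical stand-ins for the SMT system, its correctness determination, and the professional translator.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def route_sentences(
    sources: List[str],
    translate: Callable[[str], str],           # hypothetical SMT call
    is_correct: Callable[[str, str], bool],    # hypothetical correctness check
    human_translate: Callable[[str], str],     # hypothetical human step
) -> Tuple[List[Pair], List[Pair]]:
    """Translate one sentence at a time and split the results.

    Returns (accepted, corrections): machine translations judged correct,
    and human-corrected pairs that would be fed back into SMT training.
    """
    accepted: List[Pair] = []
    corrections: List[Pair] = []
    for src in sources:
        mt = translate(src)
        if is_correct(src, mt):
            accepted.append((src, mt))
        else:
            # Only sentences the system got wrong reach the human translator.
            corrections.append((src, human_translate(src)))
    return accepted, corrections
```

In this sketch the `corrections` list plays the role of the corrected parallel corpus; how it is re-input to training is outside what the sketch covers.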
227 Citations
20 Claims
1-10. (canceled)
11. A method for determining whether a sentence has been translated correctly by a Statistical Machine Translation (SMT) system, said sentence translation correctness determination being for sentences that relate to a specific subject and which are designated for translation utilizing a specific SMT subject-specific domain, and for effecting the ongoing incremental improvement of the accuracy of SMT sentence translation of said sentences that relate to a specific subject and which are designated for translation utilizing a specific SMT subject-specific domain, the method comprising:
sending a user interface, from the SMT system to a user system, the user interface having an option that is available to the user for entering a user-defined threshold value;
the SMT system including at least one machine having a processor system having at least one processor and having a memory system;
receiving, at the SMT system, input determining the user-defined threshold value;
allowing, by the SMT system, the user to modify the user-defined threshold value prior to and after each translation;
sending a user interface, from the SMT system to a user system, the user interface having an option that is available to the user to specify a subject-specific domain to be utilized for SMT sentence translation;
the SMT system including at least one machine having a processor system having at least one processor and having a memory system;
receiving, at the SMT system, input determining the user specified subject-specific domain;
allowing, by the SMT system, the user to modify the user specified subject-specific domain prior to and after each translation;
after the SMT system has produced a translation of a single sentence, determining, by the SMT system, a probability that each possible translation of each word of the sentence is correct;
for each word of the sentence determining, by the SMT system, which possible translation has a probability that the translation is correct that is a highest value compared to other possible translations of the word; and
after the SMT has translated the single sentence, for each word of the sentence, comparing, by the processor system, the highest value to the user-defined threshold value to determine whether the highest value is either equal to, or higher than, the threshold value, and if the highest value relating to each word in the sentence is either equal to or higher than the user defined threshold value, presenting a translation of the sentence as a correct translation, otherwise the sentence is determined to have been translated incorrectly;
effecting the ongoing incremental improvement of the accuracy of SMT sentence translation of sentences that relate to a specific subject and which are designated for translation utilizing a specific SMT subject-specific domain by way of
(1) the user entering a user-defined threshold value for SMT translation by a specific subject-specific domain
(2) submitting to SMT individual sentences, the subject of said sentences relating directly to the subject of the specific subject-specific domain, for translation, one sentence at a time
(3) if SMT determined that the sentence submitted for translation was translated incorrectly, sending the incorrectly translated sentence to a human translator for translation
(4) receiving from the human translator a translation of the sentence that was incorrectly translated, therein creating a correctly translated parallel corpus source and target language sentences
(5) inputting the correctly translated parallel corpus source and target language sentences into a training system for the SMT subject-specific domain, so that the same translation error will not occur again
the continuing and repeated incremental increase of the user-defined threshold value by the user for SMT translation by the subject-specific domain at times that the user determines that there is a sustained and measurable decrease in the percentage of incorrectly translated sentences, and the subsequent repetition of steps #s 2 through 5 above until the desired level translation accuracy relating to sentences translated utilizing the subject-specific domain has been achieved.
View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
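The per-word threshold test recited in claim 11 can be illustrated with a minimal sketch. It assumes a hypothetical data shape (one dictionary per source word, mapping each candidate translation to its probability of being correct); the names `WordCandidates`, `best_candidate`, and `sentence_is_correct` are illustrative and do not come from the patent.

```python
from typing import Dict, List, Tuple

# Assumed shape: candidate translation -> probability that it is correct.
WordCandidates = Dict[str, float]


def best_candidate(candidates: WordCandidates) -> Tuple[str, float]:
    """Return the candidate translation with the highest probability."""
    word = max(candidates, key=lambda w: candidates[w])
    return word, candidates[word]


def sentence_is_correct(sentence: List[WordCandidates], threshold: float) -> bool:
    """Treat the sentence as correctly translated only if, for every word,
    the highest candidate probability is equal to or above the threshold."""
    return all(best_candidate(c)[1] >= threshold for c in sentence)


# Example with made-up probabilities for a two-word sentence.
example = [
    {"house": 0.92, "home": 0.05},
    {"red": 0.81, "ruddy": 0.10},
]
passes_at_080 = sentence_is_correct(example, 0.80)  # both words clear 0.80
passes_at_090 = sentence_is_correct(example, 0.90)  # "red" at 0.81 falls short
```

Raising the threshold over time, as the claim describes, makes this test stricter, so more sentences are routed to the human translator until the domain's training data catches up.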
Specification