Method and system for automatic management of reputation of translators
First Claim
1. A method for reducing processor time and memory during automated scoring of a translation using computation of a hybrid translation edit rate (HyTER) score calculation for a result word set and an exponentially sized reference set in a computing environment, the method comprising:
receiving a translation hypothesis at a processor of the computing environment, the translation hypothesis comprising a result word set generated by a human or machine translation system in a target language, the result word set representing a translation of a test word set in a source language;
developing a search space for automated computation of the HyTER score, the search space comprising a lazy composition of:
a weighted finite-state acceptor (FSA) executable by the processor of the computing environment that represents a set of allowed permutations of the translation hypothesis and associated distance costs, the allowed permutations of the translation hypothesis constructed on demand according to local window constraints on movement of words within a fixed window size,
the exponentially sized reference set of meaning equivalents encoded as a Recursive Transition Network stored in memory of the computing environment and expanded by the processor of the computing environment on demand, and
a Levenshtein distance calculation between pairs of the search space comprising allowed permutations of the translation hypothesis and parts of the exponentially sized reference set that do not remain unexpanded, the calculation performed by the processor of the computing environment;
calculating using the processor of the computing environment the HyTER score for pairs in the search space to identify a pair in the search space having a minimum edit distance, and reducing the number of pairs for the composition for which the Levenshtein distance is calculated to save processor computation time and computer memory used for automated calculations of the HyTER score by constraining a number of paths constructed by the processor on demand by the weighted FSA using the fixed window size, and not constructing permutation paths of the composition outside the window; and
outputting the HyTER score for the human or machine translation system for the identified pair in the search space having a minimum edit distance, wherein a perfect score indicates that the result word set is an exact match of an acceptable translation in the exponentially sized reference set.
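As an illustration of the local window constraint recited in claim 1, the following Python sketch enumerates permutations of a hypothesis on demand. The function name `windowed_permutations` and the reading that a word may move at most `window` positions from its original index are assumptions made for this example, not taken from the claim. The weighted FSA, its distance costs, and the composition with the reference set are not modelled.

```python
from typing import Iterator, List, Tuple


def windowed_permutations(words: List[str], window: int) -> Iterator[Tuple[str, ...]]:
    """Lazily yield reorderings in which no word moves more than `window` positions.

    Assumption for this sketch: "local window constraint" means
    |original index - new index| <= window for every word.
    """
    n = len(words)
    used = [False] * n          # which source words are already placed
    current: List[str] = []     # the partial permutation being built

    def extend(pos: int) -> Iterator[Tuple[str, ...]]:
        if pos == n:
            yield tuple(current)
            return
        # Only source positions inside the window around `pos` are considered;
        # paths outside the window are never constructed.
        lo = max(0, pos - window)
        hi = min(n, pos + window + 1)
        for src in range(lo, hi):
            if not used[src]:
                used[src] = True
                current.append(words[src])
                yield from extend(pos + 1)
                current.pop()
                used[src] = False

    yield from extend(0)
```

Because the function is a generator, a caller that stops early never pays for the permutations it did not request. Under the assumed constraint, a four-word hypothesis with `window=1` produces 5 of the 24 possible orderings.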
Abstract
The present invention provides a method that includes receiving a result word set in a target language representing a translation of a test word set in a source language. When the result word set is not in a set of acceptable translations, the method includes measuring a minimum number of edits to transform the result word set into a transform word set. The transform word set is in the set of acceptable translations. A system is provided that includes a receiver to receive a result word set and a counter to measure a minimum number of edits to transform the result word set into a transform word set. A method is provided that includes automatically determining a translation ability of a human translator based on a test result. The method also includes adjusting the translation ability of the human translator based on historical data of translations performed by the human translator.
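The edit-counting step described in the abstract can be sketched in Python as follows. This is a minimal illustration that assumes the set of acceptable translations is small enough to list explicitly and that edits are word-level insertions, deletions, and substitutions. The function names `word_edit_distance` and `closest_acceptable` are hypothetical and do not appear in the patent.

```python
from typing import List, Sequence, Tuple


def word_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))          # distance from empty prefix of a
    for i, wa in enumerate(a, start=1):
        curr = [i]                          # distance to empty prefix of b
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,         # delete wa
                            curr[j - 1] + 1,     # insert wb
                            prev[j - 1] + cost)) # match or substitute
        prev = curr
    return prev[-1]


def closest_acceptable(result: Sequence[str],
                       acceptable: List[Sequence[str]]) -> Tuple[int, Sequence[str]]:
    """Return (minimum edits, transform word set) over the acceptable set."""
    best = min(acceptable, key=lambda ref: word_edit_distance(result, ref))
    return word_edit_distance(result, best), best
```

A result word set that already belongs to the acceptable set yields zero edits, which corresponds to the case in which no transformation is needed.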
650 Citations
22 Claims
1. A method for reducing processor time and memory during automated scoring of a translation using computation of a hybrid translation edit rate (HyTER) score calculation for a result word set and an exponentially sized reference set in a computing environment, the method comprising:
receiving a translation hypothesis at a processor of the computing environment, the translation hypothesis comprising a result word set generated by a human or machine translation system in a target language, the result word set representing a translation of a test word set in a source language;
developing a search space for automated computation of the HyTER score, the search space comprising a lazy composition of:
a weighted finite-state acceptor (FSA) executable by the processor of the computing environment that represents a set of allowed permutations of the translation hypothesis and associated distance costs, the allowed permutations of the translation hypothesis constructed on demand according to local window constraints on movement of words within a fixed window size,
the exponentially sized reference set of meaning equivalents encoded as a Recursive Transition Network stored in memory of the computing environment and expanded by the processor of the computing environment on demand, and
a Levenshtein distance calculation between pairs of the search space comprising allowed permutations of the translation hypothesis and parts of the exponentially sized reference set that do not remain unexpanded, the calculation performed by the processor of the computing environment;
calculating using the processor of the computing environment the HyTER score for pairs in the search space to identify a pair in the search space having a minimum edit distance, and reducing the number of pairs for the composition for which the Levenshtein distance is calculated to save processor computation time and computer memory used for automated calculations of the HyTER score by constraining a number of paths constructed by the processor on demand by the weighted FSA using the fixed window size, and not constructing permutation paths of the composition outside the window; and
outputting the HyTER score for the human or machine translation system for the identified pair in the search space having a minimum edit distance, wherein a perfect score indicates that the result word set is an exact match of an acceptable translation in the exponentially sized reference set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 16, 17, 18, 19, 20, 22
9. A system, comprising:
a memory for storing executable instructions; and
a processor for executing the instructions stored in the memory for developing a search space for automated computation of a hybrid translation edit rate (HyTER) score, the search space, the executable instructions comprising:
a finite state acceptor executable by the processor to:
receive a translation hypothesis comprising result word set generated by a human or machine translation system in a target language, the result word set representing a translation of a test word set in a source language;
construct a set of allowed permutation paths of the translation hypothesis and associated distance costs, the allowed permutations of the translation hypothesis constructed on demand according to local window constraints on movement of words within a fixed window size; and
output the HyTER score for the human or machine translation system for the identified pair in the search space having a minimum edit distance, wherein a perfect score indicates that the result word set is an exact match of an acceptable translation in an exponentially sized reference set;
a reference recursive transition network executable by the processor to encode acceptable translations as an exponentially sized reference set of meaning equivalents encoded as a Recursive Transition Network stored in memory of a computing environment and expand the reference set on demand;
a one state Levenshtein transducer executable by the processor to calculate a distance between pairs of the search space comprising allowed permutations of the translation hypothesis and the parts of the exponentially sized reference set that do not remain unexpanded, the calculation performed by the processor of the computing environment;
a local window executable by the processor to constrain the movement of words by the finite state acceptor within a window of a fixed size;
a calculator executable by the processor to calculate the HyTER score for pairs in the search space and identify a pair in the search space having a minimum edit distance, the number of pairs for a composition for which the Levenshtein distance is calculated being reduced by constraining a number of paths constructed by the processor on demand by a weighted FSA using the fixed window size, and not constructing permutation paths of the composition outside the window, saving processor computation time and computer memory used for automated calculation of the HyTER score.
Dependent claims: 10, 11, 12, 13, 14, 15
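To illustrate how a reference set of meaning equivalents might be stored compactly and expanded on demand, the following Python sketch models a Recursive Transition Network as a dictionary from sub-network names to lists of alternatives. The representation, the type alias `RTN`, and the function `expand` are assumptions made for this example. The claim does not specify a data layout, and the sketch assumes the network has no cyclic references.

```python
from typing import Dict, Iterator, List, Tuple

# Assumed layout: each key names a sub-network; each alternative is a
# sequence of symbols that are either plain words or names of sub-networks.
RTN = Dict[str, List[List[str]]]


def expand(rtn: RTN, symbols: List[str]) -> Iterator[Tuple[str, ...]]:
    """Lazily yield every word sequence that `symbols` can expand to."""
    if not symbols:
        yield ()
        return
    head, rest = symbols[0], symbols[1:]
    if head in rtn:
        # Sub-network reference: substitute each alternative in turn.
        for alternative in rtn[head]:
            yield from expand(rtn, alternative + rest)
    else:
        # Plain word: keep it and expand the remainder.
        for tail in expand(rtn, rest):
            yield (head,) + tail


# Hypothetical example network with two interchangeable slots.
example_rtn: RTN = {
    "S": [["SUBJ", "VERB", "the", "talks"]],
    "SUBJ": [["the", "minister"], ["he"]],
    "VERB": [["ended"], ["broke", "off"]],
}
```

Each additional slot with independent alternatives multiplies the number of encoded sentences while the stored network grows only additively. Because `expand` is a generator, alternatives that are never requested are never built.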
21. A non-transitory computer readable storage media having a program embodied thereon, the program being executable by a processor to perform a method for reducing processor time and memory during automated scoring of a translation using computation of a hybrid translation edit rate (HyTER) score calculation for a result word set and an exponentially sized reference set in a computing environment, the method comprising:
receiving a translation hypothesis at a processor of the computing environment, the translation hypothesis comprising a result word set generated by a human or machine translation system in a target language, the result word set representing a translation of a test word set in a source language;
developing a search space for automated computation of the HyTER score, the search space comprising a lazy composition of:
a weighted finite state acceptor (FSA) executable by the processor of the computing environment that represents a set of allowed permutations of the translation hypothesis and associated distance costs, the allowed permutations of the translation hypothesis constructed on demand according to local window constraints on movement of words within a fixed window size,
the exponentially sized reference set of meaning equivalents encoded as a Recursive Transition Network stored in memory of the computing environment and expanded by the processor of the computing environment on demand, and
a Levenshtein distance calculation between pairs of the search space comprising allowed permutations of the translation hypothesis and parts of the exponentially sized reference set that do not remain unexpanded, the calculation performed by the processor of the computing environment;
calculating using the processor of the computing environment the HyTER score for pairs in the search space to identify a pair in the search space having a minimum edit distance, and reducing a number of pairs for the composition for which the Levenshtein distance is calculated to save processor computation time and computer memory used for automated calculations of the HyTER score by constraining a number of paths constructed by the processor on demand by the weighted FSA using the fixed window size, and not constructing permutation paths of the composition outside the window; and
outputting the HyTER score for the human or machine translation system for the identified pair in the search space having a minimum edit distance, wherein a perfect score indicates that the result word set is an exact match of an acceptable translation in the exponentially sized reference set.
Specification