Machine translation output reranking
Abstract
Technology is disclosed for selecting a preferred machine translation from among multiple machine translations of a content item, each created for the same target language. Each machine translation is assigned a score based on feedback from a user group that receives it. The machine translation with the highest score is identified as the preferred machine translation and is provided in response to subsequent requests for translations of the content item. If no translation is clearly preferred, several top-scoring machine translations are provided to a larger group of users for further scoring. This process may be repeated until a clearly preferred translation is identified, a maximum number of iterations is reached, or a maximum number of scoring users is reached.
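By way of non-limiting illustration, the repeated scoring process summarized above might be sketched as follows. The function and parameter names (`select_preferred`, `score_round`, `is_preferred`, `keep_top`, `growth`) and all default values are assumptions introduced for this sketch only; the disclosure does not prescribe them, and the way the reviewing group is enlarged (multiplying its size each round) is one possible choice among many.

```python
from typing import Callable, Dict, List, Optional


def select_preferred(
    translations: List[str],
    score_round: Callable[[List[str], int], Dict[str, float]],  # hypothetical scorer
    is_preferred: Callable[[Dict[str, float]], Optional[str]],  # hypothetical test
    initial_group_size: int = 10,   # assumed starting reviewers per translation
    growth: int = 2,                # assumed factor by which groups are enlarged
    keep_top: int = 3,              # assumed number of top candidates carried over
    max_iterations: int = 5,
    max_users: int = 1000,
) -> Optional[str]:
    """Repeat scoring rounds until a preferred translation emerges or a limit is hit."""
    candidates = list(translations)
    group_size = initial_group_size
    users_used = 0
    for _ in range(max_iterations):
        needed = group_size * len(candidates)
        # Stop before exceeding the maximum number of scoring users.
        if users_used + needed > max_users:
            break
        scores = score_round(candidates, group_size)
        users_used += needed
        winner = is_preferred(scores)
        if winner is not None:
            return winner
        # No clear winner: keep the top-scoring candidates and enlarge the groups.
        candidates = sorted(candidates, key=lambda t: scores[t], reverse=True)[:keep_top]
        group_size *= growth
    return None
```

In this sketch a return value of `None` means that a limit was reached without a clearly preferred translation; how such a case is handled is left open here.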
17 Claims
1. A method performed by a computing system for selecting a preferred machine translation of a content item, comprising:
- classifying the content item based on one or more categories, the categories including at least the topic of the content item;
- classifying a plurality of users based on one or more categories, the one or more categories including at least an interest in one or more topics;
- generating multiple computer-generated translations of the content item in a specified target language using configurable parameters, the multiple translations forming a set of translations;
- performing one or more iterations, each iteration comprising:
  - selecting multiple groups of users based on a mapping between the one or more content item categories and the one or more user categories;
  - submitting each translation in the set to one group of users;
  - receiving, from each user in the group, a translation score for the reviewed translation;
  - weighting the translation score received from each user by a user-importance factor calculated as a function of a deviation of the user's scores from average scores for previous translations;
  - computing an aggregate score for each reviewed translation based on the weighted scores from all users in the reviewing group;
  - determining if the iteration has produced a preferred translation, the preferred translation being a translation from the set having an aggregate score above a predetermined threshold, or a translation having an aggregate score higher than the aggregate score for all other translations in the set by a predetermined threshold; and
  - repeating the iteration if no preferred translation has been produced; and
- providing the preferred machine translation in response to a subsequent request for a translation of the content item.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 16.
9. A computer readable memory storing instructions configured to, when executed by a computing system, cause the computing system to perform operations for identifying a preferred machine translation of a content item, the operations comprising:
- classifying the content item based on one or more categories, the categories including at least the topic of the content item;
- classifying a plurality of users based on one or more categories, the one or more categories including at least an interest in one or more topics;
- generating multiple computer-generated translations of the content item in a specified target language using configurable parameters, the multiple translations forming a set of translations;
- performing one or more iterations, each iteration comprising:
  - selecting multiple groups of users based on a mapping between the one or more content item categories and user categories;
  - submitting each translation in the set to one group of users;
  - receiving, from each user in the group, a translation score for the reviewed translation;
  - weighting the translation score received from each user by a user-importance factor calculated as a function of a deviation of the user's scores from average scores for previous translations;
  - computing an aggregate score for each reviewed translation based on the weighted scores from all users in the reviewing group;
  - determining if the iteration has produced a preferred translation, the preferred translation being a translation from the set having an aggregate score above a predetermined threshold, or a translation having an aggregate score higher than the aggregate score for all other translations in the group by a predetermined threshold; and
  - repeating the iteration if no preferred translation has been produced; and
- providing the preferred machine translation in response to a subsequent request for a translation of the content item.
Dependent claims: 10, 11, 17.
12. A system for identifying a preferred machine translation of a content item, comprising:
- a content item classification engine configured to classify the content item based on one or more categories, the categories including at least the topic of the content item;
- a user group defining engine configured to classify a plurality of users based on one or more categories, the one or more categories including at least an interest in one or more topics;
- a machine translation generation engine configured to generate multiple computer-generated translations of the content item in a specified target language, the multiple translations using configurable parameters and forming a set of translations; and
- a scoring engine configured to:
  - perform one or more iterations, each iteration comprising:
    - selecting multiple groups of users based on a mapping between the one or more content item categories and the one or more user categories;
    - submitting each translation in the set to one group of users;
    - receiving, from each user in the group, a translation score for the reviewed translation;
    - weighting the translation score received from each user by a user-importance factor calculated as a function of a deviation of the user's scores from average scores for previous translations;
    - computing an aggregate score for each reviewed translation, based on the weighted scores from all users in the reviewing group;
    - determining if the iteration has produced a preferred translation, the preferred translation being a translation from the set having an aggregate score above a predetermined threshold, or a translation having an aggregate score higher than the aggregate score for all other translations in the group by a predetermined threshold; and
    - repeating the iteration if no preferred translation has been produced; and
  - providing the preferred machine translation in response to a subsequent request for a translation of the content item.
Dependent claims: 13, 14, 15.
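As a hedged illustration of the group-selection step recited in the independent claims, the sketch below picks reviewers whose interests match the categories of the content item and assigns one group to each translation. The names `select_groups`, `category_map`, and `translation_ids`, the use of set intersection for matching, and the round-robin assignment are all assumptions made for this example; the claims require only that groups be selected based on a mapping between content item categories and user categories.

```python
from typing import Dict, List, Set


def select_groups(
    content_categories: Set[str],
    user_interests: Dict[str, Set[str]],   # user id -> topics of interest
    category_map: Dict[str, Set[str]],     # content category -> matching user categories
    translation_ids: List[str],
    group_size: int,
) -> Dict[str, List[str]]:
    """Assign one group of topically matched users to each translation."""
    # Expand the content categories into the user categories they map to.
    # A category absent from the map is assumed to map to itself.
    wanted: Set[str] = set()
    for category in content_categories:
        wanted |= category_map.get(category, {category})

    # A user is eligible if at least one interest matches; sorted for determinism.
    eligible = sorted(u for u, topics in user_interests.items() if topics & wanted)

    # Round-robin assignment, so each selected user reviews exactly one translation.
    groups: Dict[str, List[str]] = {t: [] for t in translation_ids}
    for i, user in enumerate(eligible[: group_size * len(translation_ids)]):
        groups[translation_ids[i % len(translation_ids)]].append(user)
    return groups
```

If fewer eligible users exist than the groups require, the groups in this sketch are simply smaller; a real system might instead broaden the mapping.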
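The weighting, aggregation, and determination steps recited in the independent claims can likewise be sketched, under stated assumptions. The claims say only that the user-importance factor is "a function of a deviation of the user's scores from average scores for previous translations"; the particular function used below, 1 / (1 + mean absolute deviation), is an assumption chosen for illustration, as are the weighted mean used for the aggregate score and the names `user_importance`, `aggregate_score`, and `preferred`.

```python
from typing import Dict, List, Optional, Tuple


def user_importance(history: List[Tuple[float, float]]) -> float:
    """history holds (user_score, average_score) pairs for previous translations.

    Assumed form: users who deviate more from the average get a lower weight.
    """
    if not history:
        return 1.0  # assumption: a user with no history gets a neutral weight
    mean_abs_dev = sum(abs(u - avg) for u, avg in history) / len(history)
    return 1.0 / (1.0 + mean_abs_dev)


def aggregate_score(scores: Dict[str, float], importance: Dict[str, float]) -> float:
    """Importance-weighted mean of the scores from one reviewing group."""
    if not scores:
        raise ValueError("no scores received for this translation")
    total_weight = sum(importance[u] for u in scores)
    return sum(importance[u] * s for u, s in scores.items()) / total_weight


def preferred(
    aggregates: Dict[str, float], absolute: float, margin: float
) -> Optional[str]:
    """Return the preferred translation id, or None if the iteration must repeat."""
    ranked = sorted(aggregates, key=lambda t: aggregates[t], reverse=True)
    if not ranked:
        return None
    best = ranked[0]
    # First alternative: aggregate score above a predetermined threshold.
    if aggregates[best] > absolute:
        return best
    # Second alternative: lead over every other translation by a predetermined threshold.
    if len(ranked) > 1 and aggregates[best] - aggregates[ranked[1]] >= margin:
        return best
    return None
```

Comparing the best score only against the runner-up suffices for the second alternative, because a lead over the second-highest aggregate implies at least the same lead over all lower ones.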
Specification