Identifying risky translations

US 10,318,640 B2
Filed: 06/24/2016
Issued: 06/11/2019
Est. Priority Date: 06/24/2016
Status: Active Grant

Abstract

Exemplary embodiments provide techniques for evaluating when words or phrases of a translation were generated with a low degree of confidence, and conveying this information when the translation is presented. For example, if a source language word is encountered in source material for translation, but the source language word was only encountered a few times (or not at all) in the training data used to train the translation system, then the resulting translation may be flagged as being of low confidence. Other situations, such as the generation of two equally-likely translations, or translation system model disagreement, may also indicate a questionable translation. When the translation is displayed, questionable words and phrases may be flagged, and possible alternative translations may be presented. If one of the alternatives is selected, this information may be used to update the translation system's models in order to improve translation quality in the future.
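
The first signal mentioned in the abstract (a source word seen only a few times, or not at all, in training data) can be illustrated with a minimal sketch. The function name, the whitespace tokenization, the toy corpus, and the threshold value are all assumptions made for illustration; the patent text does not specify them.

```python
from collections import Counter

def flag_rare_source_words(source_tokens, training_counts, min_count=3):
    """Return indices of source tokens seen fewer than min_count times in training.

    min_count is a hypothetical stand-in for the "predetermined threshold".
    """
    return [
        i for i, tok in enumerate(source_tokens)
        if training_counts.get(tok, 0) < min_count
    ]

# Hypothetical training corpus, tokenized on whitespace (an assumption).
training_counts = Counter("the cat sat on the mat the cat ran".split())

source = "the cat sat on the zygote".split()
rare = flag_rare_source_words(source, training_counts, min_count=2)
# "sat" and "on" were seen once, "zygote" never, so indices 2, 3 and 5 are flagged.
```

A real system would presumably count phrases as well as single words and would carry the flags through to the corresponding destination words; this sketch stops at the source side.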

18 Claims

1. A method comprising:
- receiving a request at a translation system to translate source material from a source language into a destination language;
  
  translating the source material into destination material, the translating comprising:

    providing the source material to a translation model,

    identifying, with the translation model, a plurality of hypotheses representing translations of a source word or phrase into a destination word or phrase, the hypotheses comprising a first hypothesis and a second hypothesis different from the first hypothesis,

    outputting, from the translation model, a translation model score or rank for each of the plurality of hypotheses, wherein the first hypothesis has a higher translation model score or rank than the second hypothesis,

    evaluating, with a language model distinct from the translation model, the plurality of hypotheses in a context of the translated destination material,

    outputting, from the language model, a language model score or rank for each of the hypotheses, wherein the second hypothesis has a higher language model score or rank than the first hypothesis,

    combining, for each of the plurality of hypotheses, the translation model score or rank and the language model score or rank to select the first hypothesis or the second hypothesis as a most likely translation while identifying that the most likely translation is a questionable word or phrase based on the first hypothesis having a higher translation model score or rank and the second hypothesis having a higher language model score or rank; and
  
  displaying the destination material, the displaying comprising visually distinguishing the questionable words or phrases in the destination material by differentiating between the most-likely translation and other words or phrases in the destination material that were translated with high confidence.
- Dependent Claims (2, 3, 4, 5, 6, 18)
  - 2. The method of claim 1, wherein identifying the questionable words or phrases comprises determining whether the questionable words or phrases were encountered less than a predetermined threshold number of times in training data used to train the translation system.
  - 3. The method of claim 1, wherein identifying the questionable words or phrases comprises:
    - encountering a source word or phrase in the source material during the translating;
      
      generating two or more hypotheses representing possible translations of the source word or phrase into the destination language;
      
      determining respective probabilities of the two or more hypotheses;
      
      selecting a most-likely hypothesis; and
      
      determining that the probability associated with the most-likely hypothesis is within a predetermined amount of a probability associated with another hypothesis.
  - 4. The method of claim 1, further comprising:
    - identifying one or more non-questionable words or phrases in the destination material that were translated with high confidence; and
      
      visually distinguishing the non-questionable words or phrases in the destination material.
  - 5. The method of claim 1, further comprising:
    - presenting one or more alternatives for the questionable words or phrases.
  - 6. The method of claim 5, further comprising:
    - receiving a selection of one of the alternatives; and
      
      updating one or more models applied by the translation system based on the selected alternative.
  - 18. The method of claim 1, wherein:
    - the translation model score is based on the relative frequency in which the source word or phrase was translated into the respective hypothesis in bilingual training data used to train the translation model; and
      
      the language model score is based on the relative frequency in which the respective hypothesis was found in monolingual training data used to train the language model in a context corresponding to the context of the translated destination material.
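
The model-disagreement test recited in claim 1 can be sketched as follows. This is one possible reading, not the patented implementation: the `Hypothesis` class, the weighted-sum combination, the `tm_weight` parameter, and the example scores are assumptions introduced for illustration. The claim only requires that the two scores be combined and that the selection be marked questionable when the translation model and the language model prefer different hypotheses.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str
    tm_score: float  # translation model score (hypothetical scale, higher is better)
    lm_score: float  # language model score in context (hypothetical scale)

def select_and_flag(hypotheses, tm_weight=0.5):
    """Pick the best combined hypothesis and report whether the models disagree.

    Expects a non-empty list. The linear combination is an assumption; the
    claim does not say how the scores or ranks are combined.
    """
    def combined(h):
        return tm_weight * h.tm_score + (1.0 - tm_weight) * h.lm_score

    best = max(hypotheses, key=combined)
    tm_best = max(hypotheses, key=lambda h: h.tm_score)
    lm_best = max(hypotheses, key=lambda h: h.lm_score)
    # Questionable when each model ranks a different hypothesis first.
    questionable = tm_best is not lm_best
    return best, questionable

# The translation model prefers "bank"; the language model prefers "shore".
disputed = [Hypothesis("bank", 0.6, 0.2), Hypothesis("shore", 0.4, 0.7)]
best_disputed, flag_disputed = select_and_flag(disputed)

# Both models prefer "house", so nothing is flagged.
agreed = [Hypothesis("house", 0.9, 0.8), Hypothesis("home", 0.1, 0.2)]
best_agreed, flag_agreed = select_and_flag(agreed)
```

In the first example the combined score selects "shore" but marks it questionable, which is the case the display step would visually distinguish from high-confidence words.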

7. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to:
- receive a request at a translation system to translate source material from a source language into a destination language;
  
  translate the source material into destination material, the translating comprising:

    providing the source material to a translation model,

    identifying, with the translation model, a plurality of hypotheses representing translations of a source word or phrase into a destination word or phrase, the hypotheses comprising a first hypothesis and a second hypothesis different from the first hypothesis,

    outputting, from the translation model, a translation model score or rank for each of the plurality of hypotheses, wherein the first hypothesis has a higher translation model score or rank than the second hypothesis,

    evaluating, with a language model distinct from the translation model, the plurality of hypotheses in a context of the translated destination material,

    outputting, from the language model, a language model score or rank for each of the hypotheses, wherein the second hypothesis has a higher language model score or rank than the first hypothesis,

    combining, for each of the plurality of hypotheses, the translation model score or rank and the language model score or rank to select the first hypothesis or the second hypothesis as a most likely translation while identifying that the most likely translation is a questionable word or phrase based on the first hypothesis having a higher translation model score or rank and the second hypothesis having a higher language model score or rank; and
  
  display the destination material, the displaying comprising visually distinguishing the questionable words or phrases in the destination material by differentiating between the most-likely translation and other words or phrases in the destination material that were translated with high confidence.
- Dependent Claims (8, 9, 10, 11, 12)
  - 8. The medium of claim 7, comprising instructions to determine whether the questionable words or phrases were encountered less than a predetermined threshold number of times in training data used to train the translation system.
  - 9. The medium of claim 7, comprising instructions to:
    - encounter a source word or phrase in the source material during the translating;
      
      generate two or more hypotheses representing possible translations of the source word or phrase into the destination language;
      
      determine respective probabilities of the two or more hypotheses;
      
      select a most-likely hypothesis; and
      
      determine that the probability associated with the most-likely hypothesis is within a predetermined amount of a probability associated with another hypothesis.
  - 10. The medium of claim 7, comprising instructions to:
    - identify one or more non-questionable words or phrases in the destination material that were translated with high confidence; and
      
      visually distinguish the non-questionable words or phrases in the destination material.
  - 11. The medium of claim 7, comprising instructions to present one or more alternatives for the questionable words or phrases.
  - 12. The medium of claim 7, comprising instructions to:
    - receive a selection of one of the alternatives; and
      
      update one or more models applied by the translation system based on the selected alternative.
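
The near-tie check in claims 3, 9 and 15 (the most-likely hypothesis has a probability within a predetermined amount of another hypothesis) can be sketched as below. The function name, the dictionary input, and the `margin` value are assumptions for illustration; the claims leave the size of the "predetermined amount" open.

```python
def is_near_tie(probabilities, margin=0.05):
    """Return (best_label, near_tie) for a non-empty dict of hypothesis -> probability.

    margin is a hypothetical stand-in for the "predetermined amount".
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_p = ranked[0]
    if len(ranked) < 2:
        # A single hypothesis has nothing to tie with.
        return best_label, False
    runner_up_p = ranked[1][1]
    return best_label, (best_p - runner_up_p) <= margin

close_call = is_near_tie({"bank": 0.48, "shore": 0.45, "bench": 0.07})
clear_win = is_near_tie({"house": 0.9, "home": 0.1})
single = is_near_tie({"cat": 1.0})
```

Only the gap to the runner-up is compared here; a variant could instead flag any hypothesis within the margin and offer all of them as alternatives.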

13. An apparatus comprising:
- a network adapter configured to receive a request to translate source material from a source language into a destination language;
  
  translation system logic, at least a portion of which is implemented in hardware, to implement a translation system, the translation system logic comprising logic to:

  translate the source material into destination material, the translating comprising:

    providing the source material to a translation model,

    identifying, with the translation model, a plurality of hypotheses representing translations of a source word or phrase into a destination word or phrase, the hypotheses comprising a first hypothesis and a second hypothesis different from the first hypothesis,

    outputting, from the translation model, a translation model score or rank for each of the plurality of hypotheses, wherein the first hypothesis has a higher translation model score or rank than the second hypothesis,

    evaluating, with a language model distinct from the translation model, the plurality of hypotheses in a context of the translated destination material,

    outputting, from the language model, a language model score or rank for each of the hypotheses, wherein the second hypothesis has a higher language model score or rank than the first hypothesis,

    combining, for each of the plurality of hypotheses, the translation model score or rank and the language model score or rank to select the first hypothesis or the second hypothesis as a most likely translation while identifying that the most likely translation is a questionable word or phrase based on the first hypothesis having a higher translation model score or rank and the second hypothesis having a higher language model score or rank; and
  
  display the destination material, the displaying comprising visually distinguishing the questionable words or phrases in the destination material by differentiating between the most-likely translation and other words or phrases in the destination material that were translated with high confidence.
- Dependent Claims (14, 15, 16, 17)
  - 14. The apparatus of claim 13, wherein the translation system logic is further configured to determine whether the questionable words or phrases were encountered less than a predetermined threshold number of times in training data used to train the translation system.
  - 15. The apparatus of claim 13, wherein the translation system logic is further configured to:
    - encounter a source word or phrase in the source material during the translating;
      
      generate two or more hypotheses representing possible translations of the source word or phrase into the destination language;
      
      determine respective probabilities of the two or more hypotheses;
      
      select a most-likely hypothesis; and
      
      determine that the probability associated with the most-likely hypothesis is within a predetermined amount of a probability associated with another hypothesis.
  - 16. The apparatus of claim 13, wherein the video adapter is further configured to:
    - present one or more alternatives for the questionable words or phrases.
  - 17. The apparatus of claim 16, wherein the video adapter is further configured to:
    - receive a selection of one of the alternatives; and
      
      update one or more models applied by the translation system based on the selected alternative.
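
The feedback loop in claims 5 and 6, 11 and 12, and 16 and 17 (present alternatives, receive a selection, update a model) can be sketched with a toy count-based table. Everything here is an assumption for illustration: the `PhraseTable` class, its methods, and the choice to update by incrementing a count are hypothetical, since the claims do not say how the models are updated. The relative-frequency score loosely follows the scoring described in claim 18.

```python
class PhraseTable:
    """Toy translation model: source phrase -> counts of destination phrases."""

    def __init__(self):
        self.counts = {}

    def observe(self, source, destination, weight=1):
        """Record that source was translated as destination (weight times)."""
        row = self.counts.setdefault(source, {})
        row[destination] = row.get(destination, 0) + weight

    def score(self, source, destination):
        """Relative frequency of destination among translations of source."""
        row = self.counts.get(source, {})
        total = sum(row.values())
        return row.get(destination, 0) / total if total else 0.0

    def alternatives(self, source, chosen):
        """Other known translations of source, most frequent first."""
        row = self.counts.get(source, {})
        return sorted((d for d in row if d != chosen), key=row.get, reverse=True)

table = PhraseTable()
table.observe("banco", "bank", weight=3)
table.observe("banco", "bench", weight=1)

before = table.score("banco", "bench")
alts = table.alternatives("banco", chosen="bank")

# Simulate a user picking the first alternative; the update is a count increment.
table.observe("banco", alts[0])
after = table.score("banco", "bench")
```

After the simulated selection the relative frequency of "bench" rises from 1/4 to 2/5, so repeated selections would eventually change which hypothesis the table ranks first.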

Current Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Original Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Inventors: Hughes, William Arthur; Eck, Matthias Gerhard; Rottmann, Kay
Primary Examiner(s): Thomas-Homescu, Anne L
Application Number: US15/192,076
Publication Number: US 20170371867A1
Time in Patent Office: 1,082 Days
CPC Class Codes:
G06F 40/44 Statistical methods, e.g. p...
G06F 40/51 Translation evaluation
