Language model using reverse translations

US 10,460,040 B2
Filed: 06/27/2016
Issued: 10/29/2019
Est. Priority Date: 06/27/2016
Status: Active Grant

First Claim

1. A method comprising:

accessing a translation system, the translation system configured to generate a machine translation of source material from a source language into a destination language, the translation system being trained using destination language training data and comprising:

a translation model configured to receive the source material and generating one or more destination language hypotheses for the source material, and

a language model configured to select one of the destination language hypotheses based on an analysis of the destination language training data;

analyzing supplemental destination language training data for training the language model, the supplemental destination language training data comprising one or more of:

monolingual destination language material that has been previously machine translated from the source language, or

destination language material for which translation into the source language has been previously requested; and

based on the analyzing, modifying the language model to account for the supplemental destination language training data.
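
The two roles in this claim (a translation model that proposes hypotheses, and a language model that picks one and is later modified using supplemental destination-language data) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the patent does not specify a unigram model or add-one smoothing, the class `UnigramLM` and the function `select_hypothesis` are hypothetical names, and the sentences are invented toy data.

```python
import math
from collections import Counter


class UnigramLM:
    """Toy add-one-smoothed unigram model over destination-language text.

    Assumption: a real system would use a far richer language model.
    """

    def __init__(self, sentences):
        self.counts = Counter()
        self.update(sentences)

    def update(self, sentences):
        # "Modifying the language model": fold new word counts into the old ones.
        for sentence in sentences:
            self.counts.update(sentence.lower().split())

    def log_prob(self, sentence):
        total = sum(self.counts.values())
        vocab = len(self.counts) + 1  # +1 reserves probability mass for unseen words
        return sum(
            math.log((self.counts[word] + 1) / (total + vocab))
            for word in sentence.lower().split()
        )


def select_hypothesis(lm, hypotheses):
    """Return the hypothesis that the language model scores highest."""
    return max(hypotheses, key=lm.log_prob)


# Initial destination-language training data (invented, finance-flavoured).
base_data = ["the bank approved the loan", "the bank raised the rate"]
# Supplemental in-domain destination-language data (invented).
supplemental = [
    "we sat on the river shore",
    "the shore was quiet",
    "boats left the shore",
]
# Hypotheses that a translation model might propose for one source sentence.
hypotheses = ["we walked along the bank", "we walked along the shore"]

lm = UnigramLM(base_data)
before = select_hypothesis(lm, hypotheses)
lm.update(supplemental)
after = select_hypothesis(lm, hypotheses)
print(before, "|", after)
```

In this toy setup the selection shifts from the "bank" hypothesis to the "shore" hypothesis once the supplemental data has been folded into the counts, which is the effect the claim attributes to in-domain training material.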

2 Assignments

0 Petitions

Abstract

Exemplary embodiments relate to techniques for improving machine translation systems. The machine translation system may apply one or more models for translating material from a source language into a destination language. The models are initially trained using training data. According to exemplary embodiments, supplemental training data is used to train the models, where the supplemental training data uses in-domain material to improve the quality of output translations. In-domain data may include data that relates to the same or similar topics as those expected to be encountered in a translation of material from the source language into the destination language. In-domain data may include material previously translated from the source language into the destination language, material similar to previous translations, and destination language material that has previously been the subject of a request for translation into the source language.
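
As a rough illustration of the three in-domain sources named in the abstract, the following sketch gathers destination-language text from a request log. It is only a sketch under stated assumptions: the record type `TranslationRequest`, the function `collect_supplemental`, the word-overlap (Jaccard) similarity measure, and the 0.3 threshold are all hypothetical choices, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationRequest:
    """Hypothetical log record for one translation request."""
    text_in: str    # text the user asked to translate
    text_out: str   # machine translation that was returned
    from_lang: str
    to_lang: str


def jaccard(a, b):
    """Word-set overlap between two texts, in the range [0, 1]."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def collect_supplemental(log, pool, source, destination, threshold=0.3):
    """Gather destination-language in-domain text from three sources."""
    # 1. Material previously machine translated from source to destination.
    forward = [r.text_out for r in log
               if r.from_lang == source and r.to_lang == destination]
    # 2. Destination-language material whose translation into the source
    #    language was requested (the "reverse translation" case).
    reverse = [r.text_in for r in log
               if r.from_lang == destination and r.to_lang == source]
    # 3. Untranslated material similar to previous translations
    #    (similarity measure and threshold are assumptions).
    similar = [t for t in pool
               if any(jaccard(t, f) >= threshold for f in forward)]
    return forward + reverse + similar


log = [
    TranslationRequest("wo ist der strand", "where is the beach", "de", "en"),
    TranslationRequest("happy birthday my friend",
                       "alles gute zum geburtstag mein freund", "en", "de"),
    TranslationRequest("bonjour", "hello", "fr", "en"),
]
pool = ["the beach is where we met", "quarterly earnings rose sharply"]

supplemental_data = collect_supplemental(log, pool, source="de", destination="en")
print(supplemental_data)
```

With German as the source and English as the destination, the sketch keeps the earlier English output, the English text that a user asked to have translated into German, and the one pool sentence that overlaps with an earlier translation.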

9 Citations

17 Claims

1. A method comprising:
- accessing a translation system, the translation system configured to generate a machine translation of source material from a source language into a destination language, the translation system being trained using destination language training data and comprising:

  a translation model configured to receive the source material and generating one or more destination language hypotheses for the source material, and

  a language model configured to select one of the destination language hypotheses based on an analysis of the destination language training data;

  analyzing supplemental destination language training data for training the language model, the supplemental destination language training data comprising one or more of:

  monolingual destination language material that has been previously machine translated from the source language, or

  destination language material for which translation into the source language has been previously requested; and
  
  based on the analyzing, modifying the language model to account for the supplemental destination language training data.
- View Dependent Claims (2, 3, 4, 5, 6)
  - 2. The method of claim 1, wherein the supplemental destination language training data comprises posts from a social network.
  - 3. The method of claim 1 wherein the translation model is configured to be trained using bilingual training data comprising material in the source language and material in the destination language, and the language model is configured to be trained using monolingual training data consisting of material in the destination language.
  - 4. The method of claim 1, wherein the supplemental destination language training data contains training material in one or more domains associated with the source language.
  - 5. The method of claim 1, wherein the supplemental destination language training data comprises untranslated destination language material that includes topics similar to topics found in translated destination language material.
  - 6. The method of claim 1, wherein:
    - the translation system applies a model selected from a plurality of models for translating the source material into the destination material;
      
      the plurality of models comprise:

      a first language model targeted to a first demographic group, and

      a second language model targeted to a second demographic group; and

      further comprising:
      
      analyzing demographic information of an originator of a request to translate the source material into the destination language;
      
      selecting the first language model or the second language model based on the demographic information; and
      
      applying the selected language model to translate the source material.

7. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to:
- access a translation system, the translation system configured to generate a machine translation of source material from a source language into a destination language, the translation system being trained using destination language training data and comprising:

  a translation model configured to receive the source material and generating one or more destination language hypotheses for the source material, and

  a language model configured to select one of the destination language hypotheses based on an analysis of the destination language training data;

  analyze supplemental destination language training data for training the language model, the supplemental destination language training data comprising one or more of:

  monolingual destination language material that has been previously machine translated from the source language, or

  destination language material for which translation into the source language has been previously requested; and
  
  based on the analyzing, modify the language model to account for the supplemental destination language training data.
- View Dependent Claims (8, 9, 10, 11, 12)
  - 8. The medium of claim 7, wherein the supplemental destination language training data comprises posts from a social network.
  - 9. The medium of claim 7, wherein the translation model is configured to be trained using bilingual training data comprising material in the source language and material in the destination language, and the language model is configured to be trained using monolingual training data consisting of material in the destination language.
  - 10. The medium of claim 7, wherein the supplemental destination language training data contains training material in one or more domains associated with the source language.
  - 11. The medium of claim 7, wherein the supplemental destination language training data comprises untranslated destination language material that includes topics similar to topics found in translated destination language material.
  - 12. The medium of claim 7, wherein:
    - the translation system applies a model selected from a plurality of models for translating the source material into the destination material;
      
      the plurality of models comprise:

      a first language model targeted to a first demographic group, and

      a second language model targeted to a second demographic group; and

      further storing instructions for:
      
      analyzing demographic information of an originator of a request to translate the source material into the destination language;
      
      selecting the first language model or the second language model based on the demographic information; and
      
      applying the selected language model to translate the source material.

13. An apparatus comprising:
- a non-transitory computer-readable medium configured to store logic for implementing a translation system, the translation system configured to generate a machine translation of source material from a source language into a destination language, the translation system being trained using destination language training data and comprising:

  a translation model configured to receive the source material and generating one or more destination language hypotheses for the source material, and

  a language model configured to select one of the destination language hypotheses based on an analysis of the destination language training data;

  a processor configured to:

  analyze supplemental destination language training data for training the language model, the supplemental destination language training data comprising one or more of:

  monolingual destination language material that has been previously machine translated from the source language, or

  destination language material for which translation into the source language has been previously requested; and
  
  based on the analyzing, modify the language model to account for the supplemental destination language training data.
- View Dependent Claims (14, 15, 16, 17)
  - 14. The apparatus of claim 13, wherein the supplemental destination language training data comprises posts from a social network.
  - 15. The apparatus of claim 13, wherein the supplemental destination language training data contains training material in one or more domains associated with the source language.
  - 16. The apparatus of claim 13, wherein the supplemental destination language training data comprises untranslated destination language material that includes topics similar to topics found in translated destination language material.
  - 17. The apparatus of claim 13, wherein:
    - the translation system applies a model selected from a plurality of models for translating the source material into the destination material;
      
      the plurality of models comprise:

      a first language model targeted to a first demographic group, and

      a second language model targeted to a second demographic group; and

      the processor is further configured to:
      
      analyze demographic information of an originator of a request to translate the source material into the destination language;
      
      select the first language model or the second language model based on the demographic information; and
      
      apply the selected language model to translate the source material.
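
The demographic selection step recited in claims 6, 12, and 17 can be sketched as follows. This is an illustrative sketch only: the claims do not say which demographic attribute is used or how models are scored, so the age-based rule in `group_for`, the group names, the vocabulary-counting stand-in `make_vocab_scorer`, and the example sentences are all hypothetical.

```python
def make_vocab_scorer(preferred_words):
    """Toy stand-in for a language model: counts the words it prefers."""
    preferred = set(preferred_words)

    def score(sentence):
        return sum(word in preferred for word in sentence.lower().split())

    return score


# One language model per demographic group (group names are assumptions).
MODELS = {
    "under_25": make_vocab_scorer({"awesome", "cool"}),
    "25_and_over": make_vocab_scorer({"excellent", "great"}),
}


def group_for(requester):
    """Analyze the requester's demographic information (hypothetical rule)."""
    return "under_25" if requester["age"] < 25 else "25_and_over"


def translate(hypotheses, requester):
    """Select the group's language model and apply it to pick a hypothesis."""
    language_model = MODELS[group_for(requester)]
    return max(hypotheses, key=language_model)


# Hypotheses that a translation model might propose for one source sentence.
hypotheses = ["that concert was excellent", "that concert was awesome"]

younger = translate(hypotheses, {"age": 19})
older = translate(hypotheses, {"age": 47})
print(younger, "|", older)
```

The same set of hypotheses yields a different output for each requester because only the selected language model differs; the translation model's proposals are unchanged.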

Current Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Original Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Inventors: Eck, Matthias Gerhard
Primary Examiner(s): Tzeng, Feng-Tzer
Application Number: US15/194,249
Publication Number: US 20170371866A1
Time in Patent Office: 1,219 Days
Field of Search: 704/2, 704/3, 704/4, 704/5, 704/235, 704/246, 704/270, 704/275, 704/277

CPC Class Codes

G06F 40/216   using statistical methods
G06F 40/40   Processing or translation o...
G06F 40/44   Statistical methods, e.g. p...
G06F 40/49   using very large corpora, e...
