System and method for productive generation of compound words in statistical machine translation

  • US 8,781,810 B2
  • Filed: 07/25/2011
  • Issued: 07/15/2014
  • Est. Priority Date: 07/25/2011
  • Status: Expired due to Fees
First Claim

1. A method for making merging decisions for a translation comprising:

  • providing a translated text string in a target language of a source text string in a source language;

    with a merging system, wherein the merging system is implemented with a computer processor, outputting decisions on merging of pairs of words in the translated text string, the merging system comprising at least one of:

    a set of stored heuristics comprising at least a first heuristic by which two consecutive words in the string are considered for merging if an observed frequency f1 of the two consecutive words as a closed compound word is larger than an observed frequency f2 of the two consecutive words as a bigram, and

    a merging model trained on features associated with pairs of consecutive tokens of text strings in a training set and predetermined merging decisions for the pairs to predict merging decisions for a new translated text string; and

    outputting a translation in the target language based on the merging decisions for the translated text string.
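
The first heuristic recited in claim 1 (merge when the closed-compound frequency f1 exceeds the bigram frequency f2) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names `merge_decisions` and `apply_merges`, the frequency dictionaries, and the toy counts are hypothetical, and the sketch assumes a compound is formed by plain concatenation of the two words, ignoring any linking elements a real target language may require.

```python
def merge_decisions(tokens, compound_freq, bigram_freq):
    """Return one boolean per pair of consecutive tokens (True = merge).

    compound_freq maps a closed compound string to its observed count (f1).
    bigram_freq maps a (left, right) tuple to its observed count (f2).
    Both tables are assumed inputs for illustration only.
    """
    decisions = []
    for left, right in zip(tokens, tokens[1:]):
        f1 = compound_freq.get(left + right, 0)   # count as one closed compound
        f2 = bigram_freq.get((left, right), 0)    # count as two separate words
        decisions.append(f1 > f2)
    return decisions


def apply_merges(tokens, decisions):
    """Build the output token list by joining pairs marked for merging."""
    if not tokens:
        return []
    merged = [tokens[0]]
    for token, merge in zip(tokens[1:], decisions):
        if merge:
            merged[-1] += token   # attach to the previous output token
        else:
            merged.append(token)
    return merged


# Toy counts, invented for this example.
tokens = ["der", "kinder", "garten", "ist", "neu"]
compound_freq = {"kindergarten": 40}
bigram_freq = {("kinder", "garten"): 2, ("der", "kinder"): 7}

decisions = merge_decisions(tokens, compound_freq, bigram_freq)
translation = " ".join(apply_merges(tokens, decisions))
```

In this sketch only the pair ("kinder", "garten") is merged, because its assumed compound count (40) exceeds its assumed bigram count (2). The claim says such pairs are "considered for merging", so a fuller system might treat this test as a candidate filter and not as a final decision.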
