TECHNIQUES FOR TRANSLITERATING INPUT TEXT FROM A FIRST CHARACTER SET TO A SECOND CHARACTER SET

  • US 20150088487A1
  • Filed: 02/28/2012
  • Published: 03/26/2015
  • Est. Priority Date: 02/28/2012
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    receiving, at a computing device having one or more processors, input text in a first character set;

    determining, at the computing device, a set of possible transliterations of the input text based on a plurality of mapping standards, each possible transliteration of the set of possible transliterations corresponding to a transliteration of the input text into a second character set corresponding to a target language, each mapping standard of the plurality of mapping standards defining a mapping of characters in the first character set to characters in the second character set, and each mapping standard having an associated transliteration probability, each transliteration probability being indicative of a likelihood that its corresponding mapping standard is appropriate for transliterating the input text to the second character set;

    determining a transliteration score for each of the possible transliterations based on the transliteration probabilities, the transliteration score being indicative of a likelihood that its corresponding possible transliteration is an accurate transliteration of the input text;

    determining, at the computing device, a set of candidate words in the target language based on the set of possible transliterations and a text corpus of the target language, wherein the set of candidate words includes words in the text corpus that match one of the set of possible transliterations, that are similar to one of the set of possible transliterations, and sound similar to one of the set of possible transliterations;

    determining, at the computing device, a likelihood score for each one of the set of candidate words based on a language model in the target language and one or more previous words received, each likelihood score being indicative of a probability that a corresponding candidate word corresponds to the input text;

    providing, from the computing device, one or more candidate words of the set of candidate words based on the likelihood scores;

    receiving a user selection indicating one of the candidate words;

    determining, at the computing device, a particular mapping standard of the plurality of mapping standards on which the selected candidate word was based; and

    adjusting, at the computing device, the transliteration probabilities based on the determination of the particular mapping standard.
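
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way the pieces could fit together. Everything named here is an assumption made for illustration: the two toy Latin-to-Cyrillic mapping tables ("A" and "B"), the three-word corpus, the bigram language model values, the similarity threshold, and the additive update rate. String similarity from `difflib.SequenceMatcher` stands in for both the "similar to" and "sound similar to" tests, and the transliteration score is taken as the summed probability of the standards that produce a given transliteration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class MappingStandard:
    name: str
    table: dict          # source character sequence -> target characters
    probability: float   # likelihood that this standard suits the input


def transliterate(text, table):
    """Greedy longest-match mapping; unmapped characters pass through."""
    keys = sorted(table, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        key = next((k for k in keys if text.startswith(k, i)), None)
        out.append(table[key] if key else text[i])
        i += len(key) if key else 1
    return "".join(out)


def possible_transliterations(text, standards):
    """Return {transliteration: (score, [names of standards producing it])}."""
    result = {}
    for std in standards:
        target = transliterate(text, std.table)
        score, names = result.get(target, (0.0, []))
        result[target] = (score + std.probability, names + [std.name])
    return result


def candidate_words(translits, corpus, threshold=0.6):
    """Return {word: (weight, [standard names])} for corpus words that
    match or resemble a transliteration (assumed similarity measure)."""
    candidates = {}
    for word in corpus:
        for target, (score, names) in translits.items():
            similarity = SequenceMatcher(None, target, word).ratio()
            weight = similarity * score
            best = candidates.get(word, (0.0, []))[0]
            if similarity >= threshold and weight > best:
                candidates[word] = (weight, names)
    return candidates


def rank(candidates, bigram_lm, previous_word, floor=0.01):
    """Weight each candidate by a bigram probability given the previous word."""
    scored = {w: weight * bigram_lm.get((previous_word, w), floor)
              for w, (weight, _) in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True), scored


def adjust(standards, selected_names, rate=0.1):
    """Boost the standards behind the selected word, then renormalize."""
    for std in standards:
        if std.name in selected_names:
            std.probability += rate
    total = sum(s.probability for s in standards)
    for std in standards:
        std.probability /= total


# Hypothetical toy data: the two standards differ only in how they map "e".
base = {"p": "п", "r": "р", "i": "и", "v": "в", "t": "т"}
standards = [
    MappingStandard("A", {**base, "e": "е"}, 0.5),
    MappingStandard("B", {**base, "e": "э"}, 0.5),
]
corpus = ["привет", "привод", "пример"]
bigram_lm = {("всем", "привет"): 0.6, ("всем", "пример"): 0.1}

translits = possible_transliterations("privet", standards)
candidates = candidate_words(translits, corpus)
ranking, scores = rank(candidates, bigram_lm, previous_word="всем")
selected = ranking[0]                       # stand-in for the user's selection
adjust(standards, candidates[selected][1])  # standard "A" gains probability
```

In this sketch the feedback step is a simple additive boost followed by renormalization, so the probabilities keep summing to one. The claim only requires that the transliteration probabilities be adjusted based on the mapping standard behind the selected word, so other update rules would fit equally well.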
