Character and word level language models for out-of-vocabulary text input

  • US 9,047,268 B2
  • Filed: 02/22/2013
  • Issued: 06/02/2015
  • Est. Priority Date: 01/31/2013
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving, by a computing device, an indication of user-inputted text;

    storing, by the computing device, a lexicon that includes a set of in-lexicon candidate strings and does not include a set of out-of-lexicon candidate strings, wherein each of the in-lexicon candidate strings and each of the out-of-lexicon candidate strings is a respective word in a particular language;

    for each respective in-lexicon candidate string of the set of in-lexicon candidate strings, determining, by the computing device, a respective score for the respective in-lexicon candidate string, wherein:

    the respective score is based at least in part on a probability of the respective in-lexicon candidate string as a whole being entered, and

    the probability of the respective in-lexicon candidate string being entered is affected by a word-level context of the respective in-lexicon candidate string that includes one or more character strings that precede the respective in-lexicon candidate string in the user-inputted text;

    for each respective out-of-lexicon candidate string of the set of out-of-lexicon candidate strings, determining, by the computing device, a respective score for the respective out-of-lexicon candidate string, wherein the respective score for the respective out-of-lexicon candidate string is based at least in part on respective probabilities of individual characters in the respective out-of-lexicon candidate string being entered and is not based on word-level probabilities;

    determining, by the computing device and based at least in part on the scores for the in-lexicon candidate strings and the scores for the out-of-lexicon candidate strings, a combined set of candidate strings from the set of in-lexicon candidate strings and the set of out-of-lexicon candidate strings, the combined set of candidate strings including at least one in-lexicon candidate string from the set of in-lexicon candidate strings and at least one out-of-lexicon candidate string from the set of out-of-lexicon candidate strings;

    outputting, by the computing device, at least a portion of the combined set of candidate strings for display; and

    responsive to an indication of a selection of a candidate string from the combined set of candidate strings, outputting, by the computing device, for display in place of the user-inputted text, the selected candidate string.
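
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the toy lexicon and counts, the add-one-smoothed bigram model standing in for the word-level context, the character-bigram model standing in for the per-character probabilities, and the function names. The claim does not say how scores from the two models are made comparable; the sketch simply compares log-probabilities directly and reserves one slot for the best candidate of each kind.

```python
import math

# Hypothetical toy data for illustration only; none of it comes from the patent.
LEXICON = {"hello", "help", "held"}
BIGRAM_COUNTS = {("say", "hello"): 8, ("say", "help"): 1, ("say", "held"): 1}
CHAR_TRAINING_WORDS = ["hello", "help", "held", "helix", "hero"]
ALPHABET_SIZE = 27  # assumed: 26 letters plus an end-of-word marker


def word_level_score(candidate, previous_word):
    """Log-probability of a whole word given the preceding word (add-one smoothing)."""
    context_total = sum(n for (prev, _), n in BIGRAM_COUNTS.items() if prev == previous_word)
    count = BIGRAM_COUNTS.get((previous_word, candidate), 0)
    return math.log((count + 1) / (context_total + len(LEXICON)))


def build_char_counts(words):
    """Count character-to-character transitions, with ^ and $ as word boundaries."""
    counts = {}
    for word in words:
        padded = "^" + word + "$"
        for a, b in zip(padded, padded[1:]):
            counts.setdefault(a, {})
            counts[a][b] = counts[a].get(b, 0) + 1
    return counts


CHAR_COUNTS = build_char_counts(CHAR_TRAINING_WORDS)


def char_level_score(candidate):
    """Sum of per-character log-probabilities; uses no word-level probability."""
    padded = "^" + candidate + "$"
    score = 0.0
    for a, b in zip(padded, padded[1:]):
        following = CHAR_COUNTS.get(a, {})
        total = sum(following.values())
        score += math.log((following.get(b, 0) + 1) / (total + ALPHABET_SIZE))
    return score


def combined_candidates(in_lexicon, out_of_lexicon, previous_word, top_n=3):
    """Merge both scored sets, keeping at least one candidate from each.

    Assumes both input lists are non-empty.
    """
    in_scored = sorted(((word_level_score(c, previous_word), c) for c in in_lexicon), reverse=True)
    out_scored = sorted(((char_level_score(c), c) for c in out_of_lexicon), reverse=True)
    chosen = [in_scored[0], out_scored[0]]  # one slot reserved for each kind
    remaining = sorted(in_scored[1:] + out_scored[1:], reverse=True)
    chosen += remaining[: max(0, top_n - 2)]
    return [c for _, c in sorted(chosen, reverse=True)]


def replace_input(text, typed, selected):
    """Replace the typed token with the selected candidate.

    Assumes `text` ends with `typed`.
    """
    return text[: len(text) - len(typed)] + selected


# Usage: the user has typed "say helo"; "helo" and "hxqz" are out-of-lexicon strings.
candidates = combined_candidates(["hello", "help"], ["helo", "hxqz"], "say")
print(candidates)
print(replace_input("say helo", "helo", candidates[0]))
```

Reserving one slot per kind is just one way to satisfy the requirement that the combined set contain at least one in-lexicon and at least one out-of-lexicon candidate. A real system would more likely calibrate or weight the two models before merging them.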
