Word Detection

  • US 20090055168A1
  • Filed: 08/23/2007
  • Published: 02/26/2009
  • Est. Priority Date: 08/23/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

    determining first word frequencies for existing words and a candidate word in a training corpus, the candidate word defined by a sequence of constituent words, each constituent word being an existing word in a dictionary;

    determining second word frequencies for the constituent words and the candidate word in a development corpus;

    determining a candidate word entropy-related measure based on the second word frequency of the candidate word and the first word frequencies of the constituent words and the candidate word;

    determining an existing word entropy-related measure based on the second word frequencies of the constituent words and the first word frequencies of the constituent words and the candidate word; and

    determining that the candidate word is a new word if the candidate word entropy-related measure exceeds the existing word entropy-related measure.
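
The steps of claim 1 can be illustrated with a short sketch. The claim does not fix a formula for the "entropy-related measure", so the sketch below assumes one possible reading: each measure is a sum of development-corpus relative frequencies multiplied by the log of the corresponding training-corpus relative frequencies. The function names `count_ngram` and `is_new_word`, the whitespace-token inputs, and the rule that a candidate absent from either corpus is rejected are all hypothetical choices made for illustration, not details taken from the patent.

```python
import math
from collections import Counter


def count_ngram(tokens, ngram):
    """Count contiguous occurrences of the word sequence `ngram` in `tokens`."""
    n = len(ngram)
    target = list(ngram)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)


def is_new_word(training_tokens, development_tokens, constituents):
    """Illustrative reading of claim 1 (the formula itself is an assumption).

    `constituents` is the sequence of existing dictionary words that
    defines the candidate word.
    """
    train_total = len(training_tokens)
    dev_total = len(development_tokens)
    train_counts = Counter(training_tokens)
    dev_counts = Counter(development_tokens)

    # First word frequencies: relative frequencies in the training corpus.
    p_candidate = count_ngram(training_tokens, constituents) / train_total
    p_constituents = [train_counts[w] / train_total for w in constituents]

    # Second word frequencies: relative frequencies in the development corpus.
    q_candidate = count_ngram(development_tokens, constituents) / dev_total
    q_constituents = [dev_counts[w] / dev_total for w in constituents]

    # Assumption: a candidate unseen in either corpus is never accepted.
    # This also guarantees every log argument below is positive.
    if p_candidate == 0 or q_candidate == 0:
        return False

    # Assumed measure: development frequency times log training frequency.
    candidate_measure = q_candidate * math.log(p_candidate)
    existing_measure = sum(
        q * math.log(p) for q, p in zip(q_constituents, p_constituents)
    )

    # Final step of the claim: compare the two measures.
    return candidate_measure > existing_measure
```

Under this assumed measure, a call such as `is_new_word(training_tokens, development_tokens, ("new", "york"))` accepts the candidate when its constituents occur mostly inside the candidate sequence, and rejects it when the constituents are individually common but rarely adjacent in the training corpus. A different choice of measure could change which candidates pass.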

  • 3 Assignments