Word detection

  • US 7,917,355 B2
  • Filed: 08/23/2007
  • Issued: 03/29/2011
  • Est. Priority Date: 08/23/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

    determining, by one or more computers, first word frequencies for existing words and a candidate word in a training corpus, the candidate word defined by a sequence of constituent words, each constituent word being an existing word in a dictionary;

    determining, by the one or more computers, second word frequencies for the constituent words and the candidate word in a development corpus;

    determining, by the one or more computers, a candidate word entropy-related measure based on the second word frequency of the candidate word and the first word frequencies of the constituent words and the candidate word;

    determining, by the one or more computers, an existing word entropy-related measure based on the second word frequencies of the constituent words and the first word frequencies of the constituent words and the candidate word; and

    determining, by the one or more computers, that the candidate word is a new word if the candidate word entropy-related measure exceeds the existing word entropy-related measure.
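
The steps of claim 1 can be sketched in code. The claim names which frequencies feed each measure but does not give formulas, so the two expressions below are assumptions chosen only to use the same inputs: the candidate measure weights the development-corpus count of the candidate by the log ratio of its training probability to the product of the constituents' training probabilities, and the existing word measure weights each constituent's development-corpus count by the log ratio of its training probability to that probability reduced by the candidate's. The function names (count_sequence, entropy_measures, is_new_word) and the handling of degenerate counts are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


def count_sequence(tokens, seq):
    """Count occurrences of the consecutive token sequence `seq` in `tokens`."""
    seq = tuple(seq)
    n = len(seq)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tuple(tokens[i:i + n]) == seq
    )


def entropy_measures(train_tokens, dev_tokens, constituents):
    """Return (candidate_measure, existing_measure), or None if degenerate.

    The formulas are illustrative assumptions, not the patent's definitions.
    """
    constituents = tuple(constituents)
    total = len(train_tokens)

    # First word frequencies: training corpus.
    train_counts = Counter(train_tokens)
    train_cand = count_sequence(train_tokens, constituents)
    part_counts = [train_counts[w] for w in constituents]

    # Assumption: skip candidates never seen in training, or constituents
    # that never occur outside the candidate (the log terms are undefined).
    if train_cand == 0 or any(c <= train_cand for c in part_counts):
        return None

    p_cand = train_cand / total
    p_parts = [c / total for c in part_counts]

    # Second word frequencies: development corpus.
    dev_counts = Counter(dev_tokens)
    dev_cand = count_sequence(dev_tokens, constituents)

    candidate_measure = dev_cand * math.log(p_cand / math.prod(p_parts))
    existing_measure = sum(
        dev_counts[w] * math.log(p / (p - p_cand))
        for w, p in zip(constituents, p_parts)
    )
    return candidate_measure, existing_measure


def is_new_word(train_tokens, dev_tokens, constituents):
    """Final step of the claim: new word if the candidate measure is larger."""
    measures = entropy_measures(train_tokens, dev_tokens, constituents)
    if measures is None:
        return False
    candidate_measure, existing_measure = measures
    return candidate_measure > existing_measure


# Toy corpora (made up for illustration): "hot dog" co-occurs far more often
# than the individual frequencies of "hot" and "dog" would suggest.
train = ["hot", "dog"] * 10 + ["hot", "day"] * 40 + ["big", "dog"] * 40 + ["the"] * 9820
dev = ["hot", "dog"] * 10 + ["hot", "day"] * 40 + ["big", "dog"] * 40
print(is_new_word(train, dev, ("hot", "dog")))
```

Keeping the two measures in a separate function from the final comparison mirrors the structure of the claim, where the comparison is its own step. A different pair of formulas could be substituted in entropy_measures without changing is_new_word.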

  • 3 Assignments