Identification and Extraction of New Terms in Documents

  • US 20130246045A1
  • Filed: 03/14/2012
  • Published: 09/19/2013
  • Est. Priority Date: 03/14/2012
  • Status: Abandoned Application
First Claim

1. A method comprising:

  • parsing a document to obtain an n-gram phrase indicative of a new term, the phrase comprised of a plurality of words;

  • breaking the n-gram phrase into a bi-gram phrase comprised of a first and a second phrase part, the first and second phrase part including at least one word;

  • determining whether the first or second phrase part is in a vocabulary collection;

  • estimating the probability that the bi-gram phrase should be in the vocabulary collection if it is not; and

  • adding the bi-gram phrase to the vocabulary collection if the probability that the bi-gram phrase should be in the vocabulary collection exceeds a minimum threshold level.
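
The steps of the claim can be read as a small pipeline: split the phrase into two parts, check the parts against the vocabulary, score the phrase, and add it if the score clears a threshold. The following is a minimal Python sketch of that reading, not the applicant's implementation. The claim does not say how the probability is estimated, so the conditional-frequency estimator here is purely an assumption, as are all function names, the choice to score only splits with at least one known part, and the example counts and threshold.

```python
from collections import Counter
from typing import Callable, List, Set, Tuple

# Hypothetical estimator signature: (first part, second part) -> probability.
Estimator = Callable[[str, str], float]


def split_into_two_parts(words: List[str]) -> List[Tuple[str, str]]:
    """Return every way to break an n-gram into two contiguous, non-empty parts."""
    return [(" ".join(words[:i]), " ".join(words[i:])) for i in range(1, len(words))]


def make_conditional_estimator(first_counts: Counter, pair_counts: Counter) -> Estimator:
    """Assumed estimator: how often `second` follows `first`, relative to all
    observed occurrences of `first`. The claim itself names no formula."""
    def estimate(first: str, second: str) -> float:
        seen = first_counts[first]
        return pair_counts[(first, second)] / seen if seen else 0.0
    return estimate


def maybe_add_phrase(
    words: List[str],
    vocabulary: Set[str],
    estimate: Estimator,
    threshold: float,
) -> bool:
    """Add the phrase to `vocabulary` if some two-part split scores above `threshold`."""
    phrase = " ".join(words)
    if phrase in vocabulary:
        return False  # already known, nothing to estimate
    for first, second in split_into_two_parts(words):
        # Assumption: a split is only scored when at least one part is known.
        if first not in vocabulary and second not in vocabulary:
            continue
        if estimate(first, second) > threshold:
            vocabulary.add(phrase)
            return True
    return False


# Illustrative data only; the counts and the 0.5 threshold are made up.
vocabulary = {"machine", "learning"}
estimate = make_conditional_estimator(
    Counter({"machine": 10}),
    Counter({("machine", "learning"): 8}),
)
added = maybe_add_phrase(["machine", "learning"], vocabulary, estimate, threshold=0.5)
```

Passing the estimator in as a callable keeps the thresholding logic separate from the scoring method, so a different probability model could be swapped in without touching the rest of the sketch.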

  • 8 Assignments