Method and system for information extraction

  • US 6,842,730 B1
  • Filed: 06/23/2000
  • Issued: 01/11/2005
  • Est. Priority Date: 06/22/2000
  • Status: Active Grant
First Claim

1. A method of storing a natural language text corpus in a database, comprising the steps of:

  • identifying word tokens of said natural language text corpus;

  • determining locations in the natural language text of the identified word tokens;

  • determining word types associated with the identified word tokens;

  • storing the determined word types in said database, wherein the number of stored word types is less than the number of identified word tokens;

  • storing word token location identifiers identifying the determined locations in the natural language text corpus of the identified word tokens; and

  • linking the stored word token location identifiers to the stored word types, such that, for a given identified word token, the stored word token location identifier identifying the location of the identified word token is logically linked to the stored word type associated with the identified word token.
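
The steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that a word token is a run of word characters, that a word type is the lowercased form of the token, and that a location is a character offset into the text. The table names `word_type` and `token_location` and the functions `store_corpus` and `locations_of` are hypothetical names chosen for this illustration.

```python
import re
import sqlite3


def store_corpus(conn, text):
    """Store each word type once and link every token location to its type.

    Assumptions for this sketch: tokens are runs of word characters,
    a word type is the lowercased token, a location is a character offset.
    Returns the number of word tokens identified.
    """
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS word_type (
            id   INTEGER PRIMARY KEY,
            form TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS token_location (
            id          INTEGER PRIMARY KEY,
            char_offset INTEGER NOT NULL,
            type_id     INTEGER NOT NULL REFERENCES word_type(id)
        );
        """
    )
    token_count = 0
    for match in re.finditer(r"\w+", text):
        form = match.group().lower()
        # Store the word type only if it has not been stored before.
        conn.execute(
            "INSERT OR IGNORE INTO word_type (form) VALUES (?)", (form,)
        )
        (type_id,) = conn.execute(
            "SELECT id FROM word_type WHERE form = ?", (form,)
        ).fetchone()
        # Store the location identifier and link it to the word type.
        conn.execute(
            "INSERT INTO token_location (char_offset, type_id) VALUES (?, ?)",
            (match.start(), type_id),
        )
        token_count += 1
    conn.commit()
    return token_count


def locations_of(conn, form):
    """Return the character offsets of every token of the given word type."""
    rows = conn.execute(
        """
        SELECT t.char_offset
        FROM token_location AS t
        JOIN word_type AS w ON w.id = t.type_id
        WHERE w.form = ?
        ORDER BY t.char_offset
        """,
        (form.lower(),),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    n_tokens = store_corpus(conn, "The cat saw the dog and the bird")
    n_types = conn.execute("SELECT COUNT(*) FROM word_type").fetchone()[0]
    print(n_tokens, n_types)          # 8 tokens, 6 word types
    print(locations_of(conn, "the"))  # [0, 12, 24]
```

Because repeated words share one row in `word_type`, the number of stored word types (6) is smaller than the number of identified tokens (8), and the foreign key `type_id` provides the logical link from each location to its word type.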

  • 4 Assignments