Method and system for information extraction

  • US 20050131886A1
  • Filed: 01/11/2005
  • Published: 06/16/2005
  • Est. Priority Date: 06/22/2000
  • Status: Active Grant
First Claim

1. A method for extracting information from a natural language text corpus based on a natural language query, comprising the steps of:

    analyzing said natural language text corpus with respect to surface structure of word tokens and surface syntactic roles of constituents;

    indexing and storing the analyzed natural language text corpus;

    analyzing a natural language query with respect to surface structure of word tokens and surface syntactic roles of constituents;

    creating a number of surface variants of the analyzed natural language query by replacing word tokens of said natural language query, and for at least one surface variant by rearranging word tokens of said natural language query, in such a way that said number of surface variants are equivalent to said natural language query with respect to lexical meaning of word tokens and surface syntactic roles of constituents;

    comparing said number of surface variants and said analyzed natural language query with the indexed and stored analyzed natural language text corpus; and

    extracting from said indexed and stored analyzed natural language text corpus, each portion of text comprising a string of word tokens that matches any one of said surface variants or said analyzed natural language query.
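The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It makes several simplifying assumptions: "analysis" is reduced to lowercase tokenization, the query is assumed to be exactly three word tokens in subject-verb-object order, the `SYNONYMS` lexicon is hypothetical, and the only rearrangement shown is a toy active-to-passive pattern. All function names are invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical lexicon of interchangeable word tokens (assumption, not from the patent).
SYNONYMS = {"acquired": ["bought", "purchased"]}


def tokenize(text):
    """Split text into lowercase word tokens (a stand-in for surface analysis)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_corpus(sentences):
    """Analyze each sentence and map every word token to the sentence ids containing it."""
    index = defaultdict(set)
    analyzed = []
    for sid, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        analyzed.append(tokens)
        for tok in tokens:
            index[tok].add(sid)
    return index, analyzed


def analyze_query(query):
    """Assume the query is exactly three tokens: subject, verb, object."""
    subject, verb, obj = tokenize(query)
    return subject, verb, obj


def surface_variants(subject, verb, obj):
    """Build token sequences that keep the same roles: synonym replacement plus a passive rearrangement."""
    variants = []
    for v in [verb] + SYNONYMS.get(verb, []):
        variants.append([subject, v, obj])               # replaced token, original order
        variants.append([obj, "was", v, "by", subject])  # rearranged (toy passive form)
    return variants


def contains(tokens, pattern):
    """Return True if pattern occurs as a contiguous run inside tokens."""
    n = len(pattern)
    return any(tokens[i:i + n] == pattern for i in range(len(tokens) - n + 1))


def extract(sentences, query):
    """Return the sentences matching the analyzed query or any of its surface variants."""
    index, analyzed = index_corpus(sentences)
    hits = []
    for variant in surface_variants(*analyze_query(query)):
        # Use the index to narrow the search to sentences containing every token.
        candidates = set.intersection(*(index.get(t, set()) for t in variant))
        for sid in sorted(candidates):
            if contains(analyzed[sid], variant) and sentences[sid] not in hits:
                hits.append(sentences[sid])
    return hits


corpus = [
    "IBM acquired Lotus in 1995.",
    "Lotus was bought by IBM.",
    "Lotus acquired IBM stock.",
]
results = extract(corpus, "IBM acquired Lotus")
```

In this toy corpus the first two sentences are extracted, while the third is rejected: it contains the same word tokens, but with the subject and object roles swapped.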
