Method and system for extending keyword searching to syntactically and semantically annotated data

  • US 7,526,425 B2
  • Filed: 12/13/2004
  • Issued: 04/28/2009
  • Est. Priority Date: 08/14/2001
  • Status: Expired due to Fees
First Claim

1. A method in a computer system for preparing a corpus of documents for performing electronic searches, each document having at least one sentence, each sentence having a plurality of terms, comprising:

  • for each sentence of each document,

    parsing the sentence under the control of the computer system to generate a parse structure having a plurality of syntactic elements that correspond to the terms of the sentence;

    determining from the structure of the parse structure and the plurality of syntactic elements a corresponding grammatical role for each of a plurality of the terms of the sentence, each grammatical role being at least one of a subject, an object, a governing verb, a modifier, or a part of a prepositional phrase;

    normalizing the plurality of terms of the sentence having corresponding grammatical roles to a plurality of tagged terms, each tagged term indicating an association between the term of the sentence that corresponds to the grammatical role and an associated tag type that specifies the corresponding grammatical role, wherein at least one of the tagged terms has an associated tag type that specifies that the associated term of the sentence is a subject or an object of the sentence, wherein at least one of the tagged terms has an associated tag type that specifies that the associated term of the sentence is a modifier of another term of the sentence that has an associated tag type that specifies that the another term is a subject, object, or verb of the sentence, and wherein at least one of the tagged terms has an associated tag type that additionally specifies semantic information that refers to an entity type that identifies the associated term of the sentence as a type of person, location, or thing; and

    transforming each sentence to an enhanced data structure of terms stored as one or more inverted indexes of terms annotated with relationship information, wherein the plurality of the tagged terms are stored therein and indexed as additional terms of the sentence, each additional term including the term of the sentence and the associated tag type, thereby enabling a search engine to perform relationship searches by determining from the enhanced data structure whether a designated search term having an associated tag type that specifies a grammatical role and/or an entity type is present in the sentence in a same role, in a manner similar to the manner the search engine uses to determine whether a designated term is present in the sentence, at least one of the relationship searches capable of returning a plurality of relationships between at least two entities as a result of a single search specification.
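The indexing idea recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the parser is replaced by hand-written `(term, role, entity_type)` triples, and the key formats (`role:term`, `role/entity:term`, and the `role/entity:*` wildcard), the `PARSED` data, and the function names `tagged_terms`, `build_index`, and `search` are all assumptions made for illustration. The point it shows is that tagged terms are stored in the same inverted index as plain terms, so a relationship search is an ordinary posting-list lookup.

```python
from collections import defaultdict

# Hypothetical parser output: one (term, grammatical_role, entity_type) triple
# per term. A real system would derive these from a parse structure.
PARSED = {
    ("doc1", 0): [("Alice", "subject", "person"), ("visited", "verb", None),
                  ("Paris", "object", "location"), ("sunny", "modifier", None)],
    ("doc2", 0): [("Paris", "subject", "location"), ("hosted", "verb", None),
                  ("Alice", "object", "person")],
}

def tagged_terms(triples):
    """Return the plain and tagged index keys for one sentence."""
    keys = set()
    for term, role, entity in triples:
        norm = term.lower()
        keys.add(norm)                            # ordinary keyword
        keys.add(f"{role}:{norm}")                # term tagged with its role
        if entity is not None:
            keys.add(f"{role}/{entity}:{norm}")   # role plus entity type
            keys.add(f"{role}/{entity}:*")        # assumed wildcard key: any term of this type in this role
    return keys

def build_index(parsed):
    """Build one inverted index: key -> set of (document, sentence) ids."""
    index = defaultdict(set)
    for sentence_id, triples in parsed.items():
        for key in tagged_terms(triples):
            index[key].add(sentence_id)
    return index

def search(index, *keys):
    """Return the sentences that contain every requested key."""
    postings = [index.get(k, set()) for k in keys]
    return set.intersection(*postings) if postings else set()

index = build_index(PARSED)
```

With this index, `search(index, "paris")` behaves like a plain keyword search and matches both sentences, while `search(index, "subject:alice", "object:paris")` matches only the sentence in which Alice is the subject and Paris the object. A single query such as `search(index, "subject/person:*", "object/location:*")` finds every sentence relating a person subject to a location object, which loosely mirrors the claim's "plurality of relationships ... as a result of a single search specification".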
