
Method for computerized information retrieval using shallow linguistic analysis

  • US 5,696,962 A
  • Filed: 05/08/1996
  • Issued: 12/09/1997
  • Est. Priority Date: 06/24/1993
  • Status: Expired due to Term
First Claim

1. In a system comprising a processor, a memory coupled to the processor, a user interface coupled to the processor, a primary query construction subsystem executed by the processor, a computerized information retrieval (IR) subsystem coupled to a text corpus, and a channel coupling the primary query construction subsystem and the information retrieval subsystem with one another, a method for retrieving documents from the text corpus in response to a user-supplied natural language input string comprising words, the method comprising the steps of:

  • with the user interface, accepting the input string into the primary query construction subsystem;

    with the primary query construction subsystem, analyzing the input string by performing a linguistic analysis of the input string to detect phrases therein, the detected phrases comprising words, each of the detected phrases comprising a grammatical construct identified in the linguistic analysis, at least one of the grammatical constructs identified in the linguistic analysis being a noun phrase comprising a plurality of words including a head word;

    with the primary query construction subsystem, constructing a series of queries based on the detected phrases, the queries of the series being constructed automatically by the primary query construction subsystem through a sequence of operations that comprises successive query broadening and query narrowing operations, each constructed query of the series comprising a collection of component queries, each component query being formed from a single one of the grammatical constructs identified in the linguistic analysis, each constructed query of the series having a first proximity constraint and a second proximity constraint, the first proximity constraint pertaining to a proximity relationship among words within a component query, the second proximity constraint pertaining to a proximity relationship among at least two component queries, at least one of the queries of the series comprising a component query based on all the words of the plurality;

    with the primary query construction subsystem, automatically constructing an additional query based on the head word of the noun phrase without the other words of the plurality;

    with the primary query construction subsystem, the information retrieval subsystem, the text corpus, and the channel, executing the queries of the series and the additional query to retrieve documents from the text corpus, the queries of the series being executed before the additional query; and

    with the primary query construction subsystem, ranking documents retrieved from the text corpus in response to one or more queries thus executed.
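The steps recited in the claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: `detect_phrases` is a toy stopword-based stand-in for the linguistic analysis, the head word is assumed to be the last word of the noun phrase, the `SCHEDULE` of proximity values is hypothetical, and ranking by the number of queries matched is only one possible ranking rule. All function and variable names are invented for this example.

```python
from itertools import product

# Hypothetical stopword list; a real system would use a tagger or parser.
STOPWORDS = {"the", "a", "an", "of", "in", "by", "was", "who", "what", "is", "did"}

# Hypothetical (within, between) proximity schedule: steps 2 and 3 broaden
# the constraints, step 4 narrows the within-phrase constraint again.
SCHEDULE = [(0, 5), (1, 5), (1, 20), (0, 20)]


def detect_phrases(text):
    """Toy stand-in for linguistic analysis: runs of non-stopwords are phrases."""
    phrases, run = [], []
    for tok in text.lower().replace("?", " ").split():
        if tok in STOPWORDS:
            if run:
                phrases.append(tuple(run))
            run = []
        else:
            run.append(tok)
    if run:
        phrases.append(tuple(run))
    return phrases


def component_positions(words, tokens, within):
    """Start indices where `words` occur in order, with at most `within`
    intervening tokens between neighbours (first proximity constraint)."""
    hits = []
    for start, tok in enumerate(tokens):
        if tok != words[0]:
            continue
        pos, ok = start, True
        for w in words[1:]:
            window = tokens[pos + 1 : pos + 2 + within]
            if w not in window:
                ok = False
                break
            pos = pos + 1 + window.index(w)
        if ok:
            hits.append(start)
    return hits


def matches(query, tokens):
    """A query is (components, within, between). Every component must occur,
    and some choice of occurrences must lie within `between` tokens of each
    other (second proximity constraint)."""
    components, within, between = query
    per_component = [component_positions(c, tokens, within) for c in components]
    if not all(per_component):
        return False
    return any(max(c) - min(c) <= between for c in product(*per_component))


def build_series(phrases):
    """One component query per detected phrase, repeated over the schedule."""
    return [(tuple(phrases), within, between) for within, between in SCHEDULE]


def head_query(noun_phrase):
    """Additional query: the head word alone (assumed to be the last word)."""
    return ((noun_phrase[-1:],), 0, 0)


def retrieve(text, corpus):
    phrases = detect_phrases(text)
    noun_phrase = max(phrases, key=len)  # assumption: longest phrase is the noun phrase
    queries = build_series(phrases) + [head_query(noun_phrase)]  # series runs first
    scores = {}
    for query in queries:
        for doc_id, doc in corpus.items():
            if matches(query, doc.lower().split()):
                scores[doc_id] = scores.get(doc_id, 0) + 1
    # Rank by number of queries matched, ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


corpus = {
    "d1": "the prime minister of japan visited",
    "d2": "the prime japanese minister left japan",
    "d3": "a minister resigned",
    "d4": "nothing relevant here",
}
print(retrieve("Who was the prime minister of Japan?", corpus))  # ['d1', 'd2', 'd3']
```

In this sketch, `d1` satisfies every query in the series, `d2` satisfies only the queries whose within-phrase constraint tolerates one intervening word, and `d3` is found only by the head-word query that runs last, which is why it ranks lowest.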
