
Text search method and apparatus for structured documents

  • US 5,745,745 A
  • Filed: 06/27/1995
  • Issued: 04/28/1998
  • Est. Priority Date: 06/29/1994
  • Status: Expired due to Term
First Claim

1. A text search method for searching structured documents, each structured document having a plurality of logical structures, each logical structure being enclosed by predetermined markers and having a unique discriminator for identifying the logical structure, comprising the steps of:

  • preparing each of the structured documents to be searched, each structured document being identified by a document identifier, the preparing step for each structured document including:

    a) storing the structured document in a database;

    b) creating, and storing in the database, condensed texts and the corresponding unique discriminators for each of the respective logical structures, each of the condensed texts being formed by segmenting the structured document according to the predetermined markers and including all words within the respective logical structure without duplication; and

    c) creating, and storing in the database, a character occurrence bitmap including all characters without duplication in the structured document;

  • inputting a search term and a discriminator identifying a logical structure;

  • searching the character occurrence bitmap for each of the structured documents and identifying a plurality of first document identifiers corresponding to the character occurrence bitmaps indicated as including each character contained in the inputted search term; and

  • searching the condensed texts corresponding to the input discriminator and identified by the first document identifiers and identifying a plurality of second document identifiers corresponding to the condensed texts which include each word contained in the inputted search term;

  • wherein if the search term includes a plurality of words and indicates a positional relationship between the words, the method further comprises the step of:

    searching the structured documents identified by the second document identifiers and identifying a plurality of third document identifiers corresponding to the structured documents containing all the words having the positional relationship defined in the inputted search term.
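
Illustrative sketch (not part of the patent text). The claim describes a staged filter: a cheap character bitmap test first, then a word test on the condensed text of one logical structure, then a positional check on the stored document. The Python below is a minimal sketch of that flow under several assumptions that the claim does not specify: markers are written as `<tag>...</tag>`, the tag name acts as the discriminator, structures are not nested, words are whitespace-separated, the only positional relationship supported is "adjacent and in order", and the positional check is limited to the structure named by the discriminator. The names `prepare`, `search`, `char_bitmap`, and `Prepared` are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed marker syntax: <tag>...</tag>, where the tag name is the discriminator.
MARKER = re.compile(r"<(\w+)>(.*?)</\1>", re.S)


@dataclass
class Prepared:
    text: str        # step a) the stored structured document
    condensed: dict  # step b) discriminator -> set of words (no duplicates)
    bitmap: int      # step c) one bit per character code that occurs


def char_bitmap(s):
    """Set bit ord(ch) for every non-whitespace character in s."""
    bits = 0
    for ch in s:
        if not ch.isspace():
            bits |= 1 << ord(ch)
    return bits


def prepare(text):
    """Build the condensed texts and the character occurrence bitmap."""
    condensed = {}
    for tag, body in MARKER.findall(text):
        condensed.setdefault(tag, set()).update(body.split())
    return Prepared(text, condensed, char_bitmap(text))


def search(db, words, discriminator, adjacent=False):
    """Return document identifiers that survive each stage of the filter."""
    words = list(words)
    need = char_bitmap("".join(words))
    # Stage 1: keep documents whose bitmap has every character of the term.
    first = [d for d, p in db.items() if p.bitmap & need == need]
    # Stage 2: keep documents whose condensed text for the chosen
    # structure contains every word of the term.
    second = [d for d in first
              if set(words) <= db[d].condensed.get(discriminator, set())]
    if not (adjacent and len(words) > 1):
        return second
    # Stage 3: check the positional relationship against the stored text
    # (assumption: the words must appear consecutively, in order).
    third = []
    n = len(words)
    for d in second:
        for tag, body in MARKER.findall(db[d].text):
            toks = body.split()
            if tag == discriminator and any(
                toks[i:i + n] == words for i in range(len(toks) - n + 1)
            ):
                third.append(d)
                break
    return third


db = {
    "D1": prepare("<title>text search method</title>"
                  "<body>search structured text quickly</body>"),
    "D2": prepare("<title>method search text</title>"
                  "<body>text search apparatus</body>"),
    "D3": prepare("<title>image coding</title><body>nothing relevant</body>"),
}
print(search(db, ["text", "search"], "title"))                 # ['D1', 'D2']
print(search(db, ["text", "search"], "title", adjacent=True))  # ['D1']
```

In this example D3 is rejected at the bitmap stage because it contains no "x" or "s", so its condensed text is never examined. D2 passes the word test for the title but fails the positional check, since its title holds the words in a different order.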
