Phrase matching in documents having nested-structure arbitrary (document-specific) markup

  • US 7,356,528 B1
  • Filed: 01/27/2004
  • Issued: 04/08/2008
  • Est. Priority Date: 05/15/2003
  • Status: Expired due to Fees
First Claim

1. A method of searching a document having nested-structure document-specific markup, the method comprising:

    receiving a query that designates at least (A) a phrase to be matched in a phrase matching process, and (B) a selective designation of at least a tag or annotation that is to be ignored during the phrase matching process;

    deriving query-specific indices based on query-independent indices that were created specific to each document, wherein the step of deriving the query-specific indices includes forming at least one of a group including:

    an index of each word in the phrase to be matched by the phrase matching process;

    an index of context tags that may be found in the document; and

    an index of at least a tag or annotation to be ignored during the phrase matching process,

    wherein the query-independent indices were created by a method including:

    a) labeling elements in the document with intervals, wherein:

    a1) for markup tags, the intervals are defined in terms of a starting index number associated with an opening markup tag and an ending index number associated with a closing markup tag that corresponds to the opening markup tag, and

    a2) for single words, the intervals are defined in terms of a single index number associated with the word; and

    b) forming the query-independent indices so that they are configured to be used in the searching method by first receiving, for a word or tag in the document, a position in the document, and by then indicating that the word or tag is present or not present at that position,

    wherein the step of deriving the query-specific indices involves deriving the query-specific indices from the query-independent indices without rebuilding any of the query-independent indices; and

    carrying out the phrase matching process using the query-specific indices on the document having the nested-structure document-specific markup.
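
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`build_indices`, `derive_query_indices`, `match_phrase`), the regex tokenizer, and the split between `skip_tags` (only the tag markers are stepped over) and `skip_annotations` (the markers and everything they enclose are stepped over) are assumptions made for illustration. The sketch labels every token with an index number, builds per-document indices once, derives per-query views from them without rebuilding, and then matches the phrase.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: <tag>, </tag>, or a bare word (no attributes handled).
TOKEN = re.compile(r"<(/?)(\w+)>|(\w+)")

def build_indices(text):
    """Query-independent indices: word -> {position}, tag -> [(start, end)]."""
    word_index = defaultdict(set)
    tag_index = defaultdict(list)
    open_stack = []  # (tag name, index number of its opening marker)
    for pos, m in enumerate(TOKEN.finditer(text)):
        closing, tag, word = m.groups()
        if word:
            word_index[word.lower()].add(pos)      # single index number
        elif not closing:
            open_stack.append((tag, pos))
        else:
            if not open_stack or open_stack[-1][0] != tag:
                raise ValueError(f"unbalanced </{tag}> at position {pos}")
            _, start = open_stack.pop()
            tag_index[tag].append((start, pos))    # interval (start, end)
    return word_index, tag_index

def derive_query_indices(word_index, tag_index, phrase,
                         skip_tags=(), skip_annotations=()):
    """Query-specific views; the per-document indices are only read."""
    words = phrase.lower().split()
    word_pos = {w: word_index.get(w, set()) for w in words}
    skipped = set()
    for name in skip_tags:                         # ignore the markers only
        for start, end in tag_index.get(name, []):
            skipped.update((start, end))
    for name in skip_annotations:                  # ignore markers and content
        for start, end in tag_index.get(name, []):
            skipped.update(range(start, end + 1))
    return words, word_pos, skipped

def match_phrase(words, word_pos, skipped):
    """Return the position of the first word of every match."""
    hits = []
    for start in sorted(word_pos[words[0]]):
        if start in skipped:
            continue
        pos, matched = start, True
        for w in words[1:]:
            pos += 1
            while pos in skipped:                  # step over ignored positions
                pos += 1
            if pos not in word_pos[w]:             # present / not present test
                matched = False
                break
        if matched:
            hits.append(start)
    return hits

doc = "<p>The <b>quick</b> <note>see editor</note> brown fox</p>"
word_index, tag_index = build_indices(doc)
query = derive_query_indices(word_index, tag_index, "quick brown fox",
                             skip_tags=("b",), skip_annotations=("note",))
print(match_phrase(*query))  # [3]
```

In this sketch the same `word_index` and `tag_index` serve any number of queries. Running the same phrase with nothing ignored finds no match, because the closing `</b>` marker and the `<note>` annotation sit between "quick" and "brown".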
