Method and system for rapid retrieval in a full text indexing system
Abstract
A method and system for generating and searching a full text index. The full text index uses word numbers and a minimum delta, which minimizes the need to access document level information during the application of search operators. Word registers having coordinated document level and word level information, as well as relevance information, are used in search operations. Word numbers are clustered together during sub-operations in preparation for the next operation in a search query. The full text index according to the present invention is extremely efficient and greatly reduces table accesses and/or disk I/Os.
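One way to read the role of the minimum delta: if the gap between the last word number of one document and the first word number of the next is larger than the widest proximity distance a query can ask for, a proximity operator can work on word numbers alone, because two word numbers that are close together must belong to the same document. The Python sketch below illustrates that reading only. The function name `near`, the sample word-number lists, and the assumption that `max_distance` is smaller than the minimum delta are hypothetical and not taken from the patent.

```python
from bisect import bisect_right

def near(first, second, max_distance):
    """Return (a, b) pairs where b follows a by at most max_distance.

    `first` and `second` are ascending lists of word numbers for two words.
    Assumption: max_distance is smaller than the index's minimum delta, so
    a hit can never span two documents and no document lookup is needed.
    """
    hits = []
    for a in first:
        i = bisect_right(second, a)  # first entry of `second` greater than a
        while i < len(second) and second[i] - a <= max_distance:
            hits.append((a, second[i]))
            i += 1
    return hits

# Hypothetical word-number lists for the words "full" and "text".
full_numbers = [0, 20]
text_numbers = [1, 5, 21]
print(near(full_numbers, text_numbers, 1))  # [(0, 1), (20, 21)]
```

The loop never consults a document table; the document owning each hit would only need to be resolved once, when results are reported.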
19 Claims
1. A method for generating a full text index of a plurality of documents comprising:
associating at least one group of word numbers with the words of an indexed document, the indexed document having an initial word number and a final word number associated therewith, the group having a predetermined group size greater than one, the initial word number associated with any indexed document being at least a group size distance from an initial word number associated with another indexed document;
ensuring that a minimum delta value exists between a final word number associated with one indexed document and an initial word number associated with another indexed document;
generating a word list having a plurality of word entries, each word entry comprising a word and a list of word numbers associated with the word; and
generating a cross-reference table containing a cross-reference entry for each group, each cross reference entry containing a reference to an indexed document with which the group has been associated.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
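The steps recited in claim 1 can be illustrated with a small Python sketch. It is a minimal sketch under stated assumptions, not the patented implementation: the names `build_index`, `group_size`, and `min_delta`, the whitespace tokenizer, and the choice to start every document on a group boundary are all hypothetical. Here the minimum delta is taken to mean that the next document's initial word number minus the previous document's final word number is at least `min_delta`.

```python
from collections import defaultdict

def build_index(docs, group_size=8, min_delta=3):
    """Build (word_list, xref) from a dict mapping document id -> text.

    word_list maps each word to an ascending list of word numbers.
    xref[g] is the id of the document that owns group g of word numbers.
    """
    word_list = defaultdict(list)
    xref = []
    next_start = 0  # always a multiple of group_size
    for doc_id, text in docs.items():
        words = text.lower().split()  # assumed tokenizer
        if not words:
            continue
        for offset, word in enumerate(words):
            word_list[word].append(next_start + offset)
        # Reserve whole groups covering the words plus padding, so that
        # (next initial number) - (this final number) >= min_delta.
        span = len(words) + min_delta - 1
        groups = -(-span // group_size)  # ceiling division
        xref.extend([doc_id] * groups)
        next_start += groups * group_size
    return dict(word_list), xref

docs = {"a": "the quick brown fox", "b": "the lazy dog"}
word_list, xref = build_index(docs, group_size=4, min_delta=2)
print(word_list["the"])  # [0, 8]
print(xref)              # ['a', 'a', 'b']
```

In this example document "a" ends at word number 3 and document "b" starts at 8, so both the group-size spacing and the minimum delta are satisfied.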
14. A method for generating a full text index of a plurality of documents, comprising:
associating a word number with each word in a plurality of documents, each document having associated therewith an initial word number and a final word number, the initial word number associated with each document being at least a predetermined increment from an initial word number associated with another document, the predetermined increment being greater than one;
providing a minimum inter-document word number delta between a word number in one document and a word number in another document;
generating a word list containing words and the word numbers associated with the respective word; and
generating a document cross-reference table associating word numbers with a respective document.
Dependent claims: 15, 16
17. A full text index, comprising:
a word table being operative to contain a plurality of lists of word numbers, the word numbers being associated with words in a plurality of indexed documents, each indexed document having at least one group of word numbers associated therewith and having a first word number and a last word number, each group having a uniform size greater than one, and the first word number of any document being at least the uniform size away from the first word number of any other document, wherein a minimum delta exists between a final word number associated with an indexed document and an initial word number associated with any other indexed document; and
a cross-reference table comprising a reference entry corresponding to each group of word numbers and containing a reference to an indexed document to which the group of word numbers is associated.
Dependent claims: 18, 19
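Because the groups in claim 17 have a uniform size, one plausible way to resolve a word number to its document is to divide by the group size and use the result as a position in the cross-reference table. The Python sketch below shows that lookup. The integer-division scheme, the function names `document_for` and `documents_for_word`, and the sample tables are assumptions made for illustration and are not stated in the claim.

```python
def document_for(word_number, xref, group_size):
    """Map a word number to a document via its group's cross-reference entry."""
    return xref[word_number // group_size]

def documents_for_word(word, word_table, xref, group_size):
    """List the documents containing `word`, in word-number order."""
    found = []
    for number in word_table.get(word, []):
        doc = document_for(number, xref, group_size)
        if not found or found[-1] != doc:  # skip repeats within one document
            found.append(doc)
    return found

# Hypothetical index: doc-A owns groups 0 and 1, doc-B owns group 2.
word_table = {"index": [1, 9, 17], "text": [0, 16]}
xref = ["doc-A", "doc-A", "doc-B"]
print(document_for(17, xref, 8))                       # doc-B
print(documents_for_word("index", word_table, xref, 8))  # ['doc-A', 'doc-B']
```

With one entry per group rather than per word number, the table stays small relative to the number of indexed words.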
Specification