Methods and systems for indexing references to documents of a database and for locating documents in the database
First Claim
1. A method for indexing references to documents of a database, the method comprising:
receiving a document, at the database, from a server;
storing the document in the database;
extracting a searchable term from the document, the searchable term being associated with a posting list;
dividing the posting list into blocks, each block comprising M database references;
for each block:
determining an encoding pattern based on values of the M database references, the determining the encoding pattern comprises:
determining a number n of patches according to a number of references, among the M database references, that are greater than or equal to 2^b; and
if n > 0:
calculating, for each of n patches, a patch value v_k by deleting b least significant bits from a corresponding one of the M database references that are greater than or equal to 2^b, wherein k is in a range from 1 to n, and
determining, for each of the n patches, a patch position p_k corresponding to a position, in a range of 0 to M−1, of the corresponding one of the M database references that are greater than or equal to 2^b;
wherein the encoding pattern comprises b, n, p_1 . . . p_n, v_1 . . . v_n;
locating an encoding pattern table entry corresponding to the encoding pattern;
inserting a pointer corresponding to the located encoding pattern table entry in a header for the block; and
inserting in the block a sequence of M truncated references, each truncated reference comprising b least significant bits of a corresponding one of the M database references.
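The per-block steps of this claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `EncodingPattern` and `encode_block` are hypothetical, the base length b is taken as an input because the claim does not say how it is chosen, and the pattern table is modelled as a dictionary that gains a new entry when no matching pattern is found (the claim only recites locating an entry).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingPattern:
    """Hypothetical container for (b, n, p_1..p_n, v_1..v_n)."""
    b: int             # base length in bits
    positions: tuple   # patch positions p_1 .. p_n, each in 0 .. M-1
    values: tuple      # patch values v_1 .. v_n

    @property
    def n(self) -> int:
        return len(self.positions)


def encode_block(refs, b, pattern_table):
    """Encode one block of M database references.

    Returns (pointer, truncated_refs), where pointer identifies the
    pattern table entry that would be stored in the block header.
    """
    threshold = 1 << b   # 2^b
    mask = threshold - 1

    # A patch is needed for every reference >= 2^b.
    positions = tuple(i for i, r in enumerate(refs) if r >= threshold)
    # Patch value: the reference with its b least significant bits deleted.
    values = tuple(refs[i] >> b for i in positions)
    pattern = EncodingPattern(b, positions, values)

    # Assumption: unseen patterns are appended to the table.
    pointer = pattern_table.setdefault(pattern, len(pattern_table))

    # Every reference keeps only its b least significant bits.
    truncated = [r & mask for r in refs]
    return pointer, truncated
```

With b = 3, the block [3, 5, 18, 7] has one reference at or above 2^3 = 8, so n = 1, p_1 = 2, v_1 = 18 >> 3 = 2, and the stored truncated references are [3, 5, 2, 7]. Blocks that produce the same pattern share the same pointer.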
Abstract
Methods and systems allow indexing references to documents of a database according to database reference profiles. Documents may then be located in the database using decoding protocols based on the database reference profiles. To this end, the documents are stored in the database and searchable terms extracted therefrom are associated with posting lists. Each posting list is divided into blocks of M database references. The blocks are encoded according to a pattern that depends on the M database references. A corresponding pointer to a table of encoding patterns is appended to each block. When a query is received for a searchable term, blocks are extracted from a posting list corresponding to the searchable term and a pointer for each block is used to extract a decoding protocol related to an encoding pattern for the block.
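As a small illustration of the first step described in the abstract, the following Python sketch divides a posting list into blocks of M database references. The function name `split_into_blocks` is hypothetical, and letting the final block be shorter than M is an assumption; the abstract does not say how a trailing partial block is handled.

```python
def split_into_blocks(posting_list, m):
    """Divide a posting list into consecutive blocks of m references.

    Assumption: the last block may hold fewer than m references.
    """
    if m <= 0:
        raise ValueError("block size m must be positive")
    return [posting_list[i:i + m] for i in range(0, len(posting_list), m)]
```

Each resulting block would then be encoded independently, which is what allows a different encoding pattern, and therefore a different pointer, per block.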
28 Claims
1. A method for indexing references to documents of a database (independent claim, recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A method for locating documents of a database that contain search terms, the method comprising:
receiving a search term, at the database, from a client, the search term being associated with a posting list, the posting list being arranged in blocks, each block comprising a header and M truncated references;
reading a pointer from a header of a current block of the posting list;
using the pointer to extract a decoding protocol from a decoding protocol table, wherein the decoding protocol defines an encoding pattern for the current block, the encoding pattern for the current block comprises:
a base length b of M truncated references in the current block;
a number n of patches in the current block;
if n > 0, one or more patch values v_k of the current block, wherein k is in a range from 1 to n;
if n > 0, one or more patch positions p_k in the current block, wherein p_k is in a range of 0 to M−1.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
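The lookup recited in claim 17 can be sketched in Python as follows. The claim text shown here stops at the contents of the encoding pattern, so the reconstruction step below (shifting each patch value v_k left by b bits and combining it with the truncated reference at position p_k) is an assumption: it is simply the inverse of the truncation recited in claim 1. The name `decode_block` and the tuple layout of a table entry are hypothetical.

```python
def decode_block(pointer, truncated_refs, protocol_table):
    """Rebuild the M database references of one block.

    protocol_table[pointer] is assumed to hold (b, positions, values),
    i.e. the base length, patch positions p_k and patch values v_k.
    """
    b, positions, values = protocol_table[pointer]
    refs = list(truncated_refs)
    for p, v in zip(positions, values):
        # Restore the high-order bits that truncation removed.
        refs[p] |= v << b
    return refs
```

For a table entry (3, (2,), (2,)) and truncated references [3, 5, 2, 7], the reference at position 2 becomes (2 << 3) | 2 = 18, giving [3, 5, 18, 7]. When n = 0 the truncated references are returned unchanged.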
Specification