Method for generating a compressed index of information of records of a database
First Claim
1. A computer implemented method for generating a compressed index of information, the information represented by a plurality of information portions, the method comprising the steps of:
encoding a representation of a first of the plurality of information portions; and
storing the encoded representation of the first of the plurality of information portions in the index of information;
wherein the encoded representation indicates a number of bytes that the first of the plurality of information portions has in common with a representation of a second of the plurality of information portions and bytes that differ between the first of the plurality of information portions and a representation of the second of the plurality of information portions.
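One way to read the claim is as prefix (front) coding, which is how the abstract describes the word entries: each entry records how many leading bytes it shares with a neighbouring entry, followed by the bytes that differ. The following is a minimal sketch under that reading, not the patented implementation. The function names and the on-disk layout (one count byte, one suffix-length byte, then the suffix) are assumptions made for illustration only.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Count the leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_entry(prev: bytes, cur: bytes) -> bytes:
    """Encode cur relative to prev.

    Hypothetical layout: [shared count][suffix length][differing bytes].
    Both counts are limited to one byte in this sketch.
    """
    shared = min(common_prefix_len(prev, cur), 255)
    suffix = cur[shared:]
    if len(suffix) > 255:
        raise ValueError("suffix too long for the one-byte length field")
    return bytes([shared, len(suffix)]) + suffix


def decode_entry(prev: bytes, encoded: bytes) -> bytes:
    """Rebuild the original bytes from prev and the encoded entry."""
    shared, suffix_len = encoded[0], encoded[1]
    return prev[:shared] + encoded[2:2 + suffix_len]


encoded = encode_entry(b"index", b"indexing")
print(encoded)                          # b'\x05\x03ing'
print(decode_entry(b"index", encoded))  # b'indexing'
```

Because decoding needs the neighbouring entry, entries encoded this way have to be read in sequence, which is the usual trade-off of front coding.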
Abstract
A computer implemented method for generating a compressed index of information. The information is stored as a plurality of records in a database. Indexable portions of information are sequentially parsed to generate words and metawords. The words represent the portions, and the metawords represent attributes of the portions. A location is sequentially assigned to each word and metaword in the order that the portions are parsed to form pairs. The pairs are sorted first according to the words and metawords, and second according to the locations. Index entries are written to a memory for each unique word and metaword. Each index entry includes a word entry or a metaword entry, and one or more location entries. The word and metaword entries use a prefix encoding which indicates the number of bytes that the unique word or metaword of a next index entry has in common with the unique word or metaword of a previous index entry. The location entries use a delta value encoding.
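The steps in the abstract (parse, assign sequential locations, sort the pairs, then write one entry per unique term) can be sketched as follows. This is an illustrative simplification, not the patented implementation: the `#record` metaword, the tokenizer, the tuple layout of an entry, and the choice to store each entry's first location in full are all assumptions. Terms are kept to ASCII so that character counts equal byte counts.

```python
import re
from itertools import groupby


def parse(records):
    """Return (term, location) pairs in the order the records are parsed."""
    pairs, location = [], 0
    for record in records:
        # "#record" is a hypothetical metaword marking the start of a record.
        terms = ["#record"] + re.findall(r"[a-z0-9]+", record.lower())
        for term in terms:
            pairs.append((term, location))
            location += 1
    return pairs


def shared_prefix(a, b):
    """Count the leading characters that a and b have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def build_entries(records):
    """Return one (shared count, differing suffix, location deltas) per term."""
    pairs = sorted(parse(records))  # by term first, then by location
    entries, prev_term = [], ""
    for term, group in groupby(pairs, key=lambda p: p[0]):
        locations = [loc for _, loc in group]
        # Delta encoding: first location in full, then gaps between neighbours.
        deltas = [locations[0]] + [b - a for a, b in zip(locations, locations[1:])]
        shared = shared_prefix(prev_term, term)
        entries.append((shared, term[shared:], deltas))
        prev_term = term
    return entries


for entry in build_entries(["the cat", "the cab"]):
    print(entry)
# (0, '#record', [0, 3])
# (0, 'cab', [5])
# (2, 't', [2])
# (0, 'the', [1, 3])
```

Sorting by term places similar words next to each other, so the shared counts tend to be large, and sorting by location within a term keeps the deltas small and non-negative.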
13 Claims
1. A computer implemented method for generating a compressed index of information, the information represented by a plurality of information portions, the method comprising the steps of:
encoding a representation of a first of the plurality of information portions; and
storing the encoded representation of the first of the plurality of information portions in the index of information;
wherein the encoded representation indicates a number of bytes that the first of the plurality of information portions has in common with a representation of a second of the plurality of information portions and bytes that differ between the first of the plurality of information portions and a representation of the second of the plurality of information portions.
Dependent claims: 2, 3, 4
5. A system for generating a compressed index of information, the information represented by a plurality of information portions, comprising:
a memory for storing the index; and
a processor configured to:
encode a representation of a first of the plurality of information portions; and
store the encoded representation of the first of the plurality of information portions in the index of information;
wherein the encoded representation indicates a number of bytes that the representation of the first of the plurality of information portions has in common with a representation of a second of the plurality of information portions and bytes that differ between the representation of the first of the plurality of information portions and a representation of the second of the plurality of information portions.
Dependent claims: 6, 7, 8, 9
10. An article of manufacture for generating a compressed index of information, the information represented by a plurality of information portions, the article of manufacture comprising:
at least one processor readable carrier; and
instructions contained on the carrier;
wherein the instructions are configured to be readable from the at least one carrier by one or more processors and thereby cause the one or more processors to operate so as to:
encode a representation of a first of the plurality of information portions; and
store the representation of the first of the plurality of information portions in the index of information;
wherein the encoded representation indicates a number of bytes that the representation of the first of the plurality of information portions has in common with a representation of a second of the plurality of information portions and the bytes that differ between the representation of the first of the plurality of information portions and a representation of the second of the plurality of information portions.
Dependent claims: 11, 12, 13
Specification