Methods and Systems for Compressing Indices
Abstract
Systems and methods for compressing indices are described. In one aspect, a plurality of items are selected where each item has an entry in an inverted index and each item entry comprises a listing of articles that the item appears in. At least a first item entry and a second item entry are determined for compression and the second item entry is compressed into the first item entry resulting in a compressed first item entry.
50 Claims
1-29. (canceled)
30. A method implemented by a data processing system of a single computer or a network of computer processors, the method comprising:
selecting from an inverted index first and second entries, each of which includes an index identifying a concept, a plurality of document identifiers each identifying a document in which the concept identified by the index is expressed, and a plurality of concept values that each represents a strength of the expression of the concept identified by the index in a respective of the identified documents;
determining, by the data processing system, a plurality of new concept values from the plurality of concept values in the first and second entries; and
combining, by the data processing system, the first and second entries into a combined entry, the combined entry including a plurality of document identifiers each identifying a document in which at least one of the concepts identified by the indices of the first and second entries are expressed, and the plurality of new concept values determined from the plurality of concept values in the first and second entries.
Dependent claims: 31, 32, 33, 34, 35, 36, 37
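The selecting, determining, and combining steps of claim 30 can be illustrated with a minimal sketch. The `Entry` class, the `combine_entries` function, and the sample data are hypothetical names introduced here for illustration. The claim does not say how the new concept values are computed; taking the maximum of the available values is purely an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entry:
    """One inverted-index entry: a concept index plus per-document concept values."""
    index: str
    # Maps document identifier -> concept value (strength of expression).
    postings: Dict[str, float] = field(default_factory=dict)


def combine_entries(first: Entry, second: Entry, merge=max) -> Entry:
    """Combine two entries into one entry covering the union of their documents.

    `merge` decides the new concept value for each document; `max` is an
    assumption made for this sketch, not something the claim specifies.
    """
    combined = Entry(index=f"{first.index}+{second.index}")
    for doc_id in first.postings.keys() | second.postings.keys():
        values = [e.postings[doc_id] for e in (first, second) if doc_id in e.postings]
        combined.postings[doc_id] = merge(values)
    return combined


# Hypothetical sample entries for two related concepts.
car = Entry("car", {"d1": 0.9, "d2": 0.4})
auto = Entry("automobile", {"d2": 0.7, "d3": 0.5})
combined = combine_entries(car, auto)
```

Passing a different `merge` callable (for example, one that averages the values) changes how the new concept values are determined without changing the combining step.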
38. A system comprising:
a data processing system formed of a single computer or a network of computer processors; and
an inverted index database stored on one or more data storage devices, the inverted index database comprising a first entry and a combined entry, wherein:
the first entry comprises an entry index identifying a concept and a pointer to the combined entry; and
the combined entry comprises a plurality of document identifiers each identifying a document and a plurality of concept values each associated with a corresponding of the document identifiers, wherein a strength at which the concept identified by the index of the first entry is expressed in a first of the documents differs from the concept value associated with the document identifier that identifies the first document.
Dependent claims: 39, 40, 41, 42
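The database layout in claim 38, where a first entry holds only a pointer to a combined entry, might look like the following sketch. The dictionary layout, the key names, the `lookup` function, and all values are assumptions made for illustration; the claim does not prescribe a storage format.

```python
# Hypothetical index: two concept entries point at one shared combined entry.
index = {
    "combined:car+automobile": {"postings": {"d1": 0.9, "d2": 0.7, "d3": 0.5}},
    "car": {"pointer": "combined:car+automobile"},
    "automobile": {"pointer": "combined:car+automobile"},
}


def lookup(index, concept):
    """Follow pointers from a concept entry until an entry with postings is reached."""
    entry = index[concept]
    while "pointer" in entry:
        entry = index[entry["pointer"]]
    return entry["postings"]


# Assumed original strengths for "car" before the entries were combined.
true_strength_car = {"d1": 0.9, "d2": 0.4}

postings = lookup(index, "car")
# Documents whose stored concept value no longer equals the original strength.
differs = {doc for doc, value in true_strength_car.items() if postings[doc] != value}
```

Here the stored value for `d2` (0.7) differs from the assumed original strength of "car" in that document (0.4), which is the kind of mismatch the final wherein clause describes.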
44. A system comprising:
a data processing system formed of a single computer or a network of computer processors; and
an inverted index database stored on one or more data storage devices, the inverted index database comprising a first entry and a combined entry, wherein:
the first entry comprises an index identifying a concept and a pointer to the combined entry; and
the combined entry comprises a plurality of document identifiers each identifying a document, wherein the concept identified by the index of the first entry does not appear in at least one of the documents identified by the document identifiers included in the combined entry.
Dependent claims: 43, 45, 46, 47, 48, 49, 50
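Claim 44 implies that a combined entry can list documents in which the first entry's concept does not appear. One way a consumer of such an index might cope is to verify candidates against the document text, as in the sketch below. This verification step is not stated in the claims; it, the `verify_candidates` function, and the sample documents are assumptions introduced here.

```python
def verify_candidates(candidate_ids, documents, concept):
    """Keep only candidates whose text contains the concept as a whole word."""
    return [d for d in candidate_ids if concept in documents[d].lower().split()]


# Hypothetical document store and the document identifiers of a combined entry.
documents = {
    "d1": "red car for sale",
    "d2": "car and automobile show",
    "d3": "vintage automobile parts",
}
combined_doc_ids = ["d1", "d2", "d3"]

hits = verify_candidates(combined_doc_ids, documents, "car")
# Documents listed in the combined entry in which "car" does not appear.
over_included = [d for d in combined_doc_ids if d not in hits]
```

The trade-off sketched here is a smaller index in exchange for extra work at query time: the combined entry is shorter than two separate entries, but some of its documents have to be filtered out afterwards.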
Specification