Searching documents for ranges of numeric values
Abstract
Provided are a method, system, and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within that range. Each document identifier is associated with one value in the set of values included in the document that it identifies. The generated posting lists are stored and used to process a query on a range of values within the set of values.
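As an illustration only, the scheme summarized in the abstract might be sketched as follows. The sketch assumes integer values in a known interval `[lo, hi]` and, following the claims, sets the width of each range by dividing the number of values in the set by the number of posting lists (rounded up here, which is an assumption). The names `build_posting_lists` and `range_query` and the in-memory dictionary layout are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def build_posting_lists(doc_values, lo, hi, num_lists):
    """Group (doc_id, value) pairs into posting lists keyed by range index.

    doc_values maps a document identifier to the integer values found in
    that document. Each posting list covers `width` consecutive values.
    """
    total = hi - lo + 1
    width = -(-total // num_lists)  # ceiling division (assumed rounding)
    postings = defaultdict(list)
    for doc_id in sorted(doc_values):
        for value in doc_values[doc_id]:
            if lo <= value <= hi:
                # Each entry pairs a document identifier with one value.
                postings[(value - lo) // width].append((doc_id, value))
    return width, dict(postings)


def range_query(postings, width, lo, hi, q_lo, q_hi):
    """Return sorted identifiers of documents holding a value in [q_lo, q_hi]."""
    q_lo, q_hi = max(q_lo, lo), min(q_hi, hi)
    if q_lo > q_hi:
        return []
    first = (q_lo - lo) // width
    last = (q_hi - lo) // width
    hits = set()
    for index in range(first, last + 1):
        for doc_id, value in postings.get(index, []):
            # Lists at the edges of the query may hold out-of-range values.
            if q_lo <= value <= q_hi:
                hits.add(doc_id)
    return sorted(hits)


docs = {"d1": [5, 42], "d2": [17], "d3": [99], "d4": [40]}
width, postings = build_posting_lists(docs, lo=0, hi=99, num_lists=10)
result = range_query(postings, width, 0, 99, 15, 41)
```

With 100 possible values and 10 posting lists, each list covers 10 consecutive values, so a query on 15 to 41 only reads the lists for 10-19 through 40-49 instead of scanning one list per distinct value.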
13 Claims
1. A computer implemented method, comprising:
accessing document identifiers for documents;
scanning the documents to determine values comprising searchable terms in the documents that are members of a set of values, wherein the set of values comprises at least one of an integer and a real number;
generating a number of posting lists, wherein each posting list is associated with a range of a number of consecutive values within the set of values, wherein the number of consecutive values is determined by dividing the number of values in the set of values by the number of posting lists, and wherein each posting list includes at least one of the document identifiers for at least one of the documents including at least one of the values within the range of consecutive values associated with the posting list; and
storing the generated posting lists, wherein the posting lists are used to process a query on a range of values within the set of values.
Dependent claim: 2
3. A computer implemented method, comprising:
accessing document identifiers for documents;
scanning the documents to determine values comprising searchable terms in the documents that are members of a set of values, wherein the set of values comprises at least one of an integer and a real number;
generating a number of posting lists associated with a first level, wherein each posting list is associated with a range of a number of consecutive values within the set of values, wherein the number of consecutive values is determined by dividing the number of values in the set of values by the number of posting lists, and wherein each posting list includes at least one of the document identifiers for at least one of the documents including at least one of the values within the range of consecutive values associated with the posting list; and
performing at least one iteration of generating posting lists for an additional level, wherein each posting list generated for the additional level is formed by merging at least two posting lists associated with a previous level, wherein each generated posting list at one additional level is associated with consecutive values in the set of values, wherein each document in the generated posting list at the additional level includes one value in the consecutive values associated with the posting list at the additional level, and wherein a new additional level and posting lists associated therewith are generated with each iteration.
Dependent claims: 4, 5, 6, 7, 8, 9, 10
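The level-by-level merging recited in claim 3 can be pictured with a minimal sketch. It assumes the first-level posting lists are held in a Python list ordered by value range, with each entry a `(doc_id, value)` pair. The function name `build_levels` and the fixed `fanout` parameter are hypothetical. The claim only requires that at least two lists from the previous level be merged, and it does not say when iteration stops, so running until one list remains is also an assumption.

```python
def build_levels(level0, fanout=2):
    """Build coarser levels by merging adjacent posting lists.

    level0 is a list of posting lists ordered by value range. Each new
    level merges `fanout` adjacent lists of the previous level, so every
    merged list still covers a run of consecutive values.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = [level0]
    while len(levels[-1]) > 1:
        previous = levels[-1]
        merged_level = []
        for start in range(0, len(previous), fanout):
            combined = []
            # When the list count is not a multiple of fanout, the last
            # group is smaller and may carry over a single list unchanged.
            for posting_list in previous[start:start + fanout]:
                combined.extend(posting_list)
            merged_level.append(sorted(combined))
        levels.append(merged_level)
    return levels


# Four first-level lists covering, for example, 0-9, 10-19, 20-29, 30-49.
first_level = [
    [("d1", 5)],
    [("d2", 17)],
    [],
    [("d4", 40), ("d1", 42)],
]
levels = build_levels(first_level)
```

Here four first-level lists yield a second level of two lists and a third level of one. A wide query could then, under this sketch, read one coarse list rather than many fine ones.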
11. A system, comprising:
a processor; and
a computer readable medium including code executed by the processor to perform operations, the operations comprising:
accessing document identifiers for documents;
scanning the documents to determine values comprising searchable terms in the documents that are members of a set of values, wherein the set of values comprises at least one of an integer and a real number;
generating a number of posting lists, wherein each posting list is associated with a range of a number of consecutive values within the set of values, wherein the number of consecutive values is determined by dividing the number of values in the set of values by the number of posting lists, and wherein each posting list includes at least one of the document identifiers for at least one of the documents including at least one of the values within the range of consecutive values associated with the posting list; and
storing the generated posting lists, wherein the posting lists are used to process a query on a range of values within the set of values.
12. An article of manufacture comprising at least one of a hardware device implementing logic and a computer storage media having computer executable code to cause operations to be performed, the operations comprising:
accessing document identifiers for documents;
scanning the documents to determine values comprising searchable terms in the documents that are members of a set of values, wherein the set of values comprises at least one of an integer and a real number;
generating a number of posting lists, wherein each posting list is associated with a range of a number of consecutive values within the set of values, wherein the number of consecutive values is determined by dividing the number of values in the set of values by the number of posting lists, and wherein each posting list includes at least one of the document identifiers for at least one of the documents including at least one of the values within the range of consecutive values associated with the posting list; and
storing the generated posting lists, wherein the posting lists are used to process a query on a range of values within the set of values.
Dependent claim: 13
Specification