Database controller, method, and program for handling range queries
First Claim
1. A database controller for a database of information encoded as a set of data items distributed among a plurality of storage units distinguishable from one another by respective identifying information and each data item comprising a value of each of two or more data elements, the set of data items having a prescribed order of the two or more data elements of which the first one or two data elements are assigned as a prefix portion, the database controller comprising computing hardware including a processor and a memory coupled to the processor and storing processing instructions which when executed by the processor cause the processor to execute a process, the process comprising:
- receiving a range query specifying a value of the or each of one or more data elements as a search prefix portion which each data item returned as a range query result must include in its prefix portion;
storing an ordered list comprising, for each of the data items from the set of data items individually, the identifying information of a storage unit upon which the data item is stored, the order of the identifying information being determined by an ordering metric applied to prefix portions of the respective data items;
maintaining a frequency record, recording values of the data elements assigned as prefix portions of the set of data items and the number of data items having each prefix portion, the frequency record being maintained in an order determined by applying the ordering metric to the prefix portion of the data items;
calculating, based on the frequency record, one or more list positions in the ordered list at which the identifying information of the plurality of storage units upon which data items including the search prefix portion in their prefix portion are stored, comprising calculating a start point of the one or more list positions by summing the numbers of data items having prefix portions preceding the search prefix portion in the order determined by the ordering metric, and calculating the end point of the one or more list positions by adding the number of data items having prefix portions including the search prefix portion to the start point; and
requesting the data items including the search prefix portion in their prefix portion from the plurality of storage units identified in the calculated one or more list positions, wherein the ordered list is periodically updated based on information received from the plurality of storage units, and, after each update, each occupied list position in the ordered list is succeeded by a predetermined number of vacant list positions; and
the process further comprises, when a new data item is added to the set of data items or a duplicate set, adding the identifying information of a storage unit upon which the new data item is stored to a vacant list position in the respective ordered list, the vacant list position being determined by applying the ordering metric to the prefix portion of the new data item in relation to the prefix portions of the respective occupied list positions by referring to the frequency record, such that the immediately preceding occupied list position stores the identifying information of a data item having a prefix preceding or equal to the prefix of the new data item according to the ordering metric, and the immediately following occupied list position stores the identifying information of a data item having a prefix following or equal to the prefix of the new data item according to the ordering metric.
1 Assignment
0 Petitions
Accused Products
Abstract
A database controller for a database of data items distributed across a plurality of storage units. The database controller configured to store an ordered list comprising, for each of the data items individually, the identifying information of the storage unit upon which the data item is stored. The database controller configured to maintain a frequency record, recording the values of prefix portions of the data items and the number of data items having each prefix portion. Wherein, the ordered list and the frequency record are ordered by applying the same ordering metric to the prefix portions. The database controller being configured to receive a range query specifying a search prefix portion, to use the frequency record to calculate where to find in the ordered list the identifying information of storage units storing the data items having the search prefix, and using that identifying information to retrieve query results.
9 Citations
12 Claims
-
1. A database controller for a database of information encoded as a set of data items distributed among a plurality of storage units distinguishable from one another by respective identifying information and each data item comprising a value of each of two or more data elements, the set of data items having a prescribed order of the two or more data elements of which the first one or two data elements are assigned as a prefix portion, the database controller comprising computing hardware including a processor and a memory coupled to the processor and storing processing instructions which when executed by the processor cause the processor to execute a process, the process comprising:
-
receiving a range query specifying a value of the or each of one or more data elements as a search prefix portion which each data item returned as a range query result must include in its prefix portion; storing an ordered list comprising, for each of the data items from the set of data items individually, the identifying information of a storage unit upon which the data item is stored, the order of the identifying information being determined by an ordering metric applied to prefix portions of the respective data items; maintaining a frequency record, recording values of the data elements assigned as prefix portions of the set of data items and the number of data items having each prefix portion, the frequency record being maintained in an order determined by applying the ordering metric to the prefix portion of the data items; calculating, based on the frequency record, one or more list positions in the ordered list at which the identifying information of the plurality of storage units upon which data items including the search prefix portion in their prefix portion are stored, comprising calculating a start point of the one or more list positions by summing the numbers of data items having prefix portions preceding the search prefix portion in the order determined by the ordering metric, and calculating the end point of the one or more list positions by adding the number of data items having prefix portions including the search prefix portion to the start point; and requesting the data items including the search prefix portion in their prefix portion from the plurality of storage units identified in the calculated one or more list positions, wherein the ordered list is periodically updated based on information received from the plurality of storage units, and, after each update, each occupied list position in the ordered list is succeeded by a predetermined number of vacant list positions; and the process further comprises, when a new data item is added to the set of data items or a duplicate set, adding the identifying information of a storage unit upon which the new data item is stored to a vacant list position in the respective ordered list, the vacant list position being determined by applying the ordering metric to the prefix portion of the new data item in relation to the prefix portions of the respective occupied list positions by referring to the frequency record, such that the immediately preceding occupied list position stores the identifying information of a data item having a prefix preceding or equal to the prefix of the new data item according to the ordering metric, and the immediately following occupied list position stores the identifying information of a data item having a prefix following or equal to the prefix of the new data item according to the ordering metric. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A method for obtaining range query results from a database of information encoded as a set of data items distributed among a plurality of storage units distinguishable from one another by respective identifying information and each data item comprising a value of each of three data elements, the set of data items having a prescribed order of the three data elements of which the first one or two data elements are assigned as a prefix portion, the plurality of storage units being connected to a database controller storing:
-
an ordered list comprising, for each of the data items from the set of data items individually, the identifying information of a storage unit upon which the data item is stored, the order of the identifying information being determined by an ordering metric applied to prefixes of the respective data items; and a frequency record, recording values of the data elements assigned as prefix portions of the set of data items and the number of data items having each prefix portion, wherein the frequency record is in an order determined by applying the ordering metric to the prefix portions of the data items; the method comprising; receiving a range query specifying a value of the or each of one or more data elements as a search prefix portion which each data item returned as a range query result must include in its prefix portion; calculating, based on a frequency record, the one or more list positions in the ordered list at which the storage unit IDs of the plurality of storage units upon which data items including the search prefix portion in their prefix portion are stored, comprising calculating a start point of the one or more list positions by summing the numbers of data items having prefix portions preceding the search prefix portion in the order determined by the ordering metric, and calculating the end point of the one or more list positions by adding the number of data items having prefix portions including the search prefix portion to the start point; and requesting data items including the search prefix portion in their prefix portion from the plurality of storage units identified in the calculated one or more list positions, wherein the ordered list is periodically updated based on information received from the plurality of storage units, and, after each update, each occupied list position in the ordered list is succeeded by a predetermined number of vacant list positions; and the method further comprises, when a new data item is added to the set of data items or a duplicate set, adding the identifying information of the storage unit upon which the new data item is stored to a vacant list position in the respective ordered list, the vacant list position being determined by applying the ordering metric to the prefix portion of the new data item in relation to the prefix portions of the respective occupied list positions by referring to the frequency record, such that the immediately preceding occupied list position stores the identifying information of a data item having a prefix preceding or equal to the prefix of the new data item according to the ordering metric, and the immediately following occupied list position stores the identifying information of a data item having a prefix following or equal to the prefix of the new data item according to the ordering metric.
-
-
12. A non-transitory storage medium storing a computer program which, when executed by one or more computing apparatuses, causes the one or more computing apparatuses to perform a method for obtaining range query results from a database of information encoded as a set of data items distributed among a plurality of storage units distinguishable from one another by respective identifying information and each data item comprising a value of each of three data elements, the set of data items having a prescribed order of the three data elements of which the first one or two data elements are assigned as a prefix portion, the plurality of storage units being connected to a database controller storing:
-
an ordered list comprising, for each of the data items from the set of data items individually, the identifying information of the storage unit upon which the data item is stored, the order of the identifying information being determined by an ordering metric applied to the prefixes of the respective data items; and a frequency record, recording the values of the data elements assigned as prefix portions of the set of data items and the number of data items having each prefix portion, wherein the frequency record is in an order determined by applying the ordering metric to the prefix portions of the data items; the method comprising; receiving a range query specifying a value of the or each of one or more data elements as a search prefix portion which each data item returned as a range query result must include in its prefix portion; calculating, based on a frequency record, the one or more list positions in the ordered list at which the storage unit IDs of the plurality of storage units upon which data items including the search prefix portion in their prefix portion are stored, comprising calculating a start point of the one or more list positions by summing the numbers of data items having prefix portions preceding the search prefix portion in the order determined by the ordering metric, and calculating the end point of the one or more list positions by adding the number of data items having prefix portions including the search prefix portion to the start point; and requesting data items including the search prefix portion in their prefix portion from the plurality of storage units identified in the calculated one or more list positions, wherein the ordered list is periodically updated based on information received from the plurality of storage units, and, after each update, each occupied list position in the ordered list is succeeded by a predetermined number of vacant list positions; and the method further comprises, when a new data item is added to the set of data items or a duplicate set, adding the identifying information of a storage unit upon which the new data item is stored to a vacant list position in the respective ordered list, the vacant list position being determined by applying the ordering metric to the prefix portion of the new data item in relation to the prefix portions of the respective occupied list positions by referring to the frequency record, such that the immediately preceding occupied list position stores the identifying information of a data item having a prefix preceding or equal to the prefix of the new data item according to the ordering metric, and the immediately following occupied list position stores the identifying information of a data item having a prefix following or equal to the prefix of the new data item according to the ordering metric.
-
Specification