Partitioned database model to increase the scalability of an information system
Abstract
A database includes data tables and indexes that are partitioned. Searches against the data table are performed in parallel over the multiple partitions. The indexes on each partition maintain indexes associated with the data on the given partition. Data tables storing string data include a string data file and index files for each word stored in the string data file.
19 Claims
1. A computer implemented method for managing a database that is configured for use with a processing system, the processing system performing the method comprising:
- maintaining a structured organization of data over a plurality of partitions within the database, each one of the plurality of partitions having a size limit;
- inserting new data into the structured organization of data;
- automatically adding a new partition when the inserted new data results in a size of the one of the plurality of partitions meeting or exceeding the respective size limit;
- assigning each one of the new partitions to respective processing resources of the processing system, where at least some of the partitions in the plurality of partitions are assigned to different processing resources; and
- indexing the data within each one of the plurality of partitions,
wherein the data of the structured organization is string data, the structured organization further including, for each partition,
- a string dataset including (a) a plurality of strings and (b) a table that includes (1) a plurality of first values, each of which indicate where a corresponding one of the plurality strings is located within the string dataset and (2) a plurality of second values, each of which indicate a length of the corresponding one of the plurality strings, and
- a plurality of index datasets, each one of the plurality of index datasets related to a word from the plurality of strings in the string dataset, each one of the plurality of index datasets storing (a) a record value that indicates a corresponding record in the table and (b) a position value that indicates where the word is located within the plurality strings,
wherein the string dataset is a string file and the plurality of index datasets is a plurality of index files that each have a corresponding file name, where record value(s) and position value(s) for each one of the plurality index files are based on the file name of the corresponding file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
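The per-partition layout recited in claim 1 (a string file, a table of locations and lengths, and one index file per word) can be illustrated with a minimal in-memory sketch. This is not taken from the patent's specification. The class name `StringPartition`, the use of a dictionary keyed by word in place of named index files, and the choice to measure a word's position from the start of its own string are all assumptions made for illustration.

```python
from collections import defaultdict


class StringPartition:
    """Hypothetical in-memory stand-in for one partition's string data."""

    def __init__(self):
        self.string_data = ""           # plays the role of the string file
        self.table = []                 # one (location, length) pair per stored string
        # Assumption: a dict key stands in for an index file name (the word).
        self.index = defaultdict(list)  # word -> [(record value, position value)]

    def insert(self, text):
        record = len(self.table)
        # First value: where the string starts; second value: its length.
        self.table.append((len(self.string_data), len(text)))
        self.string_data += text
        # Record where each word starts inside its own string (assumption).
        position = 0
        for word in text.split():
            position = text.index(word, position)
            self.index[word.lower()].append((record, position))
            position += len(word)
        return record

    def read(self, record):
        # Use the table entry to slice the string back out of the string file.
        offset, length = self.table[record]
        return self.string_data[offset:offset + length]


partition = StringPartition()
partition.insert("red apple")
partition.insert("green apple pie")
```

After the two inserts, the entry stored under "apple" holds one (record, position) pair per occurrence, and `read` recovers each full string from the table alone.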
17. A database-management system for managing a database, the database-management system comprising:
a processing system that includes plural different processing resources, the processing system configured to:
- maintain a structured organization of data over a plurality of partitions within the database, each one of the plurality of partitions having a size limit and being assigned to at least one of the plural different processing resources;
- insert new data into the structured organization of data;
- automatically add a new partition when the inserted new data results in a size of the one of the plurality of partitions meeting or exceeding the respective size limit;
- assign the new partition to at least one of the plural different processing resources; and
- maintain an index on each of the partitions for the data located on the associated partition,
wherein the data of the structured organization is string data, the structured organization further including, for each partition, a string dataset that includes (a) a plurality of strings and (b) a table that includes (1) a plurality of first values, each of which indicate where a corresponding one of the plurality strings is within the string dataset and (2) a plurality of second values, each of which indicate a length of the corresponding one of the plurality strings, and
wherein the index includes a plurality of index datasets, each one of the plurality of index datasets relating to a word in the plurality of strings, each one of the plurality of index datasets storing plural records that each comprise (a) a record index that indicates one of the plurality of first values in the table for the corresponding one of the plurality of strings and (b) a position index that indicates where the related word is located within the corresponding one of the plurality strings,
wherein the string dataset is a string file and the plurality of index datasets is a plurality of index files that each have a corresponding file name, where the record and position index data of each file is based on the name of the corresponding file.
Dependent claims: 18
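The insert, grow, and assign steps recited in claim 17 can be sketched as follows. The claim does not say how size is measured or how a processing resource is chosen. This sketch assumes size is a record count and assignment is round-robin. The class name `PartitionedStore` and the resource labels are hypothetical.

```python
class PartitionedStore:
    """Hypothetical store that adds a partition when the current one fills up."""

    def __init__(self, size_limit, resources):
        self.size_limit = size_limit      # assumption: limit counted in records
        self.resources = list(resources)  # labels for processing resources
        self.partitions = []              # each partition is a list of records
        self.assignment = []              # assignment[i] = resource for partition i
        self._add_partition()

    def _add_partition(self):
        # Assumption: round-robin choice of processing resource.
        resource = self.resources[len(self.partitions) % len(self.resources)]
        self.assignment.append(resource)
        self.partitions.append([])

    def insert(self, record):
        current = self.partitions[-1]
        current.append(record)
        # Add a new partition once the size meets or exceeds the limit.
        if len(current) >= self.size_limit:
            self._add_partition()


store = PartitionedStore(size_limit=2, resources=["cpu0", "cpu1"])
for value in range(5):
    store.insert(value)
```

With a limit of two records and two resources, five inserts leave three partitions that alternate between the two resources.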
19. A non-transitory computer readable storage medium storing computer-readable instructions for performing a string search against a database system storing string data over a plurality of partitions, each one of the partitions including:
- 1) a string file that includes (a) at least some of the string data, and (b) a table that includes a plurality of first values, each of which indicate where a corresponding one string of the string data is within the string file and a plurality of second values, each of which indicate a length of the corresponding one string, and
- 2) a plurality of index files that are each associated with a word or words within the string file of a corresponding partition, each one of the plurality of index files storing at least one record that comprises a reference indicator that indicates one of the plurality first values, and a position index that indicates where the related word is located within the corresponding one string,
the database system including at least one processor, the stored instructions comprising instructions configured to:
- execute a search, in parallel, over the multiple partitions, that is related to at least one word within the string data;
- locate, on at least one of the multiple partitions, at least one of the plurality of index files related to the at least one word;
- read at least one reference indicator and at least one corresponding position index from the located at least one of the plurality of index files; and
- retrieve, on the at least one of the multiple partitions, at least one string and/or word from the string file based on the read reference indicator and the at least one corresponding position index,
wherein at least some of the multiple partitions are assigned to different processing resources of the database system where the executed search over the multiple partitions and the retrieval of the at least one string and/or word are accomplished using the assigned processing resources of the respective partition,
wherein the plurality of index files each have a corresponding file name where the at least one record of a corresponding one of the plurality of index files includes reference indicator(s) and position index(es) for an indexed word that is based on the corresponding file name.
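The search steps recited in claim 19 (search in parallel, locate the index file for a word, read its records, retrieve the strings) can be sketched as below. This is an illustration under stated assumptions, not the patented implementation. Threads stand in for the assigned processing resources, dictionaries stand in for files, and the names `PARTITIONS`, `search_partition`, and `parallel_search` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: "strings" is the string file contents, "table" maps
# a record to (location, length), and "index" maps an index file name (the
# word) to (reference indicator, position index) records.
PARTITIONS = [
    {
        "strings": "red applegreen apple pie",
        "table": [(0, 9), (9, 15)],
        "index": {"red": [(0, 0)], "apple": [(0, 4), (1, 6)],
                  "green": [(1, 0)], "pie": [(1, 12)]},
    },
    {
        "strings": "apple tart",
        "table": [(0, 10)],
        "index": {"apple": [(0, 0)], "tart": [(0, 6)]},
    },
]


def search_partition(partition, word):
    """Locate the word's index entry, then pull each referenced string."""
    hits = []
    for record, position in partition["index"].get(word, []):
        offset, length = partition["table"][record]
        hits.append((partition["strings"][offset:offset + length], position))
    return hits


def parallel_search(partitions, word):
    """Run the per-partition search concurrently and merge results in order."""
    # Assumption: one worker thread per partition models its assigned resource.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        per_partition = pool.map(search_partition, partitions,
                                 [word] * len(partitions))
        return [hit for hits in per_partition for hit in hits]


results = parallel_search(PARTITIONS, "apple")
```

Each result pairs a retrieved string with the position of the searched word inside it. Because `pool.map` yields results in submission order, the merged list follows partition order.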
Specification