Analytical database system that models data to speed up and simplify data analysis
Abstract
An analytical database system provides access to all of the data collected by an entity in interactive time. The analytical database system transforms relational database data. The relational database is denormalized and inverted such that the data fields of tables in the relational database are stored in separate files that contain a row number field and a single data field. At least one of the files, that contains repeating data values that are stored in successive rows, is compressed. The files include files with partition values and files with analytical data. Processing of the files is distributed by creating sub-rowsets of the files and by assigning the sub-rowsets to servers. The partial result sets are merged into a complete result set.
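The inversion and compression described in the abstract can be illustrated with a minimal sketch. It assumes in-memory Python lists in place of the separate files, and it assumes run-length encoding as the compression scheme; the abstract only says that files with repeating values in successive rows are compressed and does not name a method. The names `invert` and `rle_compress` and the sample rows are hypothetical.

```python
from itertools import groupby

def invert(rows, fields):
    """Turn row-oriented records into one (row_number, value) list per field."""
    return {
        field: [(n, row[i]) for n, row in enumerate(rows)]
        for i, field in enumerate(fields)
    }

def rle_compress(column):
    """Collapse successive rows holding the same value into
    (first_row, run_length, value) triples. Run-length encoding is an
    assumption here, not a scheme named by the source."""
    runs = []
    for value, group in groupby(column, key=lambda pair: pair[1]):
        group = list(group)
        runs.append((group[0][0], len(group), value))
    return runs

# Hypothetical denormalized table with a partition field and an analytical field.
rows = [
    ("East", 10), ("East", 20), ("East", 5),
    ("West", 7), ("West", 3),
]
columns = invert(rows, ["region", "sales"])
region_runs = rle_compress(columns["region"])
print(region_runs)  # [(0, 3, 'East'), (3, 2, 'West')]
```

A column sorted or clustered on the partition field benefits most, since each distinct value then occupies a single run.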
18 Claims
1. A method for providing an analytical business report that is based on data including data from tables, each table including a plurality of rows and a plurality of columns, each of the plurality of columns corresponding to a data field, the method comprising:
inverting said data such that said data fields are stored in separate files that include a row number field and data values corresponding to a single data field, wherein said files include at least a first file with a partition field and a second file with an analytical field;
compressing at least one of said files, including files containing repeating data stored in successive rows;
traversing said at least one of said compressed files while said compressed file is stored in memory to directly retrieve data stored in said compressed file without decompressing said compressed file;
using said retrieved data to generate said analytical business report;
processing said partition field by creating sub-rowsets of said partition field and by assigning said sub-rowsets of said partition field to a first set of servers, wherein said first set of servers identify unique partition values contained in said sub-rowsets;
merging said unique partition values identified by said first set of servers;
transmitting said merged partition values to a second set of servers; and
processing said analytical field for said merged partition values by creating sub-rowsets of said analytical field and by assigning said sub-rowsets of said analytical field to said second set of servers.
Dependent claims: 2, 3, 4, 5, 6
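One way to picture the traversing step of claim 1, retrieving data from a compressed file without decompressing it, is a lookup that works on the compressed runs directly. The sketch below assumes a run-length-encoded column held in memory as `(first_row, run_length, value)` triples; that layout, the sample data, and the function names are illustrative assumptions, not details taken from the claim.

```python
from bisect import bisect_right

# Hypothetical run-length-encoded column: (first_row, run_length, value).
runs = [(0, 3, "East"), (3, 2, "West"), (5, 4, "East")]
starts = [first_row for first_row, _, _ in runs]

def value_at(row):
    """Find the value for a row number by binary search over run starts,
    never expanding the runs back into one value per row."""
    i = bisect_right(starts, row) - 1
    if i < 0:
        raise IndexError(row)
    first_row, length, value = runs[i]
    if row >= first_row + length:
        raise IndexError(row)
    return value

def count_by_value():
    """Count rows per value by adding run lengths, one step per run."""
    counts = {}
    for _, length, value in runs:
        counts[value] = counts.get(value, 0) + length
    return counts

print(value_at(4))        # West
print(count_by_value())   # {'East': 7, 'West': 2}
```

Both operations cost time proportional to the number of runs (or its logarithm), not the number of rows, which is the practical payoff of reading the compressed form in place.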
7. A system for providing an analytical business report that is based on data including data from tables, each table including a plurality of rows and a plurality of columns, each of the plurality of columns corresponding to a data field, comprising:
a data storage device that stores denormalized and inverted data in separate files that contain a row number field and data values corresponding to a single data field, wherein said files include at least a first file with a partition field and a second file with an analytical field;
a client computer that requests a business report that requires a calculation involving said first file and said second file;
an application controller that is responsive to said request from said client computer;
a plurality of servers connected to said data storage device and said application controller, wherein said application controller distributes portions of said calculation to said servers, merges partial result sets that are generated by said servers and transmits said merged result sets to said client computer, wherein said first file is compressed, wherein said first file contains repeating data stored in successive rows, and wherein said servers traverse said first compressed file while said first compressed file is stored in memory to directly retrieve data stored in said first compressed file without decompressing said first compressed file, wherein said retrieved data is used to generate said analytical business report;
wherein said application controller distributes processing of said first file by creating sub-rowsets of said first file and by assigning said sub-rowsets of said first file to a first set of servers, wherein said first set of servers identify unique partition values in said sub-rowsets of said first file and merges said unique partition values identified by said first set of servers and transmits said merged partition values to a second set of servers; and
wherein said second set of servers process sub-rowsets of said second file using said merged partition values to create partial result sets and wherein said second set of servers transmit said partial result sets to said application controller.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15
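The first distribution phase recited in claim 7 (the application controller splits the first file into sub-rowsets, a first set of servers finds the unique partition values in each, and the results are merged) might be sketched as follows. Worker threads stand in for servers, contiguous row ranges stand in for sub-rowsets, and all function names are hypothetical; the claim does not prescribe how rows are divided.

```python
from concurrent.futures import ThreadPoolExecutor

def make_sub_rowsets(n_rows, n_servers):
    """Split row numbers 0..n_rows-1 into contiguous (start, stop) ranges."""
    size = max(1, -(-n_rows // n_servers))  # ceiling division, at least 1
    return [(s, min(s + size, n_rows)) for s in range(0, n_rows, size)]

def unique_values(column, start, stop):
    """Work done by one 'server': unique partition values in its sub-rowset."""
    return set(column[start:stop])

def merged_partition_values(column, n_servers):
    """Controller role: fan out sub-rowsets, then merge the partial sets."""
    sub_rowsets = make_sub_rowsets(len(column), n_servers)
    with ThreadPoolExecutor(max_workers=n_servers) as pool:
        partials = pool.map(lambda r: unique_values(column, *r), sub_rowsets)
        return sorted(set().union(*partials))

# Hypothetical partition field.
region = ["East", "East", "West", "North", "West", "East", "North"]
print(merged_partition_values(region, 3))  # ['East', 'North', 'West']
```

Sorting the merged values gives every later consumer the same ordering, which is convenient if the values are next turned into positions in a shared result structure.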
16. A method for providing an analytical business report that is based on data that includes tables each with a plurality of data fields, comprising the steps of:
inverting said data such that said data fields of said tables in said data are stored in files that contain a row number and a data field;
compressing at least one of said files;
distributing processing of said business report using a first set of servers; and
traversing said compressed file while said compressed file is stored in memory to directly retrieve data stored in said compressed file without decompressing said compressed file, wherein said retrieved data is used to generate said analytical business report;
wherein a first file includes a partition field and a second file includes an analytical field;
further comprising the steps of dividing rows of said first file into sub-rowsets;
assigning said sub-rowsets to said first set of servers to determine unique partition values;
merging said unique partition values generated by said first set of servers;
distributing a control structure related to said unique partition values to a second set of said servers;
dividing rows of said second file into sub-rowsets;
assigning said sub-rowsets to said second set of servers; and
calculating partial result sets using said control structure, said analytical values and said unique partition values in said second set of servers.
Dependent claims: 17, 18
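The final steps of claim 16 (a control structure related to the unique partition values, partial result sets computed per sub-rowset, and a merge) can be sketched as below. The claim does not define the control structure; this sketch assumes it is a mapping from each partition value to a slot in a fixed-length result array, and it assumes the calculation is a sum. The sub-rowsets are processed sequentially here for brevity, where the claim assigns them to a second set of servers. All names and data are hypothetical.

```python
def build_control(partition_values):
    """Assumed control structure: partition value -> slot in a result array."""
    return {value: slot for slot, value in enumerate(partition_values)}

def partial_result(control, partition_col, analytical_col, start, stop):
    """Work done by one 'server': sum the analytical field per partition
    value over rows start..stop-1 of its sub-rowset."""
    sums = [0] * len(control)
    for row in range(start, stop):
        sums[control[partition_col[row]]] += analytical_col[row]
    return sums

def merge_results(control, partials):
    """Add partial result sets slot by slot and label them with their values."""
    totals = [sum(slot) for slot in zip(*partials)]
    return {value: totals[slot] for value, slot in control.items()}

# Hypothetical partition field and analytical field.
region = ["East", "East", "West", "North", "West", "East"]
sales = [10, 20, 7, 4, 3, 5]

control = build_control(["East", "North", "West"])  # merged unique values
partials = [
    partial_result(control, region, sales, start, stop)
    for start, stop in [(0, 3), (3, 6)]
]
report = merge_results(control, partials)
print(report)  # {'East': 35, 'North': 4, 'West': 10}
```

Because every partial result uses the same slot layout, merging is a plain element-wise addition and needs no further lookups on partition values.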
Specification