Method and system for providing business intelligence data

US 10,102,236 B2
Filed: 11/14/2012
Issued: 10/16/2018
Est. Priority Date: 11/15/2011
Status: Active Grant

Abstract

A computer implemented method for data mining and providing business intelligence data including generating by an analytics server one or more dimensions from source data imported from a computer readable medium, wherein the one or more dimensions define categories into which portions of the normalized data can be grouped; generating by the analytics server one or more measures from the source data linked to the one or more dimensions; storing by the analytics server the one or more dimensions and the one or more measures in a plurality of tables arranged in one of a snowflake and a star schema; determining by the analytics server relationship information between one or more measures and one or more dimensions in each of the plurality of tables; storing by the analytics server the relationship information on the computer readable medium; calculating by the analytics server a total cost of at least one product based on the relationship information; and, querying by a computer system in communication with the analytics server for the change in total cost of the at least one product based on a change in any one of the measures.
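The what-if query that closes the abstract can be illustrated with a minimal sketch. It assumes, as the claims state, that each component's cost relationship is held as a percentage of the total cost of the end product. The `ComponentCost` class, the `total_cost_change` function, and the sample figures are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentCost:
    """Hypothetical record: one component and its share of total product cost."""
    name: str
    share_of_total: float  # fraction of total cost, e.g. 0.5 for 50%


def total_cost_change(total_cost, components, component_name, relative_change):
    """Change in total cost when one component's cost moves by relative_change.

    Assumes the other components stay fixed, so the total moves by
    total_cost * share * relative_change.
    """
    shares = {c.name: c.share_of_total for c in components}
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("component shares must sum to 1")
    return total_cost * shares[component_name] * relative_change


# Illustrative figures only.
components = [
    ComponentCost("material", 0.5),
    ComponentCost("labor", 0.25),
    ComponentCost("energy", 0.25),
]
delta = total_cost_change(200.0, components, "material", 0.10)
new_total = 200.0 + delta
```

Under these assumptions, a 10% rise in a component that accounts for half of the total cost raises the total by 5%.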

10 Claims

1. A system for improved efficiency in the retrieval of business intelligence data used in data mining, the system comprising:
- an analytics server including a computer readable medium having a data file stored thereon, said data file consisting of source data aggregated from one or more data sources;
  
  wherein said analytics server includes computer readable instructions stored on said computer readable medium for;
  
  normalizing the source data to produce normalized data;
  
  generating one or more dimensions from said source data, wherein said one or more dimensions define categories into which portions of said normalized data can be grouped in a snowflake schema;
  
  generating one or more measures for each component used in producing an end product from said source data linked to said one or more dimensions in said snowflake schema;
  
  said measures comprising rate measures, allocation measures, and hierarchical structure measures;
  
  storing said one or more dimensions and said one or more measures in a plurality of tables arranged in a star schema;
  
  determining relationship information between said one or more measures and said one or more dimensions in each of said plurality of tables;
  
  filtering said plurality of tables to generate a plurality of independent fact tables and adding said independent fact tables to a fact table pool;
  
  each of said plurality of independent fact tables selected from a category fact table, a time aggregated fact table, and a generalized fact table;
  
  generating, from the normalized data, a master facts table containing data for two or more categories;
  
  generating into the pool, a plurality of baby fact tables, each comprising a subset of the master facts table;
  
  generating a plurality of cubes from the baby fact tables, the cubes aggregating data in the baby fact tables by at least one of the categories;
  
  aggregating data in the baby fact tables by at least one of the categories;
  
  receiving a query;
  
  upon receiving the query, searching for the most specific baby fact table available in the pool to satisfy the query;
  
  failing to find the most specific baby fact table;
  
  upon said failing to find the most specific baby fact table, recording a miss, creating a new cube and recording the cube creation time for the new cube; and
  
  based on said recording of the miss, pre-generating the most specific baby fact table for use in subsequent queries, wherein, the cube creation time using baby fact tables is smaller than the cube creation time using the master fact tables, thereby speeding up the generation of the cubes;
  
  creating additional independent fact tables based on said pool statistics and adding said additional independent fact tables to said fact table pool;
  
  determining a plurality of relationships between each component and said end product and storing said relationships in each of said plurality of tables;
  
  each relationship comprises a cost relationship as a percentage of a total cost required to produce said end product;
  
  storing said relationship information on said computer readable medium;
  
  calculating a total cost of at least one product based on said cost relationship information;
  
  one or more computing devices in communication with said analytics server, and including a module stored on a further computer readable medium having instructions thereon for;
  
  submitting said at least one query and receiving data from said most specific fact table from said analytics server;
  
  wherein said at least one query comprises querying for the change in total cost of said at least one product based on a change in any one of said measures;
  
  and wherein a clustered index of each of said independent fact tables is cached in memory for faster query processing.
- Dependent claims (2, 3, 4, 5, 10):
  - 2. The system according to claim 1, wherein said step of determining relationship information comprises determining relationship information between any one of said one or more measures and an additional any one of said one or more measures selected from said one or more measures linked to the same dimension.
  - 3. The system according to claim 1, wherein said step of determining relationship information comprises determining relationship information between any one of said one or more measures and an additional any one of said one or more measures selected from one or more measures linked to a different dimension.
  - 4. The system according to claim 1, wherein said relationship information includes each said costs, and said computer readable instructions further include storing said costs in a costs table.
  - 5. The system according to claim 4, wherein said computer readable instructions further includes instructions for responding to a query by said one or more computing devices by determining a total product cost based on an identified change in one or more measures and said percentage of said total cost as stored in said costs table.
  - 10. The method according to claim 4, further comprising responding by said analytics server to a query by said one or more computing devices by determining a total product cost based on an identified change in one or more measures and said percentage of said total cost as stored in said costs table.
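The pool lookup, miss recording, and pre-generation steps of claim 1 can be sketched as follows. This is one possible reading, not the patented implementation: it assumes a baby fact table is identified by a set of column filters, that the "most specific" table is the stored one with the largest filter set still covered by the query, and that a miss is any query without an exact match. The `FactTablePool` class, its method names, and the sample rows are all hypothetical.

```python
import time
from collections import Counter


class FactTablePool:
    """Hypothetical pool of baby fact tables cut from one master fact table."""

    def __init__(self, master_rows):
        self.master_rows = master_rows
        self.baby_tables = {}     # frozenset of (column, value) filters -> rows
        self.misses = Counter()   # filter set -> number of recorded misses
        self.cube_times = {}      # filter set -> last cube creation time (s)

    def _key(self, filters):
        return frozenset(filters.items())

    def _select(self, rows, key):
        return [r for r in rows if all(r[c] == v for c, v in key)]

    def add_baby_table(self, filters):
        key = self._key(filters)
        self.baby_tables[key] = self._select(self.master_rows, key)

    def most_specific(self, key):
        """Largest stored filter set that is still a subset of the query's."""
        usable = [k for k in self.baby_tables if k <= key]
        return max(usable, key=len) if usable else None

    def query(self, filters, group_by, measure):
        key = self._key(filters)
        best = self.most_specific(key)
        if best != key:
            self.misses[key] += 1  # no exact table: record a miss
        source = self.master_rows if best is None else self.baby_tables[best]
        start = time.perf_counter()
        cube = {}
        for row in self._select(source, key):
            cube[row[group_by]] = cube.get(row[group_by], 0) + row[measure]
        self.cube_times[key] = time.perf_counter() - start
        return cube

    def pregenerate_missed(self):
        """Build an exact baby table for every filter set that missed."""
        for key in self.misses:
            if key not in self.baby_tables:
                self.baby_tables[key] = self._select(self.master_rows, key)


# Illustrative data only.
master = [
    {"region": "EU", "product": "A", "cost": 10},
    {"region": "EU", "product": "B", "cost": 5},
    {"region": "US", "product": "A", "cost": 7},
]
pool = FactTablePool(master)
pool.add_baby_table({"region": "EU"})
first = pool.query({"region": "EU", "product": "A"}, "product", "cost")   # miss
pool.pregenerate_missed()
second = pool.query({"region": "EU", "product": "A"}, "product", "cost")  # exact hit
```

The first query falls back to the broader `region` table and records a miss. After `pregenerate_missed()` runs, the same query is served from a table that already holds only the matching rows, which is the effect the claim attributes to building cubes from baby fact tables instead of the master fact table.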

6. A computer implemented method for improving efficiency and retrieval of business intelligence data used in data mining comprising:
- normalizing the source data to produce normalized data;
  
  generating by an analytics server one or more dimensions from source data imported from a computer readable medium, wherein said one or more dimensions define categories into which portions of said normalized data can be grouped in a snowflake schema;
  
  generating by said analytics server one or more measures for each component used in producing an end product from said source data linked to said one or more dimensions in said snowflake schema;
  
  said measures comprising rate measures, allocation measures, and hierarchical structure measures;
  
  storing by said analytics server said one or more dimensions and said one or more measures in a plurality of tables arranged in a star schema;
  
  determining by said analytics server relationship information between said one or more measures and said one or more dimensions in each of said plurality of tables;
  
  filtering said plurality of tables to generate a plurality of independent fact tables and adding said independent fact tables to a fact table pool;
  
  each of said plurality of independent fact tables selected from a category fact table, a time aggregated fact table, and a generalized fact table;
  
  generating, from the normalized data, a master facts table containing data for two or more categories;
  
  generating into the pool, a plurality of baby fact tables, each comprising a subset of the master facts table;
  
  generating a plurality of cubes from the baby fact tables, the cubes aggregating data in the baby fact tables by at least one of the categories;
  
  aggregating data in the baby fact tables by at least one of the categories;
  
  receiving a query;
  
  upon receiving the query, searching for the most specific baby fact table available in the pool to satisfy the query;
  
  failing to find the most specific baby fact table;
  
  upon said failing to find the most specific baby fact table, recording a miss, creating a new cube and recording the cube creation time for the new cube; and
  
  based on said recording of the miss, pre-generating the most specific baby fact table for use in subsequent queries, wherein, the cube creation time using baby fact tables is smaller than the cube creation time using the master fact tables, thereby speeding up the generation of the cubes;
  
  creating additional independent fact tables based on said pool statistics and adding said additional independent fact tables to said fact table pool;
  
  determining a plurality of relationships between each component and said end product and storing said relationships in each of said plurality of tables;
  
  each relationship comprises a cost relationship as a percentage of a total cost required to produce said end product;
  
  storing by said analytics server said relationship information on said computer readable medium;
  
  calculating by said analytics server a total cost of at least one product based on said cost relationship information;
  
  submitting said at least one query by a computer system in communication with said analytics server for the change in total cost of said at least one product based on said most specific fact table; and
  
  caching a clustered index of each of said independent fact tables in memory for faster query processing.
- Dependent claims (7, 8, 9):
  - 7. The method according to claim 6, wherein said step of determining relationship information comprises determining relationship information between any one of said one or more measures and an additional any one of said one or more measures selected from said one or more measures linked to the same dimension.
  - 8. The method according to claim 6, wherein said step of determining relationship information comprises determining relationship information between any one of said one or more measures and an additional any one of said one or more measures selected from one or more measures linked to a different dimension.
  - 9. The method according to claim 6, wherein said relationship information includes each said costs, and said computer readable instructions further include storing said costs in a costs table.
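The storing and aggregating steps of claim 6 rely on dimensions that define categories and on measures linked to those dimensions. A minimal sketch of that arrangement, assuming a single dimension table and a single fact table held as plain Python structures, is shown here. The table contents, column names, and the `roll_up` function are hypothetical illustrations and are not drawn from the patent.

```python
from collections import defaultdict

# Hypothetical dimension table: product key -> descriptive attributes.
product_dim = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Print"},
}

# Hypothetical fact table: each row holds a foreign key into the dimension
# table plus numeric measures.
fact_rows = [
    {"product_id": 1, "month": "2012-01", "cost": 120.0, "units": 10},
    {"product_id": 2, "month": "2012-01", "cost": 80.0, "units": 4},
    {"product_id": 3, "month": "2012-01", "cost": 15.0, "units": 30},
    {"product_id": 1, "month": "2012-02", "cost": 60.0, "units": 5},
]


def roll_up(facts, dimension, fk, attribute, measure):
    """Sum one measure over fact rows, grouped by a dimension attribute."""
    totals = defaultdict(float)
    for row in facts:
        group = dimension[row[fk]][attribute]
        totals[group] += row[measure]
    return dict(totals)


cost_by_category = roll_up(fact_rows, product_dim, "product_id", "category", "cost")
```

Each fact row carries only a key and its measures. The category used for grouping is looked up in the dimension table, which is what lets the same facts be aggregated along different dimensions.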

Current Assignee: pVelocity, Inc.
Original Assignee: pVelocity, Inc.
Inventors: Yeung, Vivien; Lu, Kang; Lee, Michael; Parousis, Bill; Zhang, Keling
Primary Examiner(s): Goldberg, Ivan R
Assistant Examiner(s): Stewart, Crystol

Application Number: US13/676,633
Publication Number: US 20130124241A1
Time in Patent Office: 2,162 Days
CPC Class Codes

G06F 16/2264   Multidimensional index stru...

G06F 16/2282   Tablespace storage structur...

G06F 16/2465   Query processing support fo...

G06F 16/283   Multi-dimensional databases...

G06F 16/285   Clustering or classification

G06F 16/951   Indexing; Web crawling tech...

G06Q 10/06   Resources, workflows, human...

G06Q 10/063   Operations research, analys...
