User driven warehousing
Abstract
Approaches for user-driven warehousing are provided, wherein usage patterns for business intelligence applications are gathered, for example by automated recording. The gathered patterns allow jobs to be scheduled automatically so that jobs populating the most-used tables are prioritized and the data is up to date before it is generally accessed. The usage pattern analysis also allows for the automated identification of more focused data marts for particular situations. It further provides for automated data warehouse/data mart creation and customization based on usage patterns that may be used as a seed, as well as for on-the-fly latitudinal analysis across prepackaged domain-specific applications.
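The prioritization of the most-used tables can be illustrated with a minimal sketch. It assumes the usage data is a list of (table name, access time) pairs and that priority is simply the access count within a look-back window; the table names, the `refresh_priority` function, and the seven-day window are hypothetical and are not taken from the patent text.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical usage log: (table name, access time). The layout is an assumption.
USAGE_LOG = [
    ("sales_fact", datetime(2024, 1, 8, 9, 5)),
    ("sales_fact", datetime(2024, 1, 9, 9, 10)),
    ("sales_fact", datetime(2024, 1, 10, 8, 55)),
    ("inventory_dim", datetime(2024, 1, 9, 14, 0)),
    ("hr_snapshot", datetime(2023, 11, 1, 10, 0)),  # falls outside the window used below
]

def refresh_priority(usage_log, now, window):
    """Rank tables by access count within the look-back window, most used first."""
    cutoff = now - window
    counts = Counter(table for table, ts in usage_log if cutoff <= ts <= now)
    # Sort by descending count, then by name so that ties are deterministic.
    return sorted(counts, key=lambda t: (-counts[t], t))

# Assumed predetermined period: the seven days before 11 January 2024.
priority = refresh_priority(USAGE_LOG, datetime(2024, 1, 11), timedelta(days=7))
print(priority)
```

Tables with no accesses inside the window drop out of the ranking entirely; a real system might instead give them a low but non-zero priority.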
24 Claims
1. A non-transitory computer-readable storage medium storing one or more sequences of instructions for user driven warehousing, which when executed by one or more processors, causes:
- storing usage data identifying usage patterns of business intelligence applications, wherein the business intelligence applications access data tables, and wherein the usage data describes which data tables are being accessed along with the time and frequency of the accessing;
- accessing metadata describing the data model of the business intelligence applications;
- determining a refresh priority for the data tables, wherein the refresh priority is based on usage patterns of the data tables over a predetermined period of time; and
- automatically scheduling Extract, Transform, Load (ETL) jobs related to the data tables based on the refresh priority and clustering of the usage of tables by time to ensure that the data tables are refreshed before a time of their anticipated use.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. An apparatus for user driven warehousing, comprising:
- one or more processors; and
- one or more non-transitory computer-readable storage mediums storing one or more sequences of instructions, which when executed by the one or more processors, cause:
  - storing usage data identifying usage patterns of business intelligence applications, wherein the business intelligence applications access data tables, and wherein the usage data describes which data tables are being accessed along with the time and frequency of the accessing;
  - accessing metadata describing the data model of the business intelligence applications;
  - determining a refresh priority for the data tables, wherein the refresh priority is based on the usage patterns of the data tables over a predetermined period of time; and
  - automatically scheduling Extract, Transform, Load (ETL) jobs related to the data tables based on the refresh priority and clustering of the usage of tables by time to ensure that the data tables are refreshed before a time of their anticipated use.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A method for user driven warehousing, comprising:
- storing usage data identifying usage patterns of business intelligence applications, wherein the business intelligence applications access data tables, and wherein the usage data describes which data tables are being accessed along with the time and frequency of the accessing;
- accessing metadata describing the data model of the business intelligence applications;
- determining a refresh priority for the data tables, wherein the priority is based on usage patterns of the data tables over a predetermined period of time; and
- automatically scheduling Extract, Transform, Load (ETL) jobs related to the data tables based on the refresh priority and clustering of the usage of tables by time to ensure that the data tables are refreshed before a time of their anticipated use.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
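The scheduling step recited in the independent claims can be sketched as follows. The claims do not specify a clustering algorithm, so this example substitutes a deliberately simple one: accesses are bucketed by hour of day, and the busiest hour is treated as the table's anticipated time of use. The function names, the one-hour lead time, and the sample data are all hypothetical.

```python
from collections import Counter
from datetime import date, datetime, timedelta

def anticipated_use_hour(access_times):
    """Return the hour of day with the most accesses (the earliest hour wins ties)."""
    by_hour = Counter(ts.hour for ts in access_times)
    return min(by_hour, key=lambda h: (-by_hour[h], h))

def schedule_etl(usage_log, priority, day, lead=timedelta(hours=1)):
    """Plan one ETL start per table so the refresh begins `lead` before anticipated use."""
    accesses = {}
    for table, ts in usage_log:
        accesses.setdefault(table, []).append(ts)
    jobs = []
    for rank, table in enumerate(priority):
        if table not in accesses:
            continue  # no usage history, so there is no anticipated time to plan around
        hour = anticipated_use_hour(accesses[table])
        start = datetime(day.year, day.month, day.day, hour) - lead
        jobs.append((start, rank, table))
    # Earlier start times first; when starts collide, the higher-priority table goes first.
    jobs.sort()
    return [(table, start) for start, _, table in jobs]

# Hypothetical usage data and priority order (assumptions for illustration only).
log = [
    ("sales_fact", datetime(2024, 1, 8, 9, 5)),
    ("sales_fact", datetime(2024, 1, 9, 9, 10)),
    ("sales_fact", datetime(2024, 1, 10, 8, 55)),
    ("inventory_dim", datetime(2024, 1, 9, 14, 0)),
]
plan = schedule_etl(log, ["sales_fact", "inventory_dim"], date(2024, 1, 11))
for table, start in plan:
    print(table, start.isoformat())
```

A fixed lead time ignores how long each ETL job actually runs; a fuller version would subtract an estimated job duration per table rather than a constant.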
Specification