Usage-based optimization of network traffic and data warehouse size
Abstract
The present invention generally provides systems, methods, and articles of manufacture for maintaining a data warehouse having a plurality of fields updated with data from one or more data sources. Rather than automatically update every field of data available in the warehouse, a limited subset of fields identified through their involvement in queries issued against the warehouse are updated. By limiting the fields that are updated, the network bandwidth required to transmit the updates to the data warehouse may be reduced. Further, by removing fields from the data warehouse that are not in use, the size of the data warehouse may be reduced.
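The two effects described in the abstract (updating only recently queried fields, and removing fields that are not in use) can be illustrated with a minimal sketch. The names `UsageTracker`, `selective_update`, and `prune_unused`, the dictionary-based warehouse, and the 30-day window are all assumptions made for illustration; they do not come from the patent text.

```python
from datetime import datetime, timedelta


class UsageTracker:
    """Hypothetical helper: records when each field was last involved in a query."""

    def __init__(self):
        self.last_used = {}  # field name -> datetime of the most recent query

    def record_query(self, fields, when):
        # Keep only the most recent timestamp seen for each field.
        for field in fields:
            prev = self.last_used.get(field)
            if prev is None or when > prev:
                self.last_used[field] = when

    def fields_in_use(self, now, window):
        # A field is "in use" if it was queried within the window ending at `now`.
        cutoff = now - window
        return {f for f, t in self.last_used.items() if t >= cutoff}


def selective_update(warehouse, source_row, tracker, now, window):
    """Copy only recently queried fields from source_row into the warehouse."""
    active = tracker.fields_in_use(now, window)
    updated = [f for f in source_row if f in active]
    for f in updated:
        warehouse[f] = source_row[f]
    return sorted(updated)


def prune_unused(warehouse, tracker, now, window):
    """Remove fields that were not queried within the window; return their names."""
    active = tracker.fields_in_use(now, window)
    stale = sorted(f for f in warehouse if f not in active)
    for f in stale:
        del warehouse[f]
    return stale


tracker = UsageTracker()
now = datetime(2024, 1, 31)
tracker.record_query(["name", "balance"], datetime(2024, 1, 30))
tracker.record_query(["fax"], datetime(2023, 6, 1))

warehouse = {"name": "old", "balance": 0, "fax": "none"}
source_row = {"name": "new", "balance": 10, "fax": "555"}

updated = selective_update(warehouse, source_row, tracker, now, timedelta(days=30))
removed = prune_unused(warehouse, tracker, now, timedelta(days=30))
```

In this sketch only `name` and `balance` are refreshed, and `fax` is dropped from the warehouse because its last query falls outside the window.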
32 Claims
1. A method for maintaining a data warehouse having a plurality of fields for storing data from one or more data sources, comprising:
configuring a management component to manage updating the data warehouse on the basis of the usage characteristics indicating fields determined not to be in use, wherein the management component is configured to perform an operation, comprising:
monitoring queries issued against the data warehouse;
updating, based on the monitored queries, usage characteristics of one or more of the plurality of fields indicative of when the one or more fields were last involved in a query; and
selectively updating the data warehouse with data, from one or more of the data sources, for a limited subset of fields involved in the monitored queries within a first predetermined time period, as indicated by the usage characteristics;
wherein the configured management component performs the selective updating by differentiating between those fields in the data warehouse that are being queried within the first predetermined time period and those fields in the data warehouse that are not being queried within the first predetermined time period according to the usage characteristics.
Dependent claims: 2, 3, 4, 5, 6.
7. A method for maintaining a data warehouse, comprising:
receiving updates for a plurality of fields of the data warehouse, the updates comprising data from one or more data sources;
monitoring queries issued against the data warehouse;
identifying, based on the monitored queries, one or more fields of the data warehouse that have not been involved in queries for a predetermined time period;
sending, by a management component, a request to the one or more data sources to discontinue receiving updates for the identified fields, on the basis of having determined that the identified fields are not being queried and do not require updates; and
receiving updates for a limited subset of the plurality of fields of the data warehouse, the limited subset being exclusive of the identified fields included in the request.
Dependent claims: 8, 9, 10, 11, 12.
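The flow of claim 7 (identify fields that have gone unqueried for a period, ask the data source to stop sending them, then receive only the remaining fields) might be sketched as follows. The `DataSource` class, its `discontinue` and `next_update` methods, and the sample field names are hypothetical stand-ins; the claim does not specify any particular interface.

```python
from datetime import datetime, timedelta


def identify_stale_fields(all_fields, last_used, now, period):
    """Return fields not involved in any query during `period` ending at `now`."""
    cutoff = now - period
    return sorted(
        f for f in all_fields
        if f not in last_used or last_used[f] < cutoff
    )


class DataSource:
    """Hypothetical data source that pushes field updates to the warehouse."""

    def __init__(self, record):
        self.record = dict(record)
        self.discontinued = set()

    def discontinue(self, fields):
        # Honor a request to stop sending updates for the given fields.
        self.discontinued.update(fields)

    def next_update(self):
        # Send only the fields that have not been discontinued.
        return {f: v for f, v in self.record.items() if f not in self.discontinued}


source = DataSource({"name": "Ada", "fax": "555-0100", "pager": "555-0199"})
last_used = {
    "name": datetime(2024, 1, 30),
    "fax": datetime(2023, 6, 1),
    # "pager" has never been queried
}

stale = identify_stale_fields(
    source.record.keys(), last_used, datetime(2024, 1, 31), timedelta(days=30)
)
source.discontinue(stale)
update = source.next_update()
```

Because the filtering happens at the source in this sketch, the discontinued fields never cross the network, which is the bandwidth saving the abstract describes.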
13. A computer-readable storage medium containing a program which, when executed by a processor, performs operations for maintaining a data warehouse having a plurality of fields for storing data from one or more data sources, the operations comprising:
configuring a management component to manage updating the data warehouse on the basis of the usage characteristics indicating fields determined not to be in use, wherein the management component is configured to perform an operation, comprising:
monitoring queries issued against the data warehouse;
updating, based on the monitored queries, usage characteristics of one or more of the plurality of fields indicative of when the one or more fields were last involved in a query; and
selectively updating the data warehouse with data, from one or more of the data sources, for a limited subset of fields involved in the monitored queries within a first predetermined time period, as indicated by the usage characteristics;
wherein the configured management component performs the selective updating by differentiating between those fields in the data warehouse that are being queried within the first predetermined time period and those fields in the data warehouse that are not being queried within the first predetermined time period according to the usage characteristics.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21.
22. A database system comprising:
a data warehouse comprising fields of data containing data originating from one or more data sources;
one or more applications configured to issue queries against the data warehouse; and
a warehouse manager configured to:
monitor the queries issued against the data warehouse;
update, based on the monitored queries, usage characteristics of the fields indicative of when the fields were last involved in a query; and
selectively update the data warehouse with data, from one or more of the data sources, for a limited subset of fields involved in the monitored queries within a first predetermined time period, as indicated by the usage characteristics;
wherein the configured management component performs the selective updating by differentiating between those fields in the data warehouse that are being queried within the first predetermined time period and those fields in the data warehouse that are not being queried within the first predetermined time period according to the usage characteristics, whereby the warehouse manager limits the number of fields of the data warehouse that are updated with data from the one or more data sources based on usage of the fields in queries issued against the data warehouse.
Dependent claims: 23, 24, 25.
26. A database system comprising:
a data warehouse comprising physical fields of data containing data originating from one or more data sources;
a data repository abstraction component comprising logical fields mapped to corresponding physical fields of the data warehouse;
one or more applications configured to issue abstract queries against the data warehouse, the abstract queries based on logical fields of the data repository abstraction component; and
a warehouse manager configured to:
monitor the involvement of the logical fields in abstract queries issued by the one or more applications;
maintain usage characteristics indicative of when physical fields were last accessed in response to executing abstract queries containing the logical fields corresponding to the accessed physical fields; and
limit the number of physical fields of the data warehouse that are updated with data from the one or more data sources to the physical fields corresponding to the logical fields last involved in abstract queries within a predetermined period of time, as indicated by the corresponding usage characteristics.
Dependent claims: 27, 28, 29, 30.
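Claim 26 adds an abstraction layer: queries name logical fields, and usage is recorded against the physical fields they map to. A rough sketch of that bookkeeping is shown below. The mapping table, the `(table, column)` representation of a physical field, and both function names are assumptions for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical abstraction layer: logical field name -> physical (table, column).
LOGICAL_TO_PHYSICAL = {
    "Patient Name": ("patients", "name"),
    "Test Result": ("labs", "result"),
    "Fax Number": ("contacts", "fax"),
}


def record_abstract_query(logical_fields, when, last_accessed):
    """Record usage against the physical fields behind each logical field."""
    for logical in logical_fields:
        physical = LOGICAL_TO_PHYSICAL[logical]
        prev = last_accessed.get(physical)
        if prev is None or when > prev:
            last_accessed[physical] = when


def physical_fields_to_update(last_accessed, now, period):
    """Return physical fields accessed within `period` ending at `now`."""
    cutoff = now - period
    return sorted(p for p, t in last_accessed.items() if t >= cutoff)


last_accessed = {}
record_abstract_query(["Patient Name", "Test Result"], datetime(2024, 1, 30), last_accessed)
record_abstract_query(["Fax Number"], datetime(2023, 6, 1), last_accessed)

to_update = physical_fields_to_update(
    last_accessed, datetime(2024, 1, 31), timedelta(days=30)
)
```

Here `to_update` contains only the physical fields behind "Patient Name" and "Test Result"; the field behind "Fax Number" is excluded because its last access predates the window.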
31. A method for maintaining a data warehouse having a plurality of fields for storing data from one or more data sources, comprising:
monitoring queries issued against the data warehouse;
updating, based on the monitored queries, usage characteristics of one or more of the plurality of fields in the data warehouse indicative of when the one or more fields were last involved in a query;
issuing, by a management component, a fallout request to one or more of the data sources indicating fields determined not to be used according to the usage characteristics; and
updating, by the management component, the data warehouse with data, from one or more of the data sources, for a limited subset of fields involved in the monitored queries within a first predetermined time period, as indicated by the fields contained in the fallout request.
32. A method for maintaining a data warehouse having a plurality of fields for storing data from one or more data sources, comprising:
configuring a management component to manage updating the data warehouse on the basis of usage characteristics indicating which of the plurality of fields are in use, wherein the management component is configured to perform an operation, comprising:
monitoring queries issued against the data warehouse;
updating, based on the monitored queries, the usage characteristics of one or more of the plurality of fields indicative of when the one or more fields were last involved in a query;
receiving updates from the one or more data sources for the plurality of fields;
identifying, from the received updates, a limited subset of fields involved in the monitored queries as distinguished from fields not involved in the monitored queries;
wherein the identifying is done on the basis of the usage characteristics; and
selectively updating the data warehouse with data, from one or more of the data sources, for the identified limited subset of fields involved in the monitored queries.
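Unlike claim 7, claim 32 has the warehouse receive updates for all fields and then decide which to apply. One way this could look, assuming the set of in-use fields has already been derived from the usage characteristics, is sketched here. The function names and the dictionary-based data model are hypothetical.

```python
def split_updates(received, fields_in_use):
    """Partition received field updates into those for used and unused fields."""
    used = {f: v for f, v in received.items() if f in fields_in_use}
    unused = {f: v for f, v in received.items() if f not in fields_in_use}
    return used, unused


def apply_updates(warehouse, received, fields_in_use):
    """Apply only updates for in-use fields; return the names of skipped fields."""
    used, unused = split_updates(received, fields_in_use)
    warehouse.update(used)
    return sorted(unused)


# Assumed to come from the usage characteristics maintained by the manager.
fields_in_use = {"name", "balance"}

warehouse = {"name": "old", "balance": 0, "fax": "none"}
received = {"name": "new", "balance": 10, "fax": "555"}

skipped = apply_updates(warehouse, received, fields_in_use)
```

This variant does not reduce network traffic, since every update is still transmitted, but it limits the write work performed on the warehouse itself.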
Specification