Data consistency and rollback for cloud analytics
First Claim
1. A system for retrieving consistent datasets, comprising:
- a staging database for storing batches of data corresponding to a period of time, wherein a batch of data includes one or more distinct datasets; and
- a plurality of tenant devices, wherein each tenant device of the plurality of tenant devices includes a processor that executes instructions stored in memory to:
  - collect a current batch of data associated with a first period of time from one or more sources, wherein collection of the first batch of data includes instructions to collect new or changed data compared to a marked current batch of data that has been previously stored in memory,
  - assign identification information for the collected current batch of data, wherein the collected current batch of data includes at least one new or changed dataset compared to the marked current batch of data that is previously stored in a batch log,
  - store the collected current batch of data in the staging database, wherein the stored current batch of data does not overwrite previously stored batches of data listed in the batch log, and wherein a location associated with the stored current batch of data is updated in the batch log,
  - mark the collected current batch of data as the current batch of data in the batch log,
  - detect a rollback event, wherein the rollback event indicates that the marked current batch of data should not be used,
  - select a previously stored batch of data as the current batch using the batch log, wherein information for the previously stored batch of data is included in the batch log,
  - retrieve the previously stored batch of data from the staging database using the identification information in the batch log, wherein the retrieved previously stored batch of data is used to overwrite the first batch of data, and
  - delete information pertaining to the first batch of data from the batch log.
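The store, mark, and rollback steps recited in claim 1 can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the `BatchLog` class, its field names, and the batch identifiers are hypothetical, and a plain dictionary stands in for the staging database. One simplification is labeled in the code: on rollback the abandoned batch's data is discarded rather than overwritten in place.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BatchLog:
    """Hypothetical stand-in for the batch log plus staging database."""
    staging: Dict[str, dict] = field(default_factory=dict)  # batch id -> datasets
    entries: List[str] = field(default_factory=list)        # batch ids, oldest first
    current: Optional[str] = None                           # marked current batch

    def store(self, batch_id: str, datasets: dict) -> None:
        # Each batch gets its own key, so earlier batches are never overwritten.
        if batch_id in self.staging:
            raise ValueError(f"batch {batch_id!r} is already stored")
        self.staging[batch_id] = dict(datasets)
        self.entries.append(batch_id)
        self.current = batch_id  # mark the newly stored batch as current

    def rollback(self) -> dict:
        # Select the batch logged just before the current one.
        if len(self.entries) < 2:
            raise RuntimeError("no earlier batch to roll back to")
        abandoned = self.entries.pop()  # remove the bad batch from the log
        self.staging.pop(abandoned)     # simplification: discard its data
        self.current = self.entries[-1]
        return self.staging[self.current]


# Usage: store two batches, then roll back after a (simulated) rollback event.
log = BatchLog()
log.store("batch-1", {"accounts": [1, 2]})
log.store("batch-2", {"accounts": [1, 2, 3]})
restored = log.rollback()
```

After the rollback, `log.current` points at `"batch-1"` again and the log no longer lists `"batch-2"`. A real system would presumably persist the log and record a storage location per batch, as the claim describes.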
Abstract
An extract-transform-load (ETL) platform fetches consistent datasets in a batch for a given period of time and provides the ability to roll back that batch. The batch may be fetched for an interval of time, and the ETL platform may fetch new or changed data from different cloud or on-premise applications. It stores this data in the cloud or on-premise to build a data history. As the ETL platform fetches new data, the system does not overwrite existing data but instead creates new versions, so that the change history is preserved. If a business needs to roll back data for any reason, it can roll back to any previous batch.
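The versioning idea in the abstract (fetch only new or changed records, append them as new versions, and reconstruct the data as of any earlier batch) can be sketched as follows. The function names, the record layout, and the integer batch numbers are assumptions made for illustration and are not taken from the patent. Deleted records are not handled in this sketch.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]
History = Dict[str, List[Tuple[int, Record]]]  # key -> [(batch number, record)]


def changed_records(previous: Dict[str, Record],
                    fetched: Dict[str, Record]) -> Dict[str, Record]:
    """Return the fetched records that are new or differ from the previous batch."""
    return {key: rec for key, rec in fetched.items() if previous.get(key) != rec}


def append_versions(history: History, batch_no: int,
                    delta: Dict[str, Record]) -> None:
    """Append each changed record as a new version; older versions stay intact."""
    for key, rec in delta.items():
        history.setdefault(key, []).append((batch_no, rec))


def state_at(history: History, batch_no: int) -> Dict[str, Record]:
    """Rebuild the data as of a given batch from the stored versions."""
    state: Dict[str, Record] = {}
    for key, versions in history.items():
        eligible = [rec for n, rec in versions if n <= batch_no]
        if eligible:
            state[key] = eligible[-1]  # latest version at or before batch_no
    return state


# Usage: two batches, where the second changes "b" and adds "c".
history: History = {}
batch1 = {"a": {"amount": 10}, "b": {"amount": 5}}
batch2 = {"a": {"amount": 10}, "b": {"amount": 7}, "c": {"amount": 1}}

append_versions(history, 1, changed_records({}, batch1))
delta = changed_records(batch1, batch2)
append_versions(history, 2, delta)
rolled_back = state_at(history, 1)
```

Because versions are only ever appended, rolling back to batch 1 is a read over the history rather than a destructive restore.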
8 Claims
Specification