Updating a large dataset in an enterprise computer system
Abstract
A method of updating fields of records in a dataset mediated by a database management tool (DMT) that does not provide an API function for updating individual fields of records. The method comprises adding a temporary field to each record in a dataset by the DMT, copying a subset of the records in the dataset by an application that is not the DMT, changing at least one field in each of the copied subset of records by the application, and changing the temporary field of the copied subset of records by the application. The method further comprises adding the subset of records to the dataset by the DMT and aggregating, by the DMT, the dataset based on a selection criterion defined with reference to the temporary field, wherein aggregating removes conflicts between records that have the same unique identifier based on the temporary field values of the conflicting records.
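The sequence summarized in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the dataset is modeled as a list of dictionaries, and the function name `update_by_append_and_aggregate`, the field names `id`, `clicks`, and `_tmp`, and the marker values 0 and 1 are all hypothetical choices made for illustration.

```python
import copy

def update_by_append_and_aggregate(dataset, ids_to_update, change_record,
                                   temp_field="_tmp", first_value=0, second_value=1):
    """Update records without editing them in place (illustrative sketch only)."""
    # Add the temporary field to every record, holding the default (first) value.
    records = [{**rec, temp_field: first_value} for rec in dataset]
    # Copy the subset of records that needs to change.
    copies = [copy.deepcopy(r) for r in records if r["id"] in ids_to_update]
    # Change the copies and mark them with the second value.
    for rec in copies:
        change_record(rec)
        rec[temp_field] = second_value
    # Add the changed copies to the dataset; each now conflicts with its original.
    records.extend(copies)
    # Aggregate: keep one record per unique identifier, preferring marked copies.
    preferred = {}
    for rec in records:
        kept = preferred.get(rec["id"])
        if kept is None or (rec[temp_field] == second_value
                            and kept[temp_field] != second_value):
            preferred[rec["id"]] = rec
    # Delete the temporary field, leaving the updated dataset.
    return [{k: v for k, v in rec.items() if k != temp_field}
            for rec in preferred.values()]

def add_click(rec):
    rec["clicks"] += 1

existing = [{"id": 1, "clicks": 3}, {"id": 2, "clicks": 5}, {"id": 3, "clicks": 0}]
updated = update_by_append_and_aggregate(existing, {2}, add_click)
print(updated)
```

Only the record with identifier 2 changes; the original list is left untouched because every step works on copies.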
15 Claims
1. A method for creating, by an application executing on a computer, an updated dataset from an existing dataset, the existing dataset comprising a plurality of existing records, each existing record comprising a plurality of fields, at least one field in the plurality of fields being a unique identifier field, the unique identifier field comprising a unique identifier, wherein the existing dataset comprises records of user interactions with mobile communication devices, the method comprising:
adding, by a database management tool executing on a computer, a temporary field to each existing record in a dataset by adding a new column, the temporary field containing a first value defined as a default value for the new column, wherein the database management tool does not provide an application programming interface (API) function for updating individual fields of records, and wherein the database management tool supports fast access to records in a targeted advertising system;
copying a subset of the records in the dataset, by an application executing on a computer;
changing, by the application, at least one individual field in each of the copied subset of records to update the records;
changing, by the application, the temporary field to a second value different from the first value in the copied subset of records;
adding, by the database management tool, the subset of records to the dataset;
aggregating, by the database management tool, the dataset based on a selection criterion defined with reference to the temporary field, wherein aggregating comprises identifying conflicts that comprise records that each have a unique identifier that is the same, identifying a preferred record for each conflict based on the criterion and based on the temporary field values of the conflicting records, and deleting records for each conflict that are not preferred;
deleting, by the database management tool, the temporary field from the records in the dataset, thereby creating an updated dataset; and
periodically creating, by the application, an aggregated set of records based on the updated dataset, wherein each time the aggregated set of records is created, the aggregated set of records is saved as a partition of the dataset, each partition being saved for a multiple of a set period of time.
Dependent claims: 2, 3, 4, 5
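The final step of claim 1 (periodic aggregated sets saved as partitions, each kept for a multiple of a set period) might be sketched as follows. This reads "saved for a multiple of a set period of time" as a retention window, which is an interpretive assumption; the function `save_partition`, the one-day period, and the multiple of 3 are hypothetical.

```python
from datetime import date, timedelta

def save_partition(partitions, created_on, aggregated_records,
                   period=timedelta(days=1), retention_multiple=3):
    """Store a new partition and drop partitions older than the retention window.

    Assumption: the retention window is retention_multiple * period.
    """
    partitions[created_on] = list(aggregated_records)
    cutoff = created_on - retention_multiple * period
    for key in [k for k in partitions if k <= cutoff]:
        del partitions[key]
    return partitions

# Simulate ten daily runs; each run saves one aggregated set as a partition.
partitions = {}
start = date(2024, 1, 1)
for day in range(10):
    run_date = start + timedelta(days=day)
    save_partition(partitions, run_date, [{"id": 1, "day": day}])

print(sorted(partitions))
```

With a one-day period and a multiple of 3, only the three most recent partitions remain after the tenth run.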
6. A method for creating, by the application executing on the computer, a series of partitions of a dataset, the dataset comprising an existing partition, the existing partition comprising a plurality of existing records, each existing record comprising a plurality of fields, at least one field in the plurality of fields being a unique identifier field, the unique identifier field comprising a unique identifier, and a temporary field storing a temporal value, the method comprising:
adding, by a database management tool executing on a computer, the temporary field to each existing record in the dataset by adding a new column, the temporary field containing a first value defined as a default value for the new column, wherein the database management tool does not provide an application programming interface (API) function for updating individual fields of records;
copying a subset of the records in the dataset, by an application executing on a computer;
changing, by the application, at least one individual field in each of the copied subset of records to update the records;
changing, by the application, the temporary field to a second value different from the first value in the copied subset of records;
adding, by the database management tool, the subset of records to the dataset;
aggregating, by the database management tool, the dataset based on a selection criterion defined with reference to the temporary field, wherein the aggregating creates a new partition comprising aggregated records, wherein the aggregating comprises identifying conflicts that comprise records that each have a unique identifier that is the same, identifying a preferred record for each conflict based on the criterion and based on the temporary field values of the conflicting records, and deleting records for each conflict that are not preferred;
deleting, by the database management tool, the temporary field from the records in the dataset, thereby creating an updated dataset; and
subsequent to creating the updated dataset, periodically creating additional partitions by using the updated dataset that was created.
Dependent claims: 7, 8, 9, 10, 11
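In claim 6 the aggregation writes its result to a new partition instead of rewriting the existing one. A rough sketch of that variant is shown below, assuming the dataset is a dictionary from partition name to record list and that the selection criterion is "highest temporary-field value wins". The names `aggregate_into_new_partition`, `p0`, `p1`, and `_tmp` are hypothetical.

```python
from collections import defaultdict

def aggregate_into_new_partition(dataset, source, target, temp_field="_tmp"):
    """Resolve identifier conflicts from one partition into a new partition."""
    # Group records by unique identifier to expose conflicts.
    groups = defaultdict(list)
    for rec in dataset[source]:
        groups[rec["id"]].append(rec)
    conflicts = {uid: recs for uid, recs in groups.items() if len(recs) > 1}
    # Keep the preferred record per identifier (assumed criterion: largest
    # temporary-field value); non-preferred records are not carried over.
    dataset[target] = [max(recs, key=lambda r: r[temp_field])
                       for recs in groups.values()]
    return conflicts

dataset = {
    "p0": [
        {"id": "a", "score": 1, "_tmp": 0},
        {"id": "b", "score": 2, "_tmp": 0},
        {"id": "a", "score": 9, "_tmp": 1},  # updated copy of record "a"
    ]
}
conflicts = aggregate_into_new_partition(dataset, "p0", "p1")
print(dataset["p1"])
```

The source partition is left unchanged in this sketch, and the temporary field is still present in the new partition; removing it would be the separate deleting step recited in the claim.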
12. A method for creating, by the application executing on a computer, an updated dataset from an existing dataset, the existing dataset comprising a plurality of existing records, each existing record comprising a plurality of fields, at least one field in the plurality of fields being a unique identifier field, and one field in the plurality of fields being a temporal field, the unique identifier field comprising a unique identifier, the method comprising:
copying a subset of the records in the existing dataset, by a database management tool executing on a computer;
changing, by an application executing on a computer, at least one individual field in each of the copied subset of records to update the records;
changing, by the application, the temporal field to a value of a current date and different from an initial value in the copied subset of records;
adding, by the database management tool, the subset of records to the existing dataset;
aggregating, by the database management tool, the existing dataset based on a selection criterion defined with reference to the temporal field that prefers conflicting records having a value of a later date value, wherein aggregating comprises identifying conflicts that comprise records that each have a unique identifier that is the same, identifying a preferred record for each conflict based on the criterion and based on the temporal field values of the conflicting records, and deleting records for each conflict that are not preferred, thereby creating an updated dataset;
after aggregating, changing contents of temporal fields of at least some records in the updated dataset so that all the records in the updated dataset have a same content in temporal fields; and
after a set period of time, creating, by the application executing on the computer, a new aggregated set of records by using updated records that have been created since a previous update, wherein each time the new aggregated set of records is created, the new aggregated set of records is saved as a partition of the existing dataset, each partition being saved for a multiple of the set period of time.
Dependent claims: 13, 14, 15
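Claim 12 replaces the temporary marker with a temporal field: updated copies carry the current date, the later date wins each conflict, and the temporal fields are then made uniform. A minimal sketch under those assumptions follows; the function names, the `updated_on` and `plan` fields, and the fixed dates (used in place of a real current date so the example is reproducible) are hypothetical.

```python
from datetime import date

def aggregate_latest(records, temporal_field="updated_on"):
    """Keep, per unique identifier, the record with the latest temporal value."""
    latest = {}
    for rec in records:
        kept = latest.get(rec["id"])
        if kept is None or rec[temporal_field] > kept[temporal_field]:
            latest[rec["id"]] = rec
    return list(latest.values())

def normalize_temporal_field(records, value, temporal_field="updated_on"):
    """Give every record the same temporal value after aggregation."""
    return [{**rec, temporal_field: value} for rec in records]

initial = date(2024, 1, 1)
today = date(2024, 1, 2)  # stands in for the current date

existing = [
    {"id": 1, "plan": "free", "updated_on": initial},
    {"id": 2, "plan": "free", "updated_on": initial},
]
# Copy record 2, change a field, and stamp the copy with the current date.
copied = [{**existing[1], "plan": "paid", "updated_on": today}]

merged = aggregate_latest(existing + copied)
updated = normalize_temporal_field(merged, today)
print(updated)
```

Normalizing the temporal field resets the dataset so that the next round of updates can again be distinguished by date alone.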
Specification