Intelligent data munging
First Claim
1. An intelligent data munging system comprising:
a data loader, executed by at least one hardware processor, to ascertain data that is to be transformed, and determine, based on an analysis of the ascertained data, a sample of the ascertained data, wherein the sample of the ascertained data is less than the ascertained data;
a data iterator, executed by the at least one hardware processor, to enrich the sample of the ascertained data;
a data transformer, executed by the at least one hardware processor, to determine features of the enriched sample of the ascertained data, and determine, based on the features of the enriched sample of the ascertained data, at least one transformation to be applied to the enriched sample of the ascertained data to transform the enriched sample of the ascertained data from a first format to a second format, wherein the at least one transformation is determined from a plurality of available transformations;
a data munging validator, executed by the at least one hardware processor, to validate the at least one determined transformation by analyzing a similarity of a transformation applied to previously transformed data; and
a script generator, executed by the at least one hardware processor, to generate, based on the validation of the at least one determined transformation, at least one script that is to be applied to the ascertained data to transform the ascertained data from the first format to the second format.
Abstract
According to examples, intelligent data munging may include ascertaining data that is to be transformed, and determining, based on an analysis of the ascertained data, a sample of the ascertained data. Intelligent data munging may further include enriching the sample of the ascertained data, determining features of the enriched sample of the ascertained data, and determining, based on the features, a transformation to be applied to the enriched sample of the ascertained data to transform the enriched sample of the ascertained data from a first format to a second format. Further, intelligent data munging may include validating the determined transformation, and generating, based on the validation of the determined transformation, a script that is to be applied to the ascertained data to transform the ascertained data from the first format to the second format.
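The flow summarized in the abstract can be illustrated with a minimal Python sketch. It assumes the data is a single column of date strings and that the two formats are US-style and ISO-style dates; the source specifies none of this. Every name here (`sample_rows`, `enrich`, `features`, `AVAILABLE`, `choose_transformation`, `generate_script`) is hypothetical, the enrichment step is reduced to whitespace stripping, and validation is omitted for brevity.

```python
import random
import re

# All names and patterns below are illustrative assumptions, not taken from the patent.

def sample_rows(rows, fraction=0.1, seed=0):
    """Pick a sample that is strictly smaller than the full data set."""
    if len(rows) < 2:
        raise ValueError("need at least two rows to take a smaller sample")
    k = max(1, min(len(rows) - 1, int(len(rows) * fraction)))
    return random.Random(seed).sample(rows, k)

def enrich(sample):
    """Stand-in enrichment: only strips surrounding whitespace."""
    return [value.strip() for value in sample]

def features(sample):
    """Feature = share of sampled values that match each known pattern."""
    patterns = {
        "us_date": r"\d{2}/\d{2}/\d{4}",
        "iso_date": r"\d{4}-\d{2}-\d{2}",
    }
    return {
        name: sum(bool(re.fullmatch(p, v)) for v in sample) / len(sample)
        for name, p in patterns.items()
    }

# A small set of available transformations, keyed by the dominant feature.
AVAILABLE = {
    "us_date": "datetime.strptime(value.strip(), '%m/%d/%Y').strftime('%Y-%m-%d')",
    "iso_date": "value.strip()",
}

def choose_transformation(feats):
    """Select the transformation whose triggering feature scores highest."""
    name = max(feats, key=feats.get)
    if feats[name] == 0:
        raise ValueError("no known pattern matches the sample")
    return name, AVAILABLE[name]

def generate_script(expression):
    """Emit a standalone script to run against the full data set."""
    return (
        "from datetime import datetime\n"
        "def transform(value):\n"
        f"    return {expression}\n"
    )

if __name__ == "__main__":
    rows = [" 01/02/2020", "03/04/2021 ", "12/31/1999"] * 10
    name, expression = choose_transformation(features(enrich(sample_rows(rows))))
    print(generate_script(expression))
```

The transformation is chosen from the small sample, while the generated script is meant to run against the full data set, which is the point of sampling first.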
20 Claims
1. An intelligent data munging system comprising:
a data loader, executed by at least one hardware processor, to ascertain data that is to be transformed, and determine, based on an analysis of the ascertained data, a sample of the ascertained data, wherein the sample of the ascertained data is less than the ascertained data;
a data iterator, executed by the at least one hardware processor, to enrich the sample of the ascertained data;
a data transformer, executed by the at least one hardware processor, to determine features of the enriched sample of the ascertained data, and determine, based on the features of the enriched sample of the ascertained data, at least one transformation to be applied to the enriched sample of the ascertained data to transform the enriched sample of the ascertained data from a first format to a second format, wherein the at least one transformation is determined from a plurality of available transformations;
a data munging validator, executed by the at least one hardware processor, to validate the at least one determined transformation by analyzing a similarity of a transformation applied to previously transformed data; and
a script generator, executed by the at least one hardware processor, to generate, based on the validation of the at least one determined transformation, at least one script that is to be applied to the ascertained data to transform the ascertained data from the first format to the second format.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for intelligent data munging, the method comprising:
ascertaining, by at least one hardware processor, data that is to be transformed;
determining, by the at least one hardware processor, based on an analysis of the ascertained data, a sample of the ascertained data, wherein the sample of the ascertained data is less than the ascertained data;
enriching, by the at least one hardware processor, the sample of the ascertained data;
determining, by the at least one hardware processor, features of the enriched sample of the ascertained data;
determining, by the at least one hardware processor, based on the features of the enriched sample of the ascertained data, at least one transformation to be applied to the enriched sample of the ascertained data to transform the enriched sample of the ascertained data from a first format to a second format, wherein the at least one transformation is determined from a plurality of available transformations;
validating, by the at least one hardware processor, the at least one determined transformation by analyzing a similarity of a transformation applied to previously transformed data; and
generating, by the at least one hardware processor, based on the validation of the at least one determined transformation, at least one script that is to be applied to the ascertained data to transform the ascertained data from the first format to the second format.
Dependent claims: 11, 12, 13, 14, 15, 16
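The validating step of claim 10 does not say how similarity is measured. One possible reading, sketched below in Python, compares the features of the current sample with the features of previously transformed data sets. It accepts a candidate transformation only if the closest earlier data set used the same one. The cosine measure, the `history` record layout, the `0.8` threshold, and the names `cosine` and `validate` are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def validate(candidate, feats, history, threshold=0.8):
    """Accept the candidate if the most similar earlier data set used it too."""
    if not history:
        return False
    best = max(history, key=lambda record: cosine(feats, record["features"]))
    similar_enough = cosine(feats, best["features"]) >= threshold
    return similar_enough and best["transformation"] == candidate

# Hypothetical log of earlier runs: features seen and transformation applied.
history = [
    {"features": {"us_date": 0.9, "iso_date": 0.1}, "transformation": "us_date_to_iso"},
    {"features": {"us_date": 0.0, "iso_date": 1.0}, "transformation": "identity"},
]

if __name__ == "__main__":
    current = {"us_date": 1.0, "iso_date": 0.0}
    print(validate("us_date_to_iso", current, history))
    print(validate("identity", current, history))
```

An empty history is treated here as a failed validation; a real system might instead fall back to manual review.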
17. A non-transitory computer readable medium having stored thereon machine readable instructions to provide intelligent data munging, the machine readable instructions, when executed, cause a processor to:
ascertain data that is to be transformed;
determine, based on an analysis of the ascertained data, a sample of the ascertained data, wherein the sample of the ascertained data is less than the ascertained data;
enrich the sample of the ascertained data;
determine features of the enriched sample of the ascertained data;
determine, based on the features of the enriched sample of the ascertained data, at least one transformation to be applied to the enriched sample of the ascertained data to transform the enriched sample of the ascertained data from a first format to a second format, wherein the at least one transformation is determined from a plurality of available transformations;
validate the at least one determined transformation by analyzing a similarity of a transformation applied to previously transformed data;
generate, based on the validation of the at least one determined transformation, at least one script that is to be applied to the ascertained data to transform the ascertained data from the first format to the second format;
generate, based on application of the script to the ascertained data, a transformed version of the ascertained data; and
identify, based on the transformation of the ascertained data, an anomaly in the ascertained data.
Dependent claims: 18, 19, 20
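The last two elements of claim 17, applying the script to the full data and identifying an anomaly from the result, can be sketched as follows. This is only one plausible interpretation: a value counts as an anomaly when the generated transformation cannot process it. The `transform` function stands in for a generated script, and both it and `apply_and_flag` are hypothetical names.

```python
from datetime import datetime

def transform(value):
    """Stand-in for a generated script: US-style date to ISO date."""
    return datetime.strptime(value.strip(), "%m/%d/%Y").strftime("%Y-%m-%d")

def apply_and_flag(rows, transform):
    """Apply the transformation to every row; collect rows it cannot handle."""
    transformed, anomalies = [], []
    for index, value in enumerate(rows):
        try:
            transformed.append(transform(value))
        except ValueError as error:
            transformed.append(None)  # keep row positions aligned with the input
            anomalies.append((index, value, str(error)))
    return transformed, anomalies

if __name__ == "__main__":
    rows = ["01/02/2020", "13/45/2020", "n/a"]
    transformed, anomalies = apply_and_flag(rows, transform)
    print(transformed)
    for index, value, reason in anomalies:
        print(f"row {index}: {value!r} could not be transformed ({reason})")
```

Failed rows are kept as `None` so that the transformed output stays aligned with the input, which makes it easy to trace each anomaly back to its source row.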
Specification