Process and method for data assurance management by applying data assurance metrics
First Claim
1. A data assurance management method comprising:
selecting a plurality of data elements based on user requirements;
conducting a statistical random sampling of the plurality of data elements;
scoring, by one or more processors, the statistical random sampling to determine absolute and relative value of one or more data metrics, wherein the one or more data metrics are measures of data quality dimensions, wherein the data quality dimensions are characteristics of the plurality of data elements;
determining one or more frontier data points;
selecting an optimal data aggregation based on the one or more frontier data points;
applying the optimal data aggregation to the statistical random sample; and
rank ordering the aggregated data to create an output database from resultant data where at least a portion of less relevant data is eliminated in the output database.
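The steps recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation, and every name in it is hypothetical. It assumes that a record's quality is measured by a single completeness metric, that a candidate "data aggregation" is a subset of data sources, that "frontier data points" are the candidates not dominated on quality and coverage (a Pareto frontier), and that the optimal aggregation is the frontier point with the highest equal-weight sum. The claim fixes none of these choices. The sketch also computes only absolute scores, not relative ones.

```python
import random
from itertools import combinations

def completeness(record, fields):
    """Absolute score: fraction of the selected fields that are populated."""
    return sum(record.get(f) not in (None, "") for f in fields) / len(fields)

def pareto_frontier(points):
    """Keep (name, x, y) points that no other point dominates (higher is better)."""
    def dominates(q, p):
        return q[1] >= p[1] and q[2] >= p[2] and (q[1] > p[1] or q[2] > p[2])
    return [p for p in points if not any(dominates(q, p) for q in points)]

def assure(records, fields, sample_size, keep_fraction, seed=0):
    """Hypothetical walk-through of the claimed steps; `fields` is the
    user-selected set of data elements and `records` must be non-empty."""
    # Simple random sample without replacement.
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))

    # Score every candidate aggregation (assumed: a subset of sources)
    # on mean completeness and on the share of the sample it covers.
    sources = sorted({r["source"] for r in sample})
    candidates = []
    for k in range(1, len(sources) + 1):
        for combo in combinations(sources, k):
            rows = [r for r in sample if r["source"] in combo]
            quality = sum(completeness(r, fields) for r in rows) / len(rows)
            coverage = len(rows) / len(sample)
            candidates.append((combo, quality, coverage))

    # Frontier points: candidates that no other candidate dominates.
    frontier = pareto_frontier(candidates)

    # Assumed selection rule: best equal-weight sum of the two metrics.
    best = max(frontier, key=lambda c: c[1] + c[2])

    # Apply the chosen aggregation to the sample.
    aggregated = [r for r in sample if r["source"] in best[0]]

    # Rank order by score and drop the lowest-ranked portion.
    ranked = sorted(aggregated, key=lambda r: completeness(r, fields), reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return frontier, best, ranked[:keep]

# Illustrative data only: two sources, one record with a missing email.
RECORDS = [
    {"source": "A", "name": "Ann", "email": "ann@example.com"},
    {"source": "A", "name": "Bob", "email": "bob@example.com"},
    {"source": "B", "name": "Cy", "email": "cy@example.com"},
    {"source": "B", "name": "Di", "email": ""},
]
```

Calling `assure(RECORDS, ["name", "email"], sample_size=10, keep_fraction=0.75)` returns the frontier, the chosen aggregation, and the ranked output with the lowest-scoring quarter removed. A production system would need several quality dimensions and a defensible selection rule in place of the equal-weight sum assumed here.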
Abstract
The present invention relates generally to methods, software and systems for measuring and valuing the quality of information and data, where such measurements and values are made and processed by implementing objectively defined, measurable, comparable and repeatable dimensions using software and complex computers. The embodiments include processes, systems and methods for identifying optimal scores of the data dimension. The invention further includes processes, systems and methods for data filtering to improve the overall data quality of a data source. Finally, the invention further includes processes, systems and methods for data quality assurance of groups of rows of a database.
271 Citations
15 Claims
Specification