SYSTEM AND METHOD FOR ASSESSING DATA ACCURACY
Abstract
Data from a plurality of data sources is provided to a multi-source data management system, which stores the data and provides it to a data accuracy system for purposes of assessing the accuracy of data records and of the individual fields within data records. Data accuracy scores may be stored at the data management system with the data records to which they pertain. Accuracy scores may be periodically recalculated and monitored, and alerts provided if an accuracy score changes by a predetermined amount over a given period of time. Also, data records may be provided by a data user for accuracy assessment, using other data records stored at the multi-source data management system.
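
The monitoring behavior described in the abstract could be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `score_change_alert`, the `(timestamp, score)` history format, and the choice to compare the latest score with the oldest score inside the window are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def score_change_alert(history, threshold, window):
    """Return True if the accuracy score moved by at least `threshold`
    within the trailing `window` of time.

    history   -- list of (timestamp, score) tuples, sorted oldest first
                 (hypothetical format, assumed for this sketch)
    threshold -- the "predetermined amount" of change that triggers an alert
    window    -- timedelta giving the "given period of time"
    """
    if not history:
        return False
    latest_time, latest_score = history[-1]
    # Keep only the scores recorded inside the trailing window.
    in_window = [(t, s) for t, s in history if latest_time - t <= window]
    # Compare the latest score with the oldest score still in the window.
    earliest_score = in_window[0][1]
    return abs(latest_score - earliest_score) >= threshold


if __name__ == "__main__":
    start = datetime(2020, 1, 1)
    history = [(start, 0.9), (start + timedelta(days=10), 0.6)]
    print(score_change_alert(history, 0.2, timedelta(days=30)))
```

Other comparison rules (for example, the maximum swing within the window) would fit the abstract equally well; the text does not specify one.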
116 Citations
27 Claims
1. A method for evaluating the accuracy of a data record, comprising:
storing, at a data storage system, a plurality of data records received from a plurality of different data sources, wherein the data records have varying degrees of trustworthiness, and wherein each of the stored data records are associated with an entity;
receiving a data record based on a specified entity, the received data record to be evaluated for accuracy;
retrieving a group of data records from the stored plurality of data records, each retrieved data record associated with the specified entity, each retrieved data record having a data field corresponding to a data field of the data record to be evaluated, and each retrieved data record associated with one of the data sources; and
calculating, at a data accuracy system, an accuracy score for the data field of the data record to be evaluated, comprising:
  determining a trustworthiness weight for at least each one of the retrieved group of data records based on the trustworthiness of the data source providing each one of the group of data records;
  determining a matching score for at least the corresponding data field in each of the retrieved group of data records, based on the degree of similarity between (1) data in the corresponding data field in each of the retrieved group of data records and (2) data in the data field of the data record to be evaluated; and
  combining, for the corresponding data field in each one of the retrieved group of data records, (1) the trustworthiness weight for that one of the retrieved group of records, with (2) the matching score for the corresponding data field in that one of the retrieved group of data records; and
  summing together the combined trustworthiness weight and matching score for every corresponding data field.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
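
The calculating step of claim 1 can be illustrated with a short sketch. Several details here are assumptions, since the claim does not fix them: "combining" is taken to mean multiplying the weight by the matching score, the degree of similarity is computed with Python's standard `difflib.SequenceMatcher`, and the names `SourceRecord`, `matching_score`, `accuracy_score`, and `source_weights` are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class SourceRecord:
    source: str      # name of the data source that supplied the record
    entity_id: str   # entity the record is associated with
    fields: dict     # field name -> field value (strings in this sketch)


def matching_score(stored_value, candidate_value):
    """Similarity in [0, 1] between two field values.
    SequenceMatcher is an assumed stand-in for the claimed matching score."""
    a = stored_value.strip().lower()
    b = candidate_value.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def accuracy_score(candidate, field_name, stored_records, source_weights):
    """Sum of (trustworthiness weight x matching score) over the retrieved
    records that share the candidate's entity and contain the field."""
    total = 0.0
    for record in stored_records:
        if record.entity_id != candidate.entity_id:
            continue  # not associated with the specified entity
        if field_name not in record.fields:
            continue  # no corresponding data field
        weight = source_weights.get(record.source, 0.0)
        match = matching_score(record.fields[field_name],
                               candidate.fields[field_name])
        total += weight * match  # "combining" assumed to be a product
    return total


if __name__ == "__main__":
    stored = [
        SourceRecord("registry", "e1", {"city": "Denver"}),
        SourceRecord("web_form", "e1", {"city": "Denvr"}),
    ]
    weights = {"registry": 0.5, "web_form": 0.25}
    candidate = SourceRecord("user", "e1", {"city": "Denver"})
    print(accuracy_score(candidate, "city", stored, weights))
```

The sum is left unnormalized because the claim only recites summing; dividing by the total weight would be a separate design choice.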
14. A system for evaluating the accuracy of a data record, comprising:
a data storage system for storing a plurality of data records received from a plurality of different data sources, wherein the data records have varying degrees of trustworthiness, and wherein each of the stored data records are associated with an entity;
a data accuracy system including a processor and a memory, the memory storing instructions that are executable by the processor and configure the data accuracy system to:
  receive a data record associated with a specified entity, the received data record to be evaluated for accuracy;
  retrieve a group of data records from the stored plurality of data records stored at the data storage system, each retrieved data record associated with the specified entity, each retrieved data record having a data field corresponding to a data field of the data record to be evaluated, and each retrieved data record associated with one of the data sources; and
  calculate an accuracy score for the data field of the data record to be evaluated, comprising:
    determine a trustworthiness weight for at least each one of the retrieved group of data records based on the trustworthiness of the data source providing each one of the group of data records;
    determine a matching score for at least the corresponding data field in each of the retrieved group of data records, based on the degree of similarity between (1) data in the corresponding data field in each of the retrieved group of data records and (2) data in the data field of the data record to be evaluated; and
    combine, for the corresponding data field in each one of the retrieved group of data records, (1) the trustworthiness weight for that one of the retrieved group of records, with (2) the matching score for the corresponding data field in that one of the retrieved group of data records; and
    sum together the combined trustworthiness weight and matching score for every corresponding data field.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
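
The storage and retrieval side of claim 14 might look like the following in-memory sketch. It is only an illustration under stated assumptions: the class name `DataStore`, its `store` and `retrieve` methods, and the dictionary-based record layout are hypothetical, and a real data storage system would presumably be a database rather than a Python dictionary.

```python
from collections import defaultdict


class DataStore:
    """Toy stand-in for the claimed data storage system (assumed design)."""

    def __init__(self):
        # entity id -> list of records received from the various sources
        self._by_entity = defaultdict(list)

    def store(self, entity_id, source, fields):
        """Store one record, tagged with the data source that provided it."""
        self._by_entity[entity_id].append(
            {"source": source, "fields": dict(fields)}
        )

    def retrieve(self, entity_id, field_name):
        """Return the records for the specified entity that have a data
        field corresponding to `field_name`."""
        return [
            record
            for record in self._by_entity.get(entity_id, [])
            if field_name in record["fields"]
        ]


if __name__ == "__main__":
    store = DataStore()
    store.store("e1", "registry", {"city": "Denver", "phone": "555-0100"})
    store.store("e1", "web_form", {"city": "Denver"})
    store.store("e2", "registry", {"city": "Boulder"})
    for record in store.retrieve("e1", "city"):
        print(record["source"], record["fields"]["city"])
```

Each retrieved record keeps its source name so that a trustworthiness weight can later be looked up per source.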
27. A method for assessing the accuracy of a data record maintained at a data source, comprising:
receiving, from one data source, a data record to be assessed, the data record to be assessed associated with an entity and comprising a plurality of data fields;
identifying other data records from one or more other data sources that are associated with the same entity as the entity associated with the data record to be assessed, each of the other data records having data fields corresponding to data fields in the data record to be assessed;
assigning an accuracy score to at least one data field of the data record to be assessed, comprising:
  assigning a trustworthiness weight to each of the data sources;
  assigning a matching score for the degree of similarity between (1) the data in the at least one data field of the data record to be assessed and (2) the data in a corresponding data field in each of the data records;
  calculating the accuracy score (DAS) for the at least one data field of the data record to be assessed, based on:
Specification