Methods and systems for data cleaning
First Claim
1. A computer-implemented method for cleaning data stored in a database, the method comprising:
- providing a data fixing rule to capture an error, the data fixing rule incorporating:
  - a first set of attributes and respective first set of attribute values;
  - a second attribute and respective second set of attribute values, wherein the second set of attribute values are erroneous values; and
  - a correct value;
- applying the data fixing rule to the database to capture the error, wherein the error is captured when the first set of attributes and attribute values, and the second attribute and at least one of the erroneous values of the second attribute match a record in the database; and
- replacing the at least one erroneous value in the record with the correct value.
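The three steps of the claim (provide a rule, match it against records, replace the erroneous value) can be illustrated with a minimal sketch. The claim does not prescribe any data structure, so the `FixingRule` class, its field names, the `apply_rule` helper, and the sample records below are all hypothetical, and records are assumed to be plain dictionaries rather than rows in a real database.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List


@dataclass
class FixingRule:
    """Hypothetical in-memory form of one data fixing rule."""
    evidence: Dict[str, Any]      # first set of attributes and their values
    target: str                   # the second attribute
    wrong_values: FrozenSet[Any]  # known erroneous values of the second attribute
    correct_value: Any            # value that replaces an erroneous one

    def matches(self, record: Dict[str, Any]) -> bool:
        # The error is captured only if every evidence attribute agrees
        # AND the target attribute holds one of the listed erroneous values.
        evidence_ok = all(record.get(a) == v for a, v in self.evidence.items())
        return evidence_ok and record.get(self.target) in self.wrong_values


def apply_rule(records: List[Dict[str, Any]], rule: FixingRule) -> int:
    """Fix matching records in place and return how many were changed."""
    fixed = 0
    for record in records:
        if rule.matches(record):
            record[rule.target] = rule.correct_value
            fixed += 1
    return fixed


# Illustrative data only (not taken from the claim text).
rule = FixingRule(
    evidence={"country": "China"},
    target="capital",
    wrong_values=frozenset({"Shanghai", "Hongkong"}),
    correct_value="Beijing",
)
rows = [
    {"name": "A", "country": "China", "capital": "Shanghai"},
    {"name": "B", "country": "China", "capital": "Beijing"},
    {"name": "C", "country": "Canada", "capital": "Shanghai"},
]
fixed_count = apply_rule(rows, rule)
```

In this sketch only the first record is changed: the second already holds a value outside the erroneous set, and the third does not match the evidence attributes, so the rule leaves both untouched.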
Abstract
A method for cleaning data stored in a database, the method comprising providing a set of fixing rules. Each fixing rule incorporates a set of attribute values that capture an error in a plurality of semantically related attribute values, and a deterministic correction which is operable to replace one of the set of attribute values with a correct attribute value to correct the error. The method further comprises comparing at least two of the fixing rules with one another to check that the error correction carried out by one fixing rule is consistent with the error correction carried out by another fixing rule.
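The abstract does not say how two fixing rules are compared for consistency. One possible reading, sketched below purely as an assumption, is that two rules are consistent if applying them in either order gives the same repaired record. The `FixingRule` class, the `repair` and `consistent_on` helpers, and the probe records are hypothetical, and the check only detects conflicts that are witnessed by the probe records supplied to it.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List


@dataclass
class FixingRule:
    """Hypothetical in-memory form of one fixing rule."""
    evidence: Dict[str, Any]      # attribute values that must agree
    target: str                   # attribute that may hold an error
    wrong_values: FrozenSet[Any]  # erroneous values of the target attribute
    correct_value: Any            # deterministic correction

    def matches(self, record: Dict[str, Any]) -> bool:
        evidence_ok = all(record.get(a) == v for a, v in self.evidence.items())
        return evidence_ok and record.get(self.target) in self.wrong_values


def repair(record: Dict[str, Any], rules: List[FixingRule]) -> Dict[str, Any]:
    """Apply rules in the given order; each rule fires at most once."""
    current = dict(record)   # work on a copy, leave the input untouched
    pending = list(rules)
    progress = True
    while progress:
        progress = False
        for rule in list(pending):
            if rule.matches(current):
                current[rule.target] = rule.correct_value
                pending.remove(rule)  # guarantees termination
                progress = True
    return current


def consistent_on(probes: List[Dict[str, Any]],
                  r1: FixingRule, r2: FixingRule) -> bool:
    """True if both application orders agree on every probe record."""
    return all(repair(p, [r1, r2]) == repair(p, [r2, r1]) for p in probes)


# Illustrative rules and probe records (assumptions, not from the patent text).
r1 = FixingRule({"country": "China"}, "capital",
                frozenset({"Shanghai", "Hongkong"}), "Beijing")
r2 = FixingRule({"country": "China"}, "capital",
                frozenset({"Shanghai"}), "Peking")      # conflicts with r1
r3 = FixingRule({"country": "Canada"}, "capital",
                frozenset({"Toronto"}), "Ottawa")       # independent of r1
probes = [
    {"country": "China", "capital": "Shanghai"},
    {"country": "Canada", "capital": "Toronto"},
]
conflict_free = consistent_on(probes, r1, r3)
conflicting = consistent_on(probes, r1, r2)
```

Here `r1` and `r2` disagree on the first probe record, because whichever rule runs first decides the final value, while `r1` and `r3` never touch the same record. A probe-based check like this can miss conflicts that no probe triggers, so it is a starting point rather than a proof of consistency.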
12 Claims
11. A system for cleaning data stored in a database, the system comprising a processor configured to:
- provide a data fixing rule to capture an error, the data fixing rule incorporating:
  - a first set of attributes and respective first set of attribute values;
  - a second attribute and respective second set of attribute values, wherein the second set of attribute values are erroneous values; and
  - a correct value;
- apply the data fixing rule to the database to capture the error, wherein the error is captured when the first set of attributes and attribute values, and the second attribute and at least one of the erroneous values of the second attribute match a record in the database; and
- replace the at least one erroneous value in the record with the correct value.
12. A non-transitory tangible computer readable storage medium comprising instructions for performing a process to be executed on a computer, the process comprising:
- providing a data fixing rule to capture an error, the data fixing rule incorporating:
  - a first set of attributes and respective first set of attribute values;
  - a second attribute and respective second set of attribute values, wherein the second set of attribute values are erroneous values; and
  - a correct value;
- applying the data fixing rule to the database to capture the error, wherein the error is captured when the first set of attributes and attribute values, and the second attribute and at least one of the erroneous values of the second attribute match a record in the database; and
- replacing the at least one erroneous value in the record with the correct value.
Specification