Enterprise data duplication identification
First Claim
1. A computer-implemented method for identifying duplicate data, the method comprising the steps, performed by a computer, of:
identifying one or more reference fields that include one or more data values;
retrieving the reference fields;
generating one or more reference fingerprint patterns;
transforming the reference fields into the one or more reference fingerprint patterns;
identifying one or more target fields that include one or more data values;
retrieving the target fields;
generating one or more target fingerprint patterns;
transforming the target fields into the one or more target fingerprint patterns;
comparing the one or more target fingerprint patterns with the one or more reference fingerprint patterns; and
determining an overlap between the one or more target fingerprint patterns and the one or more reference fingerprint patterns to identify duplicate data, wherein the one or more reference fingerprint patterns and one or more target fingerprint patterns include one or more letters and one or more numbers.
Abstract
Systems, methods, and computer program products are provided for identifying duplicate data. In one exemplary embodiment, there is provided a method for identifying duplicate data. The method may include identifying one or more reference fields that include one or more data values. The method may include retrieving the one or more reference fields and one or more data values. The method may also include transforming the one or more reference fields into one or more reference fingerprint patterns. The method may also include identifying one or more target fields that include one or more target field values. The method may also include retrieving the one or more target fields. The method may also include transforming the one or more target field values into one or more target fingerprint patterns. The method may also include comparing the one or more reference fingerprint patterns with the one or more target fingerprint patterns. The method may further include determining an overlap between the one or more reference fingerprint patterns and the one or more target fingerprint patterns.
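The abstract does not say how a field value becomes a fingerprint pattern, so the following is only a minimal sketch under stated assumptions. It assumes a pattern is a run-length encoding of character classes ("A" for letters, "N" for digits, other characters kept literally), which yields patterns made of letters and numbers, and it assumes "overlap" means the share of target patterns that also occur among the reference patterns. The names `char_class`, `fingerprint`, `field_patterns`, and `overlap` are hypothetical and do not come from the patent.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to a class symbol (assumed encoding, not from the source)."""
    if ch.isalpha():
        return "A"
    if ch.isdigit():
        return "N"
    return ch  # punctuation and whitespace are kept literally


def fingerprint(value: str) -> str:
    """Run-length encode the character classes of one data value."""
    return "".join(
        f"{cls}{len(list(run))}" for cls, run in groupby(value, key=char_class)
    )


def field_patterns(values) -> set:
    """Transform all data values of a field into a set of fingerprint patterns."""
    return {fingerprint(v) for v in values}


def overlap(reference_values, target_values) -> float:
    """Assumed measure: fraction of target patterns also present in the reference."""
    ref = field_patterns(reference_values)
    tgt = field_patterns(target_values)
    if not tgt:
        return 0.0
    return len(ref & tgt) / len(tgt)


# Hypothetical sample data for a reference field and a target field.
reference = ["AB-1234", "CD-5678", "EF-0001"]
target = ["XY-4321", "ZZ-9999", "12345"]

print(fingerprint("AB-1234"))      # A2-1N4
print(overlap(reference, target))  # 0.5
```

In this sketch a high overlap only suggests that the target field holds data shaped like the reference field; whether that counts as duplicate data would depend on a threshold that the abstract does not specify.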
16 Citations
15 Claims
1. A computer-implemented method for identifying duplicate data, the method comprising the steps, performed by a computer, of:
identifying one or more reference fields that include one or more data values;
retrieving the reference fields;
generating one or more reference fingerprint patterns;
transforming the reference fields into the one or more reference fingerprint patterns;
identifying one or more target fields that include one or more data values;
retrieving the target fields;
generating one or more target fingerprint patterns;
transforming the target fields into the one or more target fingerprint patterns;
comparing the one or more target fingerprint patterns with the one or more reference fingerprint patterns; and
determining an overlap between the one or more target fingerprint patterns and the one or more reference fingerprint patterns to identify duplicate data, wherein the one or more reference fingerprint patterns and one or more target fingerprint patterns include one or more letters and one or more numbers.
Dependent claims: 2, 3, 4, 5
6. A computer-readable medium containing instructions which when executed on a processor performs a method for identifying duplicate data, the method comprising:
identifying one or more reference fields that include one or more data values;
retrieving the reference fields;
generating one or more reference fingerprint patterns;
transforming the reference fields into the one or more reference fingerprint patterns;
identifying one or more target fields that include one or more data values;
retrieving the target fields;
generating one or more target fingerprint patterns;
transforming the target fields into the one or more target fingerprint patterns;
comparing the one or more target fingerprint patterns with the one or more reference fingerprint patterns; and
determining an overlap between the one or more target fingerprint patterns and the one or more reference fingerprint patterns to identify duplicate data, wherein the one or more reference fingerprint patterns and one or more target fingerprint patterns include one or more letters and one or more numbers.
Dependent claims: 7, 8, 9, 10
11. A system for identifying duplicate data, comprising:
one or more systems that include one or more target fields and data values; and
a duplication identification system comprising one or more processors, wherein the duplication identification system:
identifies one or more reference fields that include one or more data values;
retrieves the reference fields;
generates one or more reference fingerprint patterns;
transforms the reference fields into the one or more reference fingerprint patterns;
identifies one or more target fields that include one or more data values;
retrieves the target fields;
generates one or more target fingerprint patterns;
transforms the target fields into the one or more target fingerprint patterns;
compares the one or more target fingerprint patterns with the one or more reference fingerprint patterns; and
determines an overlap between the one or more target fingerprint patterns and the one or more reference fingerprint patterns to identify duplicate data, wherein the one or more reference fingerprint patterns and one or more target fingerprint patterns include one or more letters and one or more numbers.
Dependent claims: 12, 13, 14, 15
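The system claim describes one duplication identification system checking target fields held in one or more other systems. The sketch below illustrates that arrangement only; it is not the patented implementation. It assumes a fingerprint is a short hexadecimal digest of the whitespace- and case-normalized value, that target systems are exposed as plain dictionaries of field name to values, and that a field is reported once its overlap reaches a chosen threshold. The function names, the sample systems "crm" and "billing", and the 0.5 threshold are all hypothetical.

```python
import hashlib


def fingerprint(value: str) -> str:
    """Assumed transform: 12 hex characters of SHA-256 over the normalized value."""
    normalized = " ".join(value.split()).casefold()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def fingerprints(values) -> set:
    """Transform every data value of a field into its fingerprint."""
    return {fingerprint(v) for v in values}


def find_duplicates(reference_values, target_systems, threshold=0.5):
    """Return (system, field, overlap) for target fields that reach the threshold."""
    ref = fingerprints(reference_values)
    hits = []
    for system, fields in target_systems.items():
        for field, values in fields.items():
            tgt = fingerprints(values)
            # Overlap is taken here as the share of target fingerprints found in the reference.
            share = len(ref & tgt) / len(tgt) if tgt else 0.0
            if share >= threshold:
                hits.append((system, field, share))
    # Strongest candidates first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical reference field and target systems.
reference_customers = ["Acme Corp", "Globex  Inc", "Initech"]
target_systems = {
    "crm": {"account_name": ["acme corp", "Globex Inc", "Umbrella"]},
    "billing": {"payer": ["Initech", "INITECH"], "currency": ["USD", "EUR"]},
}

duplicates = find_duplicates(reference_customers, target_systems)
for system, field, share in duplicates:
    print(f"{system}.{field}: {share:.2f}")
```

With this sample data the "payer" and "account_name" fields are reported and "currency" is not. Comparing digests rather than raw values is a design choice of the sketch: it keeps the comparison to set operations on short strings, at the cost of missing near-duplicates that differ by more than case or spacing.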
Specification