SYSTEM FOR MATCHING PATTERN-BASED DATA
Abstract
A data matching system for pattern-based data generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The system analyzes an input dataset in terms of symbol classes, generalizing the input values into a general pattern so that overlap between input datasets can be identified or extrapolated. This aids in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the system computes a compact pattern describing the sequence. Embodiments of the system comprise noise reduction and repetitive pattern discovery in the input dataset, and calculation of recall and precision of the generated pattern.
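The generalization step in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the symbol classes (digits, ASCII letters, literal characters), the rule of keeping only the most common class sequence as a crude form of noise reduction, and the function names `tokenize`, `derive_pattern`, and `recall_precision` are all assumptions made for illustration. Precision is estimated here against a caller-supplied list of values that should not match, which is likewise an assumption.

```python
import re
import string
from collections import Counter
from itertools import groupby


def tokenize(value):
    """Split a value into (symbol_class, run_length) tokens."""
    def symbol_class(ch):
        if ch in string.digits:
            return r"\d"
        if ch in string.ascii_letters:
            return "[A-Za-z]"
        return re.escape(ch)  # any other character is kept as a literal
    return [(cls, len(list(run))) for cls, run in groupby(value, key=symbol_class)]


def derive_pattern(values):
    """Derive one regular expression from the dominant class sequence.

    Assumption: values whose class sequence differs from the most common
    one are treated as noise and ignored.
    """
    if not values:
        raise ValueError("need at least one value")
    tokenized = [tokenize(v) for v in values]
    shapes = Counter(tuple(cls for cls, _ in toks) for toks in tokenized)
    shape = shapes.most_common(1)[0][0]
    kept = [t for t in tokenized if tuple(cls for cls, _ in t) == shape]
    parts = []
    for i, cls in enumerate(shape):
        lengths = [toks[i][1] for toks in kept]
        lo, hi = min(lengths), max(lengths)
        if lo == hi == 1:
            quantifier = ""
        elif lo == hi:
            quantifier = "{%d}" % lo
        else:
            quantifier = "{%d,%d}" % (lo, hi)
        parts.append(cls + quantifier)
    return "".join(parts)


def recall_precision(pattern, positives, negatives):
    """Recall over the input values; precision against known non-members."""
    regex = re.compile(pattern)
    tp = sum(1 for v in positives if regex.fullmatch(v))
    fp = sum(1 for v in negatives if regex.fullmatch(v))
    recall = tp / len(positives) if positives else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

For example, calling `derive_pattern(["555-1234", "555-987", "n/a", "212-0001"])` produces a pattern of three digits, a hyphen, and three to four digits. The outlier `"n/a"` is dropped, so the pattern's recall on that input is three out of four.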
27 Citations
6 Claims
1. A system for matching pattern-based data, comprising:
a pattern construction module for deriving a first pattern from a first input set of values and a second pattern from a second input set of values;
a similarity computation module for computing a similarity of the first pattern and the second pattern; and
a matching module for matching the first input set of values with the second input set of values based on the similarity computation.
Dependent claims: 2, 3, 4, 5.
6. A computer program product having a plurality of executable instruction codes that are stored on a computer-readable medium, for matching pattern-based data, comprising:
a first set of instruction codes for deriving a first pattern from a first input set of values and a second pattern from a second input set of values;
a second set of instruction codes for computing a similarity of the first pattern and the second pattern; and
a third set of instruction codes for matching the first input set of values with the second input set of values based on the similarity computation.
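The three elements recited in the independent claims (pattern derivation, similarity computation, matching) can be pictured as a small pipeline. The Python sketch below is one hypothetical arrangement and not the claimed implementation. The pattern is simplified to a sequence of symbol-class labels, similarity is taken from the standard library's `difflib.SequenceMatcher`, and the names `shape`, `derive_pattern`, `similarity`, and `match_fields`, as well as the `0.8` threshold, are assumptions chosen for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import groupby


def shape(value):
    """Collapse a value into its sequence of symbol classes.

    'D' stands for a run of digits, 'A' for a run of letters; any other
    character stands for a run of itself.
    """
    def symbol_class(ch):
        if ch.isdigit():
            return "D"
        if ch.isalpha():
            return "A"
        return ch
    return tuple(cls for cls, _ in groupby(value, key=symbol_class))


def derive_pattern(values):
    """Pattern construction: the most common shape among the input values."""
    return Counter(shape(v) for v in values).most_common(1)[0][0]


def similarity(pattern_a, pattern_b):
    """Similarity computation: a ratio in [0, 1] over the two class sequences."""
    return SequenceMatcher(None, pattern_a, pattern_b).ratio()


def match_fields(table_a, table_b, threshold=0.8):
    """Matching: pair each field of table_a with its most similar field in
    table_b, keeping only pairs whose similarity reaches the threshold."""
    patterns_b = {name: derive_pattern(vals) for name, vals in table_b.items()}
    matches = {}
    for name_a, vals in table_a.items():
        pattern_a = derive_pattern(vals)
        best = max(patterns_b, key=lambda n: similarity(pattern_a, patterns_b[n]))
        if similarity(pattern_a, patterns_b[best]) >= threshold:
            matches[name_a] = best
    return matches
```

Given two tables expressed as dictionaries that map field names to lists of non-empty values, `match_fields` returns a dictionary from field names in the first table to field names in the second. A field holding values such as `"555-1234"` would pair with a field holding `"617-5550"`, since both reduce to the same digit, hyphen, digit shape.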
Specification