System and method for variant string matching
First Claim
1. A computer implemented method for variant string matching, comprising:
- comparing with a computing device two unidentical strings in a training variant string pair, the two unidentical strings representing the same item from training data in a memory, to determine if they include an identical substring pair and a first unidentical substring pair including a first unidentical substring and a second unidentical substring;
determining if the first unidentical substring pair is in the training data;
entering the first unidentical substring pair into the training data as a first variant string pair if it is not in the training data;
comparing with the computing device the two unidentical strings to determine if they include an interchangeable substring pair and a second unidentical substring pair including a third unidentical substring and a fourth unidentical substring;
determining if the second unidentical substring pair is in the training data; and
entering the second unidentical substring pair into the training data as a second variant string pair if it is not in the training data.
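The first three steps of claim 1 (find the identical substring pair, isolate the first unidentical substring pair, and enter it into the training data only if it is absent) can be illustrated with a minimal sketch. It assumes word-level tokens, treats the identical substrings as the longest common prefix and suffix, and models the training data as an in-memory set; the names `split_unidentical` and `learn_first_pair` are hypothetical and do not come from the patent.

```python
def split_unidentical(a: str, b: str):
    """Strip the longest common word prefix and suffix from two strings.

    Returns (identical_words, pair), where pair is (rest_a, rest_b), or None
    when the strings share nothing or one side has no differing words left.
    """
    ta, tb = a.split(), b.split()
    # Longest common prefix, word by word.
    i = 0
    while i < len(ta) and i < len(tb) and ta[i] == tb[i]:
        i += 1
    # Longest common suffix that does not overlap the prefix.
    j = 0
    while j < len(ta) - i and j < len(tb) - i and ta[-1 - j] == tb[-1 - j]:
        j += 1
    identical = ta[:i] + ta[len(ta) - j:]
    rest_a = " ".join(ta[i:len(ta) - j])
    rest_b = " ".join(tb[i:len(tb) - j])
    if not identical or not rest_a or not rest_b:
        return identical, None
    return identical, (rest_a, rest_b)


def learn_first_pair(a: str, b: str, training_data: set):
    """Enter the unidentical substring pair into training_data if it is new.

    Assumption: pairs are stored as unordered frozensets, so (x, y) and
    (y, x) count as the same variant string pair.
    """
    _, pair = split_unidentical(a, b)
    if pair is None:
        return None
    key = frozenset(pair)
    if key in training_data:
        return None  # already known, nothing to enter
    training_data.add(key)
    return pair
```

With an empty set, calling `learn_first_pair("International Business Machines", "Intl Business Machines", training_data)` yields the pair `("International", "Intl")`; a second call with the same strings returns `None` because the pair is already present.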
Abstract
A method, computer program product, and system for variant string matching. A computer implemented method for variant string matching may comprise comparing with a computing device two unidentical strings in a training variant string pair. The two unidentical strings may represent the same item from training data, which may be stored in a memory. The two unidentical strings may be compared to determine if they include an identical substring pair, and a first unidentical substring pair. The computer implemented method may also determine if the first unidentical substring pair includes a first unidentical substring and a second unidentical substring. The computer implemented method may further determine if the first unidentical substring pair is in the training data. The first unidentical substring pair may be entered into the training data as a first variant string pair if it is not in the training data.
14 Claims
8. A computer program product residing on a non-transitory computer readable medium having a plurality of instructions stored thereon, which, when executed by a processor, cause the processor to perform operations comprising:
- comparing two unidentical strings in a training variant string pair, the two unidentical strings representing the same item from training data in a memory, to determine if they include an identical substring pair and a first unidentical substring pair including a first unidentical substring and a second unidentical substring;
determining if the first unidentical substring pair is in the training data; and
entering the first unidentical substring pair into the training data as a first variant string pair if it is not in the training data;
comparing the two unidentical strings to determine if they include an interchangeable substring pair and a second unidentical substring pair including a third unidentical substring and a fourth unidentical substring;
determining if the second unidentical substring pair is in the training data; and
entering the second unidentical substring pair into the training data as a second variant string pair if it is not in the training data.
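The second comparison in the operations above, which looks for an interchangeable substring pair alongside a second unidentical substring pair, might be sketched as follows. This is one reading under stated assumptions: an "interchangeable" pair is taken to be a single-word pair already recorded as a variant, matching proceeds word by word from both ends, and the training data is an in-memory set of unordered pairs. The function names are hypothetical and not drawn from the patent.

```python
def same_or_interchangeable(x: str, y: str, variants: set) -> bool:
    """True if two words are equal or already recorded as a variant pair."""
    return x == y or frozenset((x, y)) in variants


def learn_second_pair(a: str, b: str, variants: set):
    """Use known variant pairs to expose a further unidentical pair.

    Words at either end are skipped when they are identical or
    interchangeable; what remains in the middle is the candidate pair.
    It is entered into `variants` only if it is not already there.
    """
    ta, tb = a.split(), b.split()
    i = 0
    while (i < len(ta) and i < len(tb)
           and same_or_interchangeable(ta[i], tb[i], variants)):
        i += 1
    j = 0
    while (j < len(ta) - i and j < len(tb) - i
           and same_or_interchangeable(ta[-1 - j], tb[-1 - j], variants)):
        j += 1
    matched = (list(zip(ta[:i], tb[:i]))
               + list(zip(ta[len(ta) - j:], tb[len(tb) - j:])))
    # Require at least one interchangeable (non-identical) matched word,
    # otherwise this is just the identical-substring comparison again.
    used_interchangeable = any(x != y for x, y in matched)
    rest_a = " ".join(ta[i:len(ta) - j])
    rest_b = " ".join(tb[i:len(tb) - j])
    if not used_interchangeable or not rest_a or not rest_b:
        return None
    key = frozenset((rest_a, rest_b))
    if key in variants:
        return None  # already in the training data
    variants.add(key)
    return (rest_a, rest_b)
```

For example, if `variants` already holds the pair `Corp`/`Corporation`, comparing `"Intl Widgets Corp"` with `"International Widgets Corporation"` skips `Widgets` (identical) and `Corp`/`Corporation` (interchangeable) and enters `Intl`/`International` as a new variant pair. Restricting interchangeable pairs to single words keeps the sketch short; multi-word variants would need a real alignment step.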
Specification