Method and system for identifying anchors for fields using optical character recognition data
Abstract
Identifying anchors for fields using optical character recognition data is described. A collection of characters is identified. The collection of characters includes a first set of characters at a first position relative to a first field in a first document and a second set of characters at a second position relative to the first field in the first document. The first set of characters is associated with a first word, and the second set of characters is associated with a second word. An anchor is created based on the collection of characters, wherein the anchor is at a third relative position to the first field in the first document. A second field is identified in a second document by identifying the anchor in the second document.
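The anchor creation described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Word` and `Anchor` types, the choice of the two nearest words, the use of their midpoint as the anchor position, and exact text matching in the second document are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Word:
    """One OCR word with the centre of its bounding box (hypothetical type)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Anchor:
    """Two words plus the offset from their midpoint to the field (hypothetical type)."""
    texts: tuple
    x: float
    y: float
    dx: float
    dy: float


def create_anchor(words, field_xy):
    """Build an anchor from the two OCR words closest to the field.

    Assumption: "closest" means Euclidean distance between centres.
    """
    fx, fy = field_xy
    nearest = sorted(words, key=lambda w: hypot(w.x - fx, w.y - fy))[:2]
    if len(nearest) < 2:
        raise ValueError("need at least two words to form an anchor")
    a, b = nearest
    ax, ay = (a.x + b.x) / 2, (a.y + b.y) / 2
    return Anchor((a.text, b.text), ax, ay, fx - ax, fy - ay)


def locate_field(anchor, words):
    """Find the anchor words in another document and apply the stored offset.

    Returns None when either anchor word is missing from the OCR output.
    """
    by_text = {w.text: w for w in words}
    try:
        a, b = [by_text[t] for t in anchor.texts]
    except KeyError:
        return None
    ax, ay = (a.x + b.x) / 2, (a.y + b.y) / 2
    return (ax + anchor.dx, ay + anchor.dy)


doc1 = [Word("Invoice", 10, 10), Word("Number", 30, 10), Word("Date", 10, 50)]
anchor = create_anchor(doc1, field_xy=(60, 10))

# Second document: same layout, shifted by (5, 10).
doc2 = [Word("Invoice", 15, 20), Word("Number", 35, 20), Word("Date", 15, 60)]
field_in_doc2 = locate_field(anchor, doc2)
```

Storing only a fixed offset handles a translated layout; it does not account for scaling or rotation between scans.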
20 Claims
1. A system for identifying anchors for fields using optical character recognition data, the system comprising:
one or more processors; and
a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to:
identify a first collection of characters comprising a first set of characters at a first position relative to a first field in a first document and a second set of characters at a second position relative to the first field in the first document, wherein the first set of characters is associated with a first word and the second set of characters is associated with a second word;
create a first anchor in the first document based on the first collection of characters, wherein the first anchor is at a third position relative to the first field in the first document, and wherein the first anchor is associated with a second field in the first document;
identify a second collection of characters comprising a third set of characters at a fourth position relative to a third field in a second document and a fourth set of characters at a fifth position relative to the third field in the second document, wherein the third set of characters is associated with a third word and the fourth set of characters is associated with a fourth word;
determine a location of a second anchor in the second document by calculating a vector based on the first, second, third and fourth sets of characters; and
identify a fourth field in the second document that corresponds to the second field in the first document based on the location of the second anchor in the second document.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented method for identifying anchors for fields using optical character recognition data, the method comprising:
identifying a first collection of characters comprising a first set of characters at a first position relative to a first field in a first document and a second set of characters at a second position relative to the first field in the first document, wherein the first set of characters is associated with a first word and the second set of characters is associated with a second word;
combining the first set of characters with the second set of characters to create a first anchor in the first document based on the first collection of characters, wherein the first anchor is at a third position relative to the first field in the first document, and wherein the first anchor is associated with a second field in the first document;
identifying a second collection of characters comprising a third set of characters at a fourth position relative to a third field in a second document and a fourth set of characters at a fifth position relative to the third field in the second document, wherein the third set of characters is associated with a third word and the fourth set of characters is associated with a fourth word;
determining a location of a second anchor in the second document by calculating a vector based on the first, second, third and fourth sets of characters; and
identifying a fourth field in the second document that corresponds to the second field in the first document based on the location of the second anchor in the second document.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, the computer-readable program code adapted to be executed by one or more processors to implement a method for identifying anchors for fields using optical character recognition data, the method comprising:
identifying a first collection of characters comprising a first set of characters at a first position relative to a first field in a first document and a second set of characters at a second position relative to the first field in the first document, wherein the first set of characters is associated with a first word and the second set of characters is associated with a second word;
combining the first set of characters with the second set of characters to create a first anchor in the first document based on the first collection of characters, wherein the first anchor is at a third position relative to the first field in the first document, and wherein the first anchor is associated with a second field in the first document;
identifying a second collection of characters comprising a third set of characters at a fourth position relative to a third field in a second document and a fourth set of characters at a fifth position relative to the third field in the second document, wherein the third set of characters is associated with a third word and the fourth set of characters is associated with a fourth word;
determining a location of a second anchor in the second document by calculating a vector based on the first, second, third and fourth sets of characters; and
identifying a fourth field in the second document that corresponds to the second field in the first document based on the location of the second anchor in the second document.
Dependent claims: 16, 17, 18, 19, 20
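The independent claims each recite determining the second anchor's location "by calculating a vector based on the first, second, third and fourth sets of characters". The claims do not specify the calculation, so the sketch below shows only one possible reading: treat the positions of the first and second words (first document) and the third and fourth words (second document) as two point pairs, derive the similarity transform that maps one pair onto the other, and apply it to the first anchor's position. The function names and the use of complex numbers are assumptions made for illustration.

```python
def similarity_from_pairs(w1, w2, w3, w4):
    """Return (a, b) so that z -> a*z + b maps w1 to w3 and w2 to w4.

    Points are (x, y) tuples treated as complex numbers; `a` encodes the
    scale and rotation between the documents, `b` the translation.
    """
    p1, p2, p3, p4 = (complex(*p) for p in (w1, w2, w3, w4))
    if p1 == p2:
        raise ValueError("the two reference words must be at distinct positions")
    a = (p4 - p3) / (p2 - p1)  # ratio of the word-to-word vectors
    b = p3 - a * p1
    return a, b


def transfer_point(point, a, b):
    """Map a position from the first document into the second document."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# First document: two words and the anchor placed between them.
first_word, second_word = (0.0, 0.0), (10.0, 0.0)
first_anchor = (5.0, 0.0)

# Second document: the matching words, scanned at twice the scale and shifted.
third_word, fourth_word = (5.0, 5.0), (25.0, 5.0)

a, b = similarity_from_pairs(first_word, second_word, third_word, fourth_word)
second_anchor = transfer_point(first_anchor, a, b)
```

Two word positions per document are the minimum needed to fix scale, rotation and translation at once, which is one plausible reason a word pair rather than a single word would be used; the claims themselves do not state this rationale.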
Specification