Optical character recognition based on shape clustering and multiple optical character recognition processes
Abstract
Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process.
39 Claims
1. A system for optical character recognition (OCR), comprising:
a plurality of OCR engines each operable to process an original image of a document and to produce a respective OCR output;
a plurality of post-OCR processing engines each operable to receive an OCR output from a respective OCR engine and operable to produce a respective modified OCR output of the document; and
a vote processing engine operable to select portions from the plurality of modified OCR outputs and to assemble the selected portions into a final OCR output for the document;
wherein each post-OCR processing engine is operable to:
classify clip images defined in a received OCR output for the document into a plurality of clusters of clip images, each cluster comprising clip images of similar image sizes and shapes that are assigned the same one or more particular characters by the corresponding OCR engine; and
generate a cluster image to represent clip images in each cluster;
and wherein the vote processing engine is operable to:
use shape differences between a cluster image of each cluster and cluster images of other clusters to detect whether an error exists in the one or more particular characters assigned to each cluster by the corresponding OCR engine;
correct each detected error in a particular cluster by newly assigning one or more particular characters to the particular cluster; and
use the newly assigned one or more particular characters for the particular cluster to replace respective one or more particular characters previously assigned by the corresponding OCR engine in a corresponding modified OCR output.
Dependent claims: 2, 3, 4, 5, 6, 7
-
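The clustering and correction steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: clip images are modeled as small binary grids, `shape_distance` is a mean absolute pixel difference, the cluster image is a pixel-wise mean, and the rule "adopt the label of a larger cluster whose cluster image is nearly identical" is just one possible way to use shape differences to detect and correct an error.

```python
def shape_distance(a, b):
    """Mean absolute pixel difference between two equal-size images."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def mean_image(images):
    """Pixel-wise mean of equal-size images (the assumed 'cluster image')."""
    n = len(images)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*images)]


def cluster_clips(clips, max_dist=0.2):
    """Group (label, image) clips by assigned label, size, and similar shape."""
    clusters = []
    for label, img in clips:
        size = (len(img), len(img[0]))
        for c in clusters:
            if (c["label"] == label and c["size"] == size
                    and shape_distance(c["image"], img) <= max_dist):
                c["members"].append(img)
                c["image"] = mean_image(c["members"])
                break
        else:
            # No matching cluster: start a new one with this clip.
            clusters.append({"label": label, "size": size,
                             "members": [img], "image": mean_image([img])})
    return clusters


def correct_labels(clusters, max_dist=0.1):
    """Return one label per cluster, adopting the label of a larger,
    near-identical cluster when one exists (hypothetical heuristic)."""
    new_labels = []
    for c in clusters:
        best = c
        for other in clusters:
            if (other is c or other["size"] != c["size"]
                    or other["label"] == c["label"]):
                continue
            if (shape_distance(c["image"], other["image"]) <= max_dist
                    and len(other["members"]) > len(best["members"])):
                best = other
        new_labels.append(best["label"])
    return new_labels


if __name__ == "__main__":
    E = ((1, 1, 1), (1, 1, 0), (1, 1, 1))
    BAR = ((1, 0, 0), (1, 0, 0), (1, 1, 1))
    # One "e"-shaped clip was mislabeled "c" by the (hypothetical) OCR engine.
    clips = [("e", E), ("e", E), ("e", E), ("c", E), ("l", BAR)]
    print(correct_labels(cluster_clips(clips)))  # ['e', 'e', 'l']
```

In this sketch the single-member "c" cluster is relabeled "e" because its cluster image matches the larger "e" cluster, while the "l" cluster is left alone because its shape differs from every other cluster.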
-
8. A method for optical character recognition (OCR), comprising:
using a plurality of OCR engines to process an original image of a document and to produce a plurality of OCR outputs, respectively;
processing each of the OCR outputs separately from processing other OCR outputs to produce a respective modified OCR output of the document, the processing including:
classifying clip images defined in a received OCR output for the document into a plurality of clusters of clip images, each cluster comprising clip images of similar image sizes and shapes that are assigned the same one or more particular characters by the corresponding OCR engine,
generating a cluster image to represent clip images in each cluster,
using shape differences between a cluster image of each cluster and cluster images of other clusters to detect whether an error exists in the one or more particular characters assigned to each cluster by the corresponding OCR engine,
correcting each detected error in a particular cluster by newly assigning one or more particular characters to the particular cluster, and
using the newly assigned one or more particular characters for the particular cluster to replace respective one or more particular characters previously assigned by the corresponding OCR engine in a corresponding modified OCR output; and
selecting portions from the plurality of modified OCR outputs and assembling the selected portions into a final OCR output for the document.
Dependent claims: 9, 10, 11
-
12. A computer program product, encoded on a computer-readable medium, operable to cause data processing apparatus to perform operations comprising:
using a plurality of optical character recognition (OCR) engines to process an original image of a document and to produce a plurality of OCR outputs, respectively;
processing each of the OCR outputs separately from processing other OCR outputs to produce a respective modified OCR output of the document, the processing including:
classifying clip images defined in a received OCR output for the document into a plurality of clusters of clip images, each cluster comprising clip images of similar image sizes and shapes that are assigned the same one or more particular characters by the corresponding OCR engine,
generating a cluster image to represent clip images in each cluster,
using shape differences between a cluster image of each cluster and cluster images of other clusters to detect whether an error exists in the one or more particular characters assigned to each cluster by the corresponding OCR engine,
correcting each detected error in a particular cluster by newly assigning one or more particular characters to the particular cluster, and
using the newly assigned one or more particular characters for the particular cluster to replace respective one or more particular characters previously assigned by the corresponding OCR engine in a corresponding modified OCR output; and
selecting portions from the plurality of modified OCR outputs and assembling the selected portions into a final OCR output for the document.
-
13. A method, comprising:
processing a document image with a first optical character recognition (OCR) engine to generate first OCR output, the first OCR output comprising first bounding boxes identifying first clip images located in the document image and respective one or more characters assigned to each first clip image;
processing the document image with a second OCR engine to generate second OCR output, the second OCR output comprising second bounding boxes identifying second clip images located in the document image and respective one or more characters assigned to each second clip image;
applying shape clustering to the first OCR output to produce first clusters with first clip images and a respective confidence score for each assignment of one or more characters to a first clip image;
applying shape clustering to the second OCR output to produce second clusters with second clip images and a respective confidence score for each assignment of one or more characters to a second clip image; and
generating a final OCR output from the first OCR output and the second OCR output, the final OCR output comprising bounding boxes and using the confidence scores for assignments of the one or more characters to the first clip images and the second clip images to select and assign respective one or more characters to each of the bounding boxes.
Dependent claims: 14, 15, 16, 17
-
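The final step of claim 13, selecting characters for each bounding box from two scored OCR outputs, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented procedure: confidence scores are taken as already computed, boxes are `(x0, y0, x1, y1)` tuples, boxes from the two engines are paired by intersection-over-union with a hypothetical `min_iou` threshold, and the higher-confidence assignment wins. The dictionary keys `box`, `chars`, and `conf` are invented for the example.

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    iw = max(0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union else 0.0


def vote(first, second, min_iou=0.5):
    """Merge two scored OCR outputs, keeping the higher-confidence
    assignment wherever boxes from the two engines overlap enough."""
    final = []
    used = set()
    for a in first:
        best_j, best_overlap = None, min_iou
        for j, b in enumerate(second):
            if j in used:
                continue
            overlap = iou(a["box"], b["box"])
            if overlap >= best_overlap:
                best_j, best_overlap = j, overlap
        if best_j is None:
            final.append(a)  # no counterpart in the second output
        else:
            used.add(best_j)
            b = second[best_j]
            final.append(a if a["conf"] >= b["conf"] else b)
    # Keep boxes that only the second engine reported.
    final.extend(b for j, b in enumerate(second) if j not in used)
    return final


if __name__ == "__main__":
    first = [{"box": (0, 0, 10, 10), "chars": "rn", "conf": 0.4},
             {"box": (10, 0, 20, 10), "chars": "a", "conf": 0.9}]
    second = [{"box": (0, 0, 10, 10), "chars": "m", "conf": 0.8},
              {"box": (10, 0, 19, 10), "chars": "o", "conf": 0.5}]
    print([item["chars"] for item in vote(first, second)])  # ['m', 'a']
```

Pairing by overlap rather than by exact coordinates is a design choice in this sketch, since two engines rarely segment a page into identical boxes.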
-
18. A computer program product, encoded on a computer-readable medium, operable to cause data processing apparatus to perform operations comprising:
processing a document image with a first optical character recognition (OCR) engine to generate first OCR output, the first OCR output comprising first bounding boxes identifying first clip images located in the document image and respective one or more characters assigned to each first clip image;
processing the document image with a second OCR engine to generate second OCR output, the second OCR output comprising second bounding boxes identifying second clip images located in the document image and respective one or more characters assigned to each second clip image;
applying shape clustering to the first OCR output to produce first clusters with first clip images and a respective confidence score for each assignment of one or more characters to a first clip image;
applying shape clustering to the second OCR output to produce second clusters with second clip images and a respective confidence score for each assignment of one or more characters to a second clip image; and
generating a final OCR output from the first OCR output and the second OCR output, the final OCR output comprising bounding boxes and using the confidence scores for assignments of the one or more characters to the first clip images and the second clip images to select and assign respective one or more characters to each of the bounding boxes.
-
19. A system for optical character recognition (OCR), comprising:
a first OCR engine operable to process a document image to generate first OCR output, the first OCR output comprising first bounding boxes identifying first clip images located in the document image and respective one or more characters assigned to each first clip image;
a first post-OCR engine operable to apply shape clustering to the first OCR output to produce first clusters with first clip images and a respective confidence score for each assignment of one or more characters to a first clip image;
a second OCR engine operable to process the document image to generate second OCR output, the second OCR output comprising second bounding boxes identifying second clip images located in the document image and respective one or more characters assigned to each second clip image;
a second post-OCR engine operable to apply shape clustering to the second OCR output to produce second clusters with second clip images and a respective confidence score for each assignment of one or more characters to a second clip image; and
a vote processing engine to receive and process the first OCR output and the second OCR output and to produce a final OCR output from the first and second clusters based on confidence scores.
Dependent claims: 20, 21, 22, 23
-
24. A method, comprising:
processing a document image with a first optical character recognition (OCR) engine to generate first OCR output, the first OCR output comprising first bounding boxes identifying first clip images located in the document image, the first OCR output further comprising a respective one or more characters assigned to each first clip image;
processing the document image with a second OCR engine to generate second OCR output, the second OCR output comprising second bounding boxes identifying second clip images located in the document image, the second OCR output further comprising a respective one or more characters assigned to each second clip image;
classifying the first clip images and the second clip images into clusters, each cluster including only clip images having the same one or more characters assigned to the clip image;
generating a cluster image for each cluster;
using the cluster images to verify or correct the assignment of characters to clip images and determine a confidence score for each assignment of one or more characters to a clip image; and
using the assignments of characters to the cluster images to generate a final OCR output.
Dependent claims: 25, 26, 27
-
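Claim 24 pools the clip images from both engines before clustering and derives a confidence score from the cluster images. The sketch below shows one way this could look, under assumptions that are not in the claim: all clips sharing the same assigned characters form a single cluster, every image in a cluster has the same dimensions, the cluster image is a pixel-wise mean, and confidence is defined (hypothetically) as one minus the mean absolute pixel difference between a clip and its cluster image.

```python
from collections import defaultdict


def pooled_confidences(first_clips, second_clips):
    """Pool (chars, image) clips from two engines, build one cluster per
    assigned character string, and score each clip against its cluster image."""
    groups = defaultdict(list)
    for chars, img in list(first_clips) + list(second_clips):
        groups[chars].append(img)

    result = {}
    for chars, imgs in groups.items():
        n = len(imgs)
        rows, cols = len(imgs[0]), len(imgs[0][0])
        # Cluster image: pixel-wise mean over all pooled clips.
        mean = [[sum(im[r][c] for im in imgs) / n for c in range(cols)]
                for r in range(rows)]
        confidences = []
        for im in imgs:
            dist = sum(abs(im[r][c] - mean[r][c])
                       for r in range(rows) for c in range(cols)) / (rows * cols)
            confidences.append(1.0 - dist)  # assumed scoring rule
        result[chars] = {"cluster_image": mean, "confidences": confidences}
    return result


if __name__ == "__main__":
    O = ((1, 1), (1, 1))
    O_DAMAGED = ((1, 1), (1, 0))
    scores = pooled_confidences([("o", O), ("o", O)],
                                [("o", O), ("o", O_DAMAGED)])
    print(scores["o"]["confidences"])
```

A clip that deviates from its cluster image, such as the damaged one in the usage example, receives a lower score than the clips that match it closely.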
-
28. A computer program product, encoded on a computer-readable medium, operable to cause data processing apparatus to perform operations comprising:
processing a document image with a first optical character recognition (OCR) engine to generate first OCR output, the first OCR output comprising first bounding boxes identifying first clip images located in the document image, the first OCR output further comprising a respective one or more characters assigned to each first clip image;
processing the document image with a second OCR engine to generate second OCR output, the second OCR output comprising second bounding boxes identifying second clip images located in the document image, the second OCR output further comprising a respective one or more characters assigned to each second clip image;
classifying the first clip images and the second clip images into clusters, each cluster including only clip images having the same one or more characters assigned to the clip image;
generating a cluster image for each cluster;
using the cluster images to verify or correct the assignment of characters to clip images and determine a confidence score for each assignment of one or more characters to a clip image; and
using the assignments of characters to the cluster images to generate a final OCR output.
-
29. A system for optical character recognition (OCR), comprising:
a first OCR engine operable to process a document image to generate first OCR output, the first OCR output comprising first bounding boxes identifying first clip images located in the document image, the first OCR output further comprising a respective one or more characters assigned to each first clip image;
a second OCR engine operable to process the document image to generate second OCR output, the second OCR output comprising second bounding boxes identifying second clip images located in the document image, the second OCR output further comprising a respective one or more characters assigned to each second clip image;
a post-OCR engine to receive the first and second OCR outputs and to classify the first clip images and the second clip images into clusters, each cluster including only clip images having the same one or more characters assigned to the clip image and a cluster image representing clip images for each cluster; and
a vote processing engine operable to generate a final OCR output based on assignments of characters to the cluster images from the post-OCR engine.
Dependent claims: 30, 31, 32
-
33. A method, comprising:
processing a document image with a first optical character recognition (OCR) engine to generate first OCR output, the first OCR output comprising bounding boxes identifying clip images located in the document image and a character assignment assigning one or more characters to each clip image;
applying shape clustering to the first OCR output to produce a first modified OCR output, the first modified OCR output comprising a modification of the assignment of characters to clip images, the first modified OCR output further comprising words recognized in the document image;
identifying a suspect word in the first modified OCR output, the suspect word being a word having a character identified as a suspect character; and
processing the suspect word with a second OCR engine to recognize the suspect word.
Dependent claims: 34, 35, 36
-
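The suspect-word flow of claim 33 can be sketched as a simple routing step. The data layout is an assumption for illustration: each word carries a bounding box and a list of characters, each character has a boolean `suspect` flag set by an earlier shape-clustering stage, and `second_engine` is a hypothetical callable that re-recognizes the region of the page given by a word's box.

```python
def reprocess_suspect_words(words, second_engine):
    """Return recognized text per word, sending any word that contains a
    suspect character to the second engine and keeping the rest as-is."""
    recognized = []
    for word in words:
        if any(ch["suspect"] for ch in word["chars"]):
            # Suspect word: ask the second engine to recognize this region.
            recognized.append(second_engine(word["box"]))
        else:
            recognized.append("".join(ch["char"] for ch in word["chars"]))
    return recognized


if __name__ == "__main__":
    words = [
        {"box": (0, 0, 30, 10),
         "chars": [{"char": c, "suspect": False} for c in "the"]},
        {"box": (35, 0, 95, 10),
         "chars": [{"char": "rn", "suspect": True}]
                  + [{"char": c, "suspect": False} for c in "odern"]},
    ]

    def stub_engine(box):
        # Stand-in for a real second OCR engine (assumption for the demo).
        return "modern"

    print(reprocess_suspect_words(words, stub_engine))  # ['the', 'modern']
```

Only the flagged word reaches the second engine, so the cost of a second recognition pass is limited to the regions where the first pass is in doubt.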
-
37. A computer program product, encoded on a computer-readable medium, operable to cause data processing apparatus to perform operations comprising:
processing a document image with a first optical character recognition (OCR) engine to generate first OCR output, the first OCR output comprising bounding boxes identifying clip images located in the document image and a character assignment assigning one or more characters to each clip image;
applying shape clustering to the first OCR output to produce a first modified OCR output, the first modified OCR output comprising a modification of the assignment of characters to clip images, the first modified OCR output further comprising words recognized in the document image;
identifying a suspect word in the first modified OCR output, the suspect word being a word having a character identified as a suspect character; and
processing the suspect word with a second OCR engine to recognize the suspect word.
-
38. A system for optical character recognition (OCR), comprising:
a first OCR engine operable to process a document image to generate first OCR output, the first OCR output comprising bounding boxes identifying clip images located in the document image and a character assignment assigning one or more characters to each clip image;
a first post-OCR engine operable to apply shape clustering to the first OCR output to produce a first modified OCR output, the first modified OCR output comprising a modification of the assignment of characters to clip images, the first modified OCR output further comprising words recognized in the document image, wherein the first post-OCR engine is operable to identify a suspect word in the first modified OCR output, the suspect word being a word having a character identified as a suspect character; and
a second OCR engine operable to receive and process the suspect word to recognize the suspect word.
Dependent claims: 39
-
Specification