Method and system for character recognition
4 Assignments
0 Petitions
Abstract
Character recognition is described. In one embodiment, it may use matched sequences rather than character shape to determine a computer-legible result.
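To illustrate what using matched sequences rather than character shape could look like, here is a minimal sketch. It is an assumption for illustration, not the method of the specification: each text unit carries only an arbitrary label, and a word is looked up by the pattern of where labels repeat. The names `repeat_pattern`, `build_pattern_index`, and the small word list are hypothetical.

```python
from collections import defaultdict


def repeat_pattern(units):
    """Replace each unit by the index of its first occurrence.

    Only the positions of repeats survive, so "hello" and any other
    sequence with the same repeat structure share one pattern.
    """
    first_seen = {}
    pattern = []
    for unit in units:
        if unit not in first_seen:
            first_seen[unit] = len(first_seen)
        pattern.append(first_seen[unit])
    return tuple(pattern)


def build_pattern_index(known_texts):
    """Group known texts by their repeat pattern (hypothetical index)."""
    index = defaultdict(list)
    for text in known_texts:
        index[repeat_pattern(text)].append(text)
    return index


# Hypothetical reference vocabulary.
index = build_pattern_index(["letter", "bottle", "pattern"])

# Arbitrary labels for six captured text units: the shapes are never
# interpreted, only the fact that some labels recur.
glyph_labels = [17, 4, 9, 9, 4, 23]
candidates = index.get(repeat_pattern(glyph_labels), [])
print(candidates)
```

In this sketch the labels 17, 4, 9, 9, 4, 23 share their repeat pattern only with "letter", so the lookup returns that single candidate. Longer sequences would typically narrow the candidates further.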
1177 Citations
21 Claims
1. An article of manufacture comprising a non-transitory computer-readable medium with instructions encoded thereon, the instructions configured to cause one or more processors to perform a method comprising:
obtaining an image based on a document capture process performed on a rendered document;
identifying a portion of the image, the portion comprising a sequence of text units;
segmenting the portion of the image into a sequence of segmented sub-images, each segmented sub-image comprising a single text unit of the sequence of text units;
for each segmented sub-image of the sequence of segmented sub-images:
determining that one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of a stored sub-image; and
based on determining that one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of the stored sub-image, assigning to the segmented sub-image a text unit identity that is associated with the stored sub-image;
generating a representation of the portion of the image, based on the assigned text unit identities; and
identifying the sequence of segmented sub-images, based on the generated representation.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system, comprising:
one or more data processing apparatus; and
a computer-readable storage device including instructions executable by the data processing apparatus and upon such execution cause the data processing apparatus to perform operations comprising:
obtaining an image based on a document capture process performed on a rendered document;
identifying a portion of the image, the portion comprising a sequence of text units;
segmenting the portion of the image into a sequence of segmented sub-images, each segmented sub-image comprising a single text unit of the sequence of text units;
for each segmented sub-image of the sequence of segmented sub-images:
determining that one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of a stored sub-image; and
based on determining that one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of the stored sub-image, assigning to the segmented sub-image a text unit identity that is associated with the stored sub-image;
generating a representation of the portion of the image, based on the assigned text unit identities; and
identifying the sequence of segmented sub-images, based on the generated representation.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-implemented method, comprising:
obtaining an image based on a document capture process performed on a rendered document;
identifying a portion of the image, the portion comprising a sequence of text units;
segmenting the portion of the image into a sequence of segmented sub-images, each segmented sub-image comprising a single text unit of the sequence of text units;
for each segmented sub-image of the sequence of segmented sub-images:
determining that one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of a stored sub-image; and
based on determining that one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of the stored sub-image, assigning to the segmented sub-image a text unit identity that is associated with the stored sub-image;
generating a representation of the portion of the image, based on the assigned text unit identities; and
identifying the sequence of segmented sub-images, based on the generated representation.
Dependent claims: 16, 17, 18, 19, 20, 21
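The per-sub-image steps recited in the independent claims can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: segmentation and feature extraction are taken as already done, each sub-image is reduced to a small numeric feature tuple, "similar" is modeled as Euclidean distance under a threshold, and an unmatched sub-image is stored under a fresh identity. `StoredSubImage`, `assign_identities`, `max_distance`, and the sample feature values are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredSubImage:
    identity: int      # text unit identity associated with this stored sub-image
    features: tuple    # hypothetical feature vector


def assign_identities(segment_features, stored, max_distance=0.2):
    """Assign each segmented sub-image the identity of a similar stored one.

    Assumption: a sub-image with no sufficiently similar stored sub-image
    is added to `stored` under a new identity.
    Returns the sequence of identities, which stands in for the
    "representation of the portion of the image".
    """
    identities = []
    for features in segment_features:
        # Nearest stored sub-image by Euclidean distance, if any exist.
        best = min(
            stored,
            key=lambda s: math.dist(features, s.features),
            default=None,
        )
        if best is None or math.dist(features, best.features) > max_distance:
            best = StoredSubImage(identity=len(stored), features=tuple(features))
            stored.append(best)
        identities.append(best.identity)
    return tuple(identities)


# Four segmented sub-images; the first and third have nearly equal features.
segments = [(0.10, 0.90), (0.80, 0.20), (0.12, 0.88), (0.50, 0.50)]
stored_sub_images = []
representation = assign_identities(segments, stored_sub_images)
print(representation)
```

With these sample values the first and third sub-images receive the same identity and the other two receive their own. How the resulting representation is then used to identify the sequence is not shown here.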
Specification