Method and system for character recognition
Abstract
Character recognition is described. In one embodiment, it may use matched sequences rather than character shape to determine a computer-legible result.
18 Claims
1. A computer-implemented method comprising:
obtaining an image of a sequence of symbols, based on a document capture process performed on a rendered document;
segmenting a portion of the image into multiple segmented sub-images, each of the segmented sub-images corresponding to a single symbol in the sequence of symbols;
for each segmented sub-image of at least a subset of the segmented sub-images, in an order that corresponds to the sequence of symbols:
determining whether one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of any stored sub-image in a cache of stored sub-images;
based on determining that one or more features of the segmented sub-image are not classified as being similar to one or more corresponding features of any stored sub-image, selecting an unassigned symbol identifier, and storing the segmented sub-image in association with the selected symbol identifier in the cache of stored sub-images; and
attributing the selected identifier to any segmented sub-images that follow the segmented sub-image and have one or more features that are classified as being similar to one or more corresponding features of the stored segmented sub-image; and
generating an encoding of the sequence of symbols that is based on the identifiers attributed to the segmented sub-images.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
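
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It assumes that segmentation has already produced one small bitmap per symbol. The `Glyph` type, the pixel-agreement `similarity` function, the 0.85 threshold, and the use of consecutive integers as symbol identifiers are all illustrative assumptions and are not taken from the patent, which does not specify how features are compared.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical representation: one segmented sub-image as rows of '#' / '.' pixels.
Glyph = Tuple[str, ...]


def similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of agreeing pixels; an assumed stand-in for the feature comparison."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0  # bitmaps of different shape are treated as dissimilar
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total if total else 0.0


def encode_symbols(glyphs: Sequence[Glyph], threshold: float = 0.85) -> List[int]:
    """Attribute an identifier to each sub-image, in sequence order."""
    cache: List[Tuple[int, Glyph]] = []  # stored sub-images with their identifiers
    encoding: List[int] = []
    for glyph in glyphs:
        # Look for any stored sub-image classified as similar to this one.
        match: Optional[int] = next(
            (sid for sid, stored in cache if similarity(glyph, stored) >= threshold),
            None,
        )
        if match is None:
            # No similar stored sub-image: select an unassigned identifier
            # and store the sub-image in the cache under it.
            match = len(cache)
            cache.append((match, glyph))
        encoding.append(match)
    return encoding


# Two made-up 3x3 glyph shapes, plus a copy of the first with one pixel flipped.
A: Glyph = ("##.", "#.#", "###")
B: Glyph = (".#.", ".#.", ".#.")
A_NOISY: Glyph = ("##.", "#.#", "##.")

print(encode_symbols([A, B, B, A_NOISY]))  # prints [0, 1, 1, 0]
```

The output is a sequence of arbitrary identifiers rather than recognized characters: the sketch records only which sub-images look alike, which is the form of encoding the claim describes.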

9. A system, comprising:
one or more computing processors; and
a computer-readable storage device including instructions executable by the one or more computing processors and upon such execution cause the one or more computing processors to perform operations comprising:
obtaining an image of a sequence of symbols, based on a document capture process performed on a rendered document;
segmenting a portion of the image into multiple segmented sub-images, each of the segmented sub-images corresponding to a single symbol in the sequence of symbols;
for each segmented sub-image of at least a subset of the segmented sub-images, in an order that corresponds to the sequence of symbols:
determining whether one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of any stored sub-image in a cache of stored sub-images;
based on determining that one or more features of the segmented sub-image are not classified as being similar to one or more corresponding features of any stored sub-image, selecting an unassigned symbol identifier, and storing the segmented sub-image in association with the selected symbol identifier in the cache of stored sub-images; and
attributing the selected identifier to any segmented sub-images that follow the segmented sub-image and have one or more features that are classified as being similar to one or more corresponding features of the stored segmented sub-image; and
generating an encoding of the sequence of symbols that is based on the identifiers attributed to the segmented sub-images.
Dependent claims: 10, 11, 12, 13, 14

15. A memory storage apparatus storing instructions executable by a data processing apparatus and that upon such execution cause the data processing apparatus to perform operations comprising:
obtaining an image of a sequence of symbols, based on a document capture process performed on a rendered document;
segmenting a portion of the image into multiple segmented sub-images, each of the segmented sub-images corresponding to a single symbol in the sequence of symbols;
for each segmented sub-image of at least a subset of the segmented sub-images, in an order that corresponds to the sequence of symbols:
determining whether one or more features of the segmented sub-image are classified as being similar to one or more corresponding features of any stored sub-image in the cache of stored sub-images;
based on determining that one or more features of the segmented sub-image are not classified as being similar to one or more corresponding features of any stored sub-image, selecting an unassigned symbol identifier, and storing the segmented sub-image in association with the selected symbol identifier in the cache of stored sub-images; and
attributing the selected identifier to any segmented sub-images that follow the segmented sub-image and have one or more features that are classified as being similar to one or more corresponding features of the stored segmented sub-image; and
generating an encoding of the sequence of symbols that is based on the identifiers attributed to the segmented sub-images.
Dependent claims: 16, 17, 18

Specification