Optical object recognition method and system
Abstract
An iterative application of an optical object recognition method, such as an OCR method, with enhancements and modifications yields improved speed and accuracy. In the exemplary embodiment, after an initial pass of the OCR method on a document image, unrecognized blobs are grouped into unknown regions to which the OCR method is applied via an analysis window. If the contents of the analysis window remain unrecognized at a starting position in a given unknown region, the window can be moved within the unknown region to provide more opportunities to recognize the unknown region's contents. Recognized characters are recorded, and the portions of the unknown regions in which they appeared are removed. This pass of the OCR method recognizes characters in blobs containing multiple characters. After all unknown regions have been analyzed, remaining unrecognized blobs are regrouped into new unknown regions according to particular relationships, such as spatial or spectral relationships, and predetermined criteria, such as relative distance or frequency distribution. The OCR method is then applied to the new unknown regions. This pass of the OCR method recognizes characters that have been split into multiple blobs.
49 Claims
-
1. An optical object recognition method applying a post-processing method to blobs parsed within a region of interest in an array of data and remaining after an initial pass of a recognition routine, the method including the steps of:
-
(a). dividing the region of interest into unknown regions by;
(i). locating and defining a first end point of a dimension of each unknown region as a beginning of an unrecognized blob including a portion of the array of data from the region of interest analyzed by an optical object recognition routine that yielded no recognized objects;
(ii). locating and defining a second end point of the dimension of each respective unknown region as a point at which unrecognized data stops; and
(iii). defining each unknown region as extending between respective first and second end points;
(b). analyzing each unknown region with a modified optical object recognition method by;
(i). placing an analysis window over a portion of the unknown region;
(ii). performing a correlation between a contents of the analysis window and a reference set of objects to determine whether the analysis window contains a recognizable object;
(iii). removing the contents of the analysis window from the unknown region if the analysis window contains a recognizable object and recording a recognized object;
(iv). moving the analysis window over a new portion of the unknown region if a remaining portion of the unknown region is larger than a predetermined size; and
(v). repeating steps (ii)-(iv) until the remaining portion of the unknown region is smaller than the predetermined size; and
(c). repeating steps (a) and (b) until all unknown regions have been analyzed.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
(i). determining a relationship between two unrecognized blobs;
(ii). determining whether the relationship meets predetermined criteria;
(iii). grouping the two unrecognized blobs into one unknown region if the relationship meets the predetermined criteria; and
(iv). repeating steps (i) through (iii) until substantially all unrecognized blobs that can be grouped have been grouped.
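The grouping steps (i) through (iv) above can be illustrated with a minimal Python sketch. It assumes a purely spatial relationship: each unrecognized blob is reduced to a hypothetical (left, right) horizontal extent, and the predetermined criterion is a hypothetical maximum gap `max_gap` between neighbouring blobs. The claims do not fix either choice, so both are assumptions made for illustration only.

```python
def group_blobs(blobs, max_gap):
    """Group blobs, given as (left, right) extents, into unknown regions.

    Two neighbouring blobs are grouped when the horizontal gap between
    them is at most max_gap (an assumed spatial criterion).
    """
    groups = []
    for left, right in sorted(blobs):
        if groups and left - groups[-1][1] <= max_gap:
            # Relationship meets the criterion: extend the current region.
            prev_left, prev_right = groups[-1]
            groups[-1] = (prev_left, max(prev_right, right))
        else:
            # Otherwise this blob starts a new unknown region.
            groups.append((left, right))
    return groups


regions = group_blobs([(10, 14), (0, 3), (5, 8), (30, 33)], max_gap=2)
print(regions)  # [(0, 14), (30, 33)]
```

Sorting first means a single left-to-right pass is enough to group every blob that can be grouped, which stands in for the repetition in step (iv).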
-
-
4. The method of claim 3 wherein the relationship is a spatial relationship.
-
5. The method of claim 3 wherein the relationship is a spectral relationship.
-
6. The method of claim 1 wherein the analysis window includes an array of pixels from the unknown region.
-
7. The method of claim 6 wherein the array includes at least two dimensions.
-
8. The method of claim 1 wherein the array of data is an image of a document and the objects sought are characters of the document.
-
9. An optical object recognition method including the steps of:
-
(a). performing an initial OOR on a region of interest in an array of data;
(b). defining unrecognized blobs as portions of the region of interest containing substantially contiguous data from the array distinguished by yielding no recognized objects from the initial OOR;
(c). performing a first post-processing method on unrecognized blobs left from the initial OOR to separate objects from unrecognized blobs including more than one object;
(d). recognizing more than one object within at least one of the unrecognized blobs;
(e). performing a second post-processing method on unrecognized blobs left from the initial OOR and the first post-processing method to recognize objects split into multiple unrecognized blobs; and
(f). recognizing at least one single object split between more than one of the unrecognized blobs.
Dependent claims: 10, 11, 12, 13, 14, 15
(i). dividing the region of interest into unknown regions;
(ii). analyzing each unknown region with a modified optical object recognition method; and
(iii). repeating steps (i) and (ii) until substantially all unknown regions have been analyzed.
-
-
11. The method of claim 9 wherein the first and second post-processing methods apply a modified OOR method to analyze the unrecognized blobs, the unrecognized blobs having been organized into unknown regions and the modified OOR method as applied to a given unknown region including the steps of:
-
(i). placing an analysis window over a portion of the unknown region;
(ii). attempting recognition of contents of the analysis window to determine whether the analysis window contains a recognizable object;
(iii). removing the contents of the analysis window from the unknown region if the analysis window contains a recognizable object and recording a recognized object;
(iv). moving the analysis window over a new portion of the unknown region if a remaining portion of the unknown region is larger than a predetermined size; and
(v). repeating steps (ii) through (iv) until the remaining portion of the unknown region is smaller than a predetermined size.
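As a rough illustration of steps (i) through (v), the following Python sketch slides an analysis window along a one-dimensional unknown region. The names `analyze_unknown_region`, `recognize`, `window_size`, and `step` are hypothetical, and the sketch assumes that the "predetermined size" equals the window width and that the "remaining portion" is the part of the region from the current window position to its end. The claim language leaves both points open.

```python
def analyze_unknown_region(region, recognize, window_size, step=1):
    """Slide an analysis window over `region`.

    `recognize` is any callable that takes the window contents and
    returns a label, or None when nothing is recognized.
    Returns (recognized_labels, leftover_region).
    """
    remaining = list(region)
    recognized = []
    pos = 0  # step (i): place the window at the start of the region
    while len(remaining) - pos >= window_size:
        window = remaining[pos:pos + window_size]
        label = recognize(window)  # step (ii): attempt recognition
        if label is not None:
            # step (iii): record the object and remove the window contents
            recognized.append(label)
            del remaining[pos:pos + window_size]
        else:
            # step (iv): move the window over a new portion
            pos += step
    return recognized, remaining


# Toy recognizer over characters standing in for pixel columns.
known = {"ab": "AB", "cd": "CD"}
labels, leftover = analyze_unknown_region(
    "ab#cd", lambda w: known.get("".join(w)), window_size=2
)
print(labels, leftover)  # ['AB', 'CD'] ['#']
```

After a successful recognition the window stays at the same offset, because removing its contents already brings a new portion of the region under it.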
-
-
12. The method of claim 11 wherein the analysis window includes an array of pixels from the unknown region.
-
13. The method of claim 12 wherein the array includes at least two dimensions.
-
14. The method of claim 11 wherein the step of attempting recognition of the contents of the analysis window includes performing a correlation of the current portion by:
-
(A). locating an unknown object in the analysis window;
(B). computing correlation values between the unknown object and each of a set of objects; and
(C). recognizing the unknown object as an object of the set of objects with which the unknown object has a highest correlation value.
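Steps (A) through (C) describe a best-match search over correlation values. The sketch below is one possible reading, not the patented implementation: it assumes the unknown object and each reference object are flattened pixel lists of equal length, uses the Pearson correlation coefficient as the correlation value, and adds a hypothetical `threshold` below which the window is treated as unrecognized. The claim itself only requires picking the highest correlation value.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation between two equal-length pixel sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat image carries no shape information
    return cov / sqrt(var_a * var_b)


def recognize(unknown, reference_set, threshold=0.8):
    """Return (label, score) for the best match, or (None, score)."""
    best_label, best_score = None, -1.0
    for label, template in reference_set.items():
        score = correlation(unknown, template)  # step (B)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:  # assumed rejection rule
        return None, best_score
    return best_label, best_score  # step (C)


# Two 3x3 reference glyphs, flattened row by row.
reference = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}
label, score = recognize([0, 1, 0, 0, 1, 0, 0, 1, 0], reference)
print(label)  # I
```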
-
-
15. The method of claim 14 wherein the set of objects is a trained set of objects.
-
16. An optical object recognition method including the steps of:
-
(a). performing an initial OOR on a region of interest in an array of data;
(b). performing a first post-processing method on unrecognized blobs left from the initial OOR to separate objects from unrecognized blobs including more than one object;
(c). performing a second post-processing method on unrecognized blobs left from the initial OOR and the first post-processing method to recognize objects split into multiple unrecognized blobs;
(d). the first post-processing method including the steps of;
(i). dividing the region of interest into unknown regions;
(ii). analyzing each unknown region with a modified optical object recognition method; and
(iii). repeating steps (i) and (ii) until substantially all unknown regions have been analyzed; and
(e). the step of dividing the region of interest into unknown regions including the steps of;
(A). locating a first point that is a beginning of an unrecognized blob left over from an initial optical object recognition, the blob including a portion of array of data analyzed by the initial optical object recognition routine that yielded no recognized objects;
(B). locating a second point to a right of the beginning of the unrecognized blob that is a point at which unrecognized data ends;
(C). defining an unknown region as extending from the first point to the second point; and
(D). repeating steps (A) through (C) until all unrecognized blobs of the image of the document are included in unknown regions.
Dependent claims: 17
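Steps (A) through (D) of claim 16 amount to finding runs of unrecognized data along one dimension. The Python sketch below assumes a hypothetical input format: a list with one boolean per pixel column, true where the column still holds unrecognized data after the initial pass. It returns each unknown region as a (first point, second point) pair, with the second point taken as the first column to the right at which unrecognized data has ended.

```python
def divide_into_unknown_regions(unrecognized):
    """Return (start, end) column pairs for each run of unrecognized data.

    `end` is exclusive: it is the column at which unrecognized data stops.
    """
    regions = []
    start = None
    for col, flag in enumerate(unrecognized):
        if flag and start is None:
            start = col  # step (A): beginning of an unrecognized blob
        elif not flag and start is not None:
            regions.append((start, col))  # steps (B) and (C)
            start = None
    if start is not None:
        # The run reaches the edge of the region of interest.
        regions.append((start, len(unrecognized)))
    return regions


columns = [False, True, True, False, False, True]
print(divide_into_unknown_regions(columns))  # [(1, 3), (5, 6)]
```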
-
-
18. An optical object recognition method including the steps of:
-
(a). performing an initial OOR on a region of interest in an array of data;
(b). performing a first post-processing method on unrecognized blobs left from the initial OOR to separate objects from unrecognized blobs including more than one object;
(c). performing a second post-processing method on unrecognized blobs left from the initial OOR and the first post-processing method to recognize objects split into multiple unrecognized blobs;
(d). the second post-processing method including the steps of;
(i). determining relationships between unrecognized blobs left over from an initial optical object recognition and the first post-processing method;
(ii). dividing the region of interest into unknown regions by;
(A). defining an unknown region as extending from a first point to a second point and including multiple unrecognized blobs when the relationships between the unrecognized blobs meet predetermined criteria;
(B). the first point being a beginning of an unrecognized blob left over from an initial optical object recognition, the blob containing a portion of an array of data analyzed by the initial optical object recognition routine that yielded no recognized objects; and
(C). the second point being a point at which unrecognized data ends; and
(D). repeating steps (A) through (C) until substantially all unrecognized blobs in the region of interest have been grouped into unknown regions;
(iii). analyzing each unknown region with a modified optical object recognition method; and
(iv). repeating steps (i) through (iii) until substantially all unknown regions have been analyzed.
Dependent claims: 19, 20
-
-
21. An optical object recognition method applying an OOR routine iteratively on a target document image including a region of interest, a first iteration of the OOR method leaving unrecognized blobs including portions of the target document image, the method including the steps of:
-
(a). dividing the unrecognized blobs in the region of interest into unknown regions each occupying contiguous areas among the unrecognized blobs;
(b). selecting one of the unknown regions in the region of interest;
(c). analyzing the unknown region with the OOR routine via an analysis window;
(d). removing a portion of the unknown region bounded by the analysis window if it includes a recognized object and recording any such recognized object in a memory;
(e). determining if the remaining unknown region is still large enough to include an object;
(f). moving the analysis window to a new part of the unknown region if the remaining unknown region is still large enough to include an object;
(g). repeating steps (c) through (f) until the remaining unknown region is not large enough to include an object;
(h). selecting another of the unknown regions in the region of interest for analysis; and
(i). repeating steps (c) through (h) until substantially all of the unknown regions in the region of interest have been analyzed.
Dependent claims: 22, 23, 24, 25, 26, 27, 28
(i). locating a first point at which an unrecognized blob begins;
(ii). locating a second point at which unrecognized data ends; and
(iii). defining each of the unknown regions as extending between the first and second points.
-
-
23. The method of claim 21 including a further step of redividing the unrecognized blobs in the region of interest into new unknown regions that combine at least some of the remaining unknown regions according to predetermined criteria.
-
24. The method of claim 23 wherein the predetermined criteria include spatial characteristics of the blobs.
-
25. The method of claim 23 wherein the predetermined criteria include spectral characteristics of the blobs.
-
26. The method of claim 23 wherein the predetermined criteria include a condition that the two unrecognized blobs must be horizontally adjacent.
-
27. The method of claim 23 wherein the predetermined criteria include a condition that the two unrecognized blobs must be vertically adjacent.
-
28. The method of claim 21 wherein the OOR routine performs a method including the steps of:
-
(i). calculating correlation values between contents of the analysis window and each of a set of objects;
(ii). selecting an object of the set of objects corresponding to a highest of the correlation values as an object recognized in the contents of the analysis window; and
(iii). defining the object as a recognized object.
-
-
29. An optical object recognition (OOR) system including a computer with a processor connected to a memory, the OOR system executing an OOR method including the steps of:
-
(a). acquiring an array of data for analysis;
(b). storing the array in the memory;
(c). parsing a region of interest of the array into blobs to isolate as many individual objects in the image as possible into respective blobs;
(d). performing OOR on the blobs to recognize as many of the objects in their respective blobs as possible;
(e). removing any recognized objects and their respective blobs from the stored region of interest;
(f). recording the recognized objects in the memory;
(g). grouping any remaining blobs, characterized as unrecognized blobs, into unknown regions;
(h). performing OOR on each unknown region;
(i). removing any recognized objects from each unknown region;
(j). recording recognized objects in the memory;
(k). grouping any remaining unrecognized blobs into new unknown regions;
(l). performing OOR on each unknown region;
(m). removing any recognized objects in the memory;
(n). recording recognized objects in the memory;
(o). reordering recognized objects recorded in the memory in an order in which they appeared in the array of data; and
(p). sending the reordered recognized objects to an output.
Dependent claims: 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
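Because objects are recorded in three separate passes (steps (f), (j), and (n)), they reach the memory out of reading order, which is why step (o) reorders them. A minimal Python sketch of that step follows, assuming each recorded object is stored as a hypothetical (row, column, label) tuple giving its position in the array of data. The claim does not specify how positions are stored.

```python
def reorder_recognized(records):
    """Sort (row, column, label) records into reading order and join labels."""
    ordered = sorted(records, key=lambda rec: (rec[0], rec[1]))
    return "".join(label for _, _, label in ordered)


# "c" came from the initial pass; "t" and "a" from the later passes.
records = [(0, 12, "t"), (0, 0, "c"), (0, 6, "a")]
print(reorder_recognized(records))  # cat
```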
(i). locating a beginning of an unrecognized blob;
(ii). defining a beginning of an unknown region as the beginning of the unrecognized blob;
(iii). locating a point at which unrecognized data stops;
(iv). defining an end of the unknown region as the point at which unrecognized data stops; and
(v). repeating steps (i) through (iv) until substantially all unrecognized blobs in the region of interest have been grouped into unknown regions.
-
-
34. The system of claim 33 wherein the step (k) of grouping any remaining blobs into new unknown regions further includes the step of determining according to predetermined criteria whether multiple blobs should be grouped into one unknown region.
-
35. The system of claim 34 wherein the predetermined criteria include spatial characteristics of the blobs.
-
36. The system of claim 34 wherein the predetermined criteria include spectral characteristics of the blobs.
-
37. The system of claim 34 wherein the predetermined criteria include intensity data associated with the blobs.
-
38. The system of claim 33 wherein each unknown region includes a second dimension, the extent of which is determined by the objects they encompass.
-
39. The system of claim 38 wherein the unknown regions are rectangles whose widths are determined by steps (i) through (v) and whose heights are determined by a tallest of the objects each unknown region contains.
-
40. An optical object recognition method applying an OOR routine in a post-processing method to unrecognized blobs remaining from an initial OOR including the steps of:
-
(a). dividing the unrecognized blobs into unknown regions each including substantially contiguous unrecognized data between a beginning of an unrecognized blob and an end of unrecognized data;
(b). positioning an analysis window within each of the unknown regions;
(c). applying the OOR routine to examine contents of the analysis window;
(d). removing recognized objects from each of the unknown regions;
(e). determining if a remaining size of each of the unknown regions is large enough to contain an additional object;
(f). moving the analysis window to a new position within each of the unknown regions that are large enough to contain an additional object; and
(g). repeating steps (c) to (f) until the size of each of the unknown regions is not large enough to contain an additional object.
Dependent claims: 41, 42, 43, 44, 45, 46, 47, 48, 49
redividing the unrecognized blobs into new unknown regions by combining the unknown regions that are not large enough to contain an additional object according to particular relationships between the unrecognized blobs.
-
-
43. The system of claim 42 wherein the particular relationships are spatial relationships.
-
44. The system of claim 42 wherein the particular relationships are spectral relationships.
-
45. The system of claim 42 wherein regrouping is further performed by defining each new unknown region as including substantially contiguous unrecognized data between the beginning of one of the unrecognized blobs and the end of the unrecognized data.
-
46. The system of claim 42 wherein the OOR routine uses statistical correlation between an unknown object and each of a set of objects.
-
47. The system of claim 46 wherein the set of objects is trained from the array of data.
-
48. The system of claim 46 wherein the set of objects is a pretrained set of objects stored in a memory prior to analysis of the array of data.
-
49. The system of claim 46 wherein the set of objects is based on a set of mathematical descriptions of object shapes.
Specification