OCR method and apparatus using image equivalents
Abstract
An OCR 300 stores signals representative of reference characters and scans a document 302 to generate a bit mapped digitized image of the document. After the characters and the words are recognized and candidate characters are identified, the initial results are post-processed to compare clusters of identical images to the candidates. Where the candidates of all equivalent images in a cluster are the same, the candidates are output as representative of the image on the document. Where the candidates are different, a majority of identical candidates determines the recognized candidates. Other post-processing operations include verification and re-recognition.
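The cluster rule described in the abstract can be illustrated with a minimal sketch. It assumes each image in a cluster of equivalent images has already been given a candidate string by some recognizer; the function name `resolve_cluster` and the status labels are hypothetical and do not come from the patent. Treating a tie as "unresolved" (left for verification or re-recognition) is also an assumption.

```python
from collections import Counter


def resolve_cluster(candidates):
    """Pick one recognized string for a cluster of equivalent images.

    candidates: list of candidate strings, one per image in the cluster.
    Returns (text, status), where status is "unanimous", "majority",
    or "unresolved" (hypothetical labels used only in this sketch).
    """
    counts = Counter(candidates)
    text, votes = counts.most_common(1)[0]
    if votes == len(candidates):
        # All equivalent images produced the same candidate.
        return text, "unanimous"
    if votes > len(candidates) / 2:
        # Candidates differ, but a strict majority agrees.
        return text, "majority"
    # No strict majority: assumed to go on to verification or re-recognition.
    return None, "unresolved"
```

For example, a cluster whose candidates are "the", "tbe", and "the" would resolve to "the" by majority, while a two-image cluster with "the" and "tbe" would be left unresolved.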
24 Claims
1. A method for recognizing words in a text on a medium comprising the steps of:
scanning the medium to generate bit mapped images of characters forming the words on the medium,
comparing a bit mapped image of a first number of bit mapped characters in a first word to one or more bit mapped images of characters in other words in order to identify equivalent bit mapped images in said document;
comparing the equivalent bit mapped images of the document to reference characters in order to recognize each bit mapped character as one of said reference characters;
comparing the corresponding recognized reference characters of each equivalent image to each other and selecting the reference characters identified for the equivalent bit mapped characters when the same reference characters are identified for corresponding bit mapped characters in said equivalent images; and
further processing said bit mapped images when different reference characters are recognized for corresponding equivalent bit mapped characters.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
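The steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: each image is a flat tuple of 0/1 pixels, two images count as "equivalent" when their Hamming distance is at most `max_diff`, and `recognize` is a caller-supplied stand-in for comparison against reference characters. The claim compares images of several characters within words; the sketch simplifies this by treating each image as a single unit. All function names are hypothetical.

```python
def hamming(a, b):
    """Count the pixel positions at which two equal-length bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))


def group_equivalent(bitmaps, max_diff=1):
    """Greedily cluster bitmap indices whose images are nearly identical.

    Assumption: equivalence means same size and Hamming distance
    <= max_diff from the first member of an existing cluster.
    """
    clusters = []
    for idx, bmp in enumerate(bitmaps):
        for cluster in clusters:
            rep = bitmaps[cluster[0]]
            if len(rep) == len(bmp) and hamming(rep, bmp) <= max_diff:
                cluster.append(idx)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([idx])
    return clusters


def process(bitmaps, recognize, max_diff=1):
    """Accept results that agree within a cluster; flag the rest.

    Returns (accepted, needs_further): accepted maps image index to
    recognized text, needs_further lists clusters whose members were
    recognized differently and so require further processing.
    """
    accepted, needs_further = {}, []
    for cluster in group_equivalent(bitmaps, max_diff):
        results = {recognize(bitmaps[i]) for i in cluster}
        if len(results) == 1:
            text = results.pop()
            for i in cluster:
                accepted[i] = text
        else:
            needs_further.append(cluster)
    return accepted, needs_further
```

A toy run might use three 4-pixel images, two of which differ by one pixel, with a lookup table standing in for the recognizer. If the recognizer returns different characters for the two near-identical images, that cluster ends up in `needs_further`; if it returns the same character, both indices appear in `accepted`.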
13. A system for recognizing words in a text on a medium comprising:
an optical character recognition machine for scanning the medium and generating bit mapped images of characters forming the words on the medium;
an image comparator for comparing a first bit mapped image of a first number of characters in one word to one or more bit mapped images of characters in other words in order to identify equivalent bit mapped images of said first bit mapped image;
a reference character recognizer for comparing the equivalent bit mapped images to reference characters in order to recognize each bit mapped character as one of said reference characters;
means for comparing the corresponding recognized reference characters of each equivalent image to each other and selecting the reference characters identified for equivalent bit mapped characters when the same reference characters are identified for corresponding bit mapped characters in said equivalent bit mapped images; and
means for further processing said bit mapped images when different reference characters are recognized for corresponding bit mapped characters in said equivalent bit mapped images.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
Specification