Language-independent and segmentation-free optical character recognition system and method
Abstract
A language-independent and segmentation-free OCR system and method comprises a unique feature extraction approach which represents the two-dimensional data relating to OCR as a function of one independent variable (specifically, the position within a line of text in the direction of the line), so that the same continuous speech recognition (CSR) technology based on hidden Markov models (HMMs) can be adapted in a straightforward manner to recognize optical characters. After a line-finding stage, followed by a simple feature-extraction stage, the system can utilize a commercially available CSR system, with little or no modification, to perform both the training of the system and the recognition of text. The whole system, including the feature extraction, training, and recognition components, is designed to be independent of the script or language of the text being recognized. The language-dependent parts of the system are confined to the lexicon and training data. Furthermore, the method of recognition does not require pre-segmentation of the data at the character or word level, either for training or for recognition. In addition, a language model can be used to enhance system performance as an integral part of the recognition process rather than as a post-process, as is commonly done with spell checking, for example.
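The idea of treating a line image as a function of a single variable (horizontal position) can be illustrated with a small sketch. It is not taken from the patent: the names `iter_frames` and `ink_per_frame`, the list-of-rows image representation (1 = black, 0 = white), and the `width` and `step` parameters are assumptions made for illustration only.

```python
from typing import Iterator, List

# Assumed representation: a line image is a list of pixel rows,
# each pixel being 1 (black) or 0 (white).
Image = List[List[int]]


def iter_frames(line: Image, width: int, step: int) -> Iterator[Image]:
    """Yield vertical strips of `width` columns, moving right by `step` columns.

    With step < width, consecutive frames overlap.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    n_cols = len(line[0]) if line else 0
    # The range is empty when the line is narrower than one frame.
    for left in range(0, n_cols - width + 1, step):
        yield [row[left:left + width] for row in line]


def ink_per_frame(line: Image, width: int, step: int) -> List[int]:
    """Reduce the 2-D line image to a 1-D sequence: black pixels per frame."""
    return [sum(sum(row) for row in frame)
            for frame in iter_frames(line, width, step)]


line = [
    [0, 1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1],
]
print(ink_per_frame(line, width=2, step=1))  # [3, 3, 1, 0, 2]
```

Each frame here is summarized by a single number only to keep the sketch short; the claims describe richer feature vectors per frame.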
36 Claims
-
1. An optical character recognition system for recognizing each character of a text, comprising:
-
means for (a) determining the position of each line of characters of the text, (b) scanning each line of text by scanning said text in successive frames of a predetermined height and width independent of the script and language of said text, and (c) generating data representing all of the optically scanned characters of the text as a function of a single independent variable in one direction, said variable being a function of a feature vector representing a plurality of elements, each element being a function of the percentile height below which a prespecified percentage of black pixels lies; means for providing a probabilistic paradigm so as to determine each of said characters of said text from said data.
Dependent claims: 2-28
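One possible reading of the percentile-height feature recited in claim 1 is sketched below. It is an illustration under stated assumptions, not the patented implementation: the function name `percentile_heights`, the default percentages, the normalization by frame height, and the all-zero result for a blank frame are all choices made here.

```python
from typing import List, Sequence

Frame = List[List[int]]  # assumed: rows of pixels, 1 = black, 0 = white


def percentile_heights(frame: Frame,
                       percents: Sequence[float] = (10, 50, 90)) -> List[float]:
    """For each p, return the height (as a fraction of the frame height,
    measured from the bottom) at or below which at least p% of the
    frame's black pixels lie."""
    if any(p <= 0 or p > 100 for p in percents):
        raise ValueError("percentages must be in (0, 100]")
    height = len(frame)
    # Black-pixel count of each row, bottom row first.
    row_ink = [sum(row) for row in reversed(frame)]
    total = sum(row_ink)
    if total == 0:
        # Assumption: a blank frame maps to an all-zero feature vector.
        return [0.0] * len(percents)
    features = []
    for p in percents:
        target = total * p / 100.0
        cumulative = 0
        for h, ink in enumerate(row_ink, start=1):
            cumulative += ink
            if cumulative >= target:
                features.append(h / height)
                break
    return features


frame = [
    [0, 0],  # top row
    [1, 1],
    [0, 1],
    [1, 0],  # bottom row
]
print(percentile_heights(frame, (25, 50, 100)))  # [0.25, 0.5, 0.75]
```

Normalizing by the frame height keeps the feature values comparable across font sizes, which is one way to stay independent of script and language.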
-
-
29. A method of recognizing text regardless of the script or language of the text and without segmenting the text by character or word, said method comprising:
-
scanning successive frames of a predetermined height and width of each line of the text in the direction of each line so as to generate data relating to all of the characters of the text independent of the script or language of the text, wherein the data is a function of a single independent variable in one direction, said variable being a function of a feature vector representing a plurality of elements, each element being a function of the percentile height below which a prespecified percentage of black pixels lies; and determining from the data each character of text based upon a predetermined probabilistic paradigm.
Dependent claims: 30, 31
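The abstract names HMMs as the probabilistic paradigm. A minimal Viterbi decoder illustrates how a frame sequence can be labeled without segmenting it first. The sketch assumes a toy discrete HMM: the state names, the symbols "lo" and "hi", and all probabilities are invented for illustration, and a real recognizer would typically score continuous feature vectors instead.

```python
import math
from typing import Dict, List, Sequence


def viterbi(obs: Sequence[str],
            states: Sequence[str],
            log_start: Dict[str, float],
            log_trans: Dict[str, Dict[str, float]],
            log_emit: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the most likely state sequence for `obs` (log-domain Viterbi)."""
    if not obs:
        raise ValueError("need at least one observation")
    # best[s]: log-probability of the best path that ends in state s.
    best = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backpointers = []
    for o in obs[1:]:
        prev, best, ptr = best, {}, {}
        for s in states:
            p = max(states, key=lambda q: prev[q] + log_trans[q][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
            ptr[s] = p
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical two-state model; every number below is made up.
states = ["a", "b"]
log_start = {"a": math.log(0.5), "b": math.log(0.5)}
log_trans = {"a": {"a": math.log(0.9), "b": math.log(0.1)},
             "b": {"a": math.log(0.1), "b": math.log(0.9)}}
log_emit = {"a": {"lo": math.log(0.9), "hi": math.log(0.1)},
            "b": {"lo": math.log(0.1), "hi": math.log(0.9)}}
print(viterbi(["lo", "lo", "hi", "hi"], states, log_start, log_trans, log_emit))
# ['a', 'a', 'b', 'b']
```

The boundary between "a" and "b" falls out of the decoding itself, which is the sense in which no pre-segmentation is required.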
-
-
32. An optical character recognition system for recognizing each character of a text, comprising:
-
means for (a) determining the position of each line of characters of the text and (b) scanning each line of text; data generating means for generating data representing all of the optically scanned characters of the text as a function of a single independent variable in one direction, said data generating means comprising scanning means for scanning said text (a) line by line in the same direction as said line and (b) in successive frames of a predetermined height and width independent of the script and language of said text, but a function of the font size of the characters; means for dividing each of said frames into an array of pixels; means for dividing said frame into a plurality of cells so that said cells are aligned along an axis perpendicular to the direction of scanning said text; means for determining the percentage of said pixels below a predetermined threshold within each of said cells relative to the total number of pixels below a predetermined threshold within said frame; and means for providing a probabilistic paradigm so as to determine each of said characters of said text from said data; wherein said data generating means includes means for generating the data as a function of the intensity in each of said pixels in each of said frames.
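The cell-based feature of claim 32 can be sketched as follows, assuming a grayscale frame given as rows of intensities and cells formed by stacking equal-height horizontal bands. The name `cell_ink_fractions`, the default threshold of 128, and the requirement that the frame height divide evenly are assumptions made here; the result is returned as a fraction rather than a percentage.

```python
from typing import List

GrayFrame = List[List[int]]  # assumed: rows of intensities, 0 = black, 255 = white


def cell_ink_fractions(frame: GrayFrame, n_cells: int,
                       threshold: int = 128) -> List[float]:
    """Split the frame into `n_cells` vertically stacked cells and return,
    for each cell, its share of the frame's below-threshold (dark) pixels."""
    height = len(frame)
    if n_cells <= 0 or height % n_cells != 0:
        raise ValueError("frame height must be a positive multiple of n_cells")
    rows_per_cell = height // n_cells
    dark_per_row = [sum(1 for v in row if v < threshold) for row in frame]
    total = sum(dark_per_row)
    if total == 0:
        # Assumption: a frame with no dark pixels yields all zeros.
        return [0.0] * n_cells
    return [
        sum(dark_per_row[i * rows_per_cell:(i + 1) * rows_per_cell]) / total
        for i in range(n_cells)
    ]


frame = [
    [0, 255],
    [0, 0],
    [255, 255],
    [200, 10],
]
print(cell_ink_fractions(frame, n_cells=2))  # [0.75, 0.25]
```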
-
-
33. An optical character recognition system for recognizing each character of a text, comprising:
-
means for (a) determining the position of each line of characters of the text and (b) scanning each line of text; data generating means for generating data representing all of the optically scanned characters of the text as a function of a single independent variable in one direction, said data generating means comprising scanning means for scanning said text (a) line by line in the same direction as said line, (b) in successive frames of a predetermined height and width independent of the script and language of said text, but a function of the font size of the characters, (c) so that the frames overlap and (d) each of said frames is divided into a plurality of cells; and means for providing a probabilistic paradigm so as to determine each of said characters of said text from said data, wherein said means for generating data representing optically scanned text as a function of a single independent variable also includes means for generating the data as a function of the derivative across the dimensions of a feature vector.
Dependent claims: 34
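A derivative "across the dimensions of a feature vector" can be approximated, under one reading, by first differences between adjacent elements of a single frame's vector (for example, adjacent cells). The function name and the use of a plain first difference are assumptions made for this sketch.

```python
from typing import List, Sequence


def derivative_across_dimensions(features: Sequence[float]) -> List[float]:
    """First difference between neighboring elements of one feature vector."""
    return [b - a for a, b in zip(features, features[1:])]


print(derivative_across_dimensions([0.0, 0.5, 0.25]))  # [0.5, -0.25]
```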
-
-
35. An optical character recognition system for recognizing each character of a text, comprising:
-
means for (a) determining the position of each line of characters of the text and (b) scanning each line of text; data generating means for generating data representing all of the optically scanned characters of the text as a function of a single independent variable in one direction, said data generating means comprising scanning means for scanning said text (a) line by line in the same direction as said line, (b) in successive frames of a predetermined height and width independent of the script and language of said text, but a function of the font size of the characters, and (c) so that the frames overlap; and means for providing a probabilistic paradigm so as to determine each of said characters of said text from said data, wherein said data generating means includes means for generating the data as a function of the derivative of each feature across adjacent frames.
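The derivative of each feature across adjacent frames in claim 35 can be sketched as a per-element difference between consecutive feature vectors. The name `derivative_across_frames`, the backward difference, and the zero vector used for the first frame are illustrative assumptions, not details from the patent.

```python
from typing import List, Sequence


def derivative_across_frames(
        vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """For each frame t, return vectors[t] - vectors[t - 1] element-wise.

    Assumption: the first frame has no predecessor, so its derivative is zero.
    """
    if not vectors:
        return []
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must have the same length")
    deltas = [[0.0] * dim]
    for prev, cur in zip(vectors, vectors[1:]):
        deltas.append([c - p for p, c in zip(prev, cur)])
    return deltas


frames = [[0.0, 1.0], [0.5, 1.0], [0.25, 0.0]]
print(derivative_across_frames(frames))
# [[0.0, 0.0], [0.5, 0.0], [-0.25, -1.0]]
```

Appending these deltas to each frame's own features would give the recognizer some context about how the image changes along the line.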
-
-
36. An optical character recognition system for recognizing each character of a text, comprising:
-
means for (a) determining the position of each line of characters of the text and (b) scanning each line of text; data generating means for generating data representing all of the optically scanned characters of the text as a function of a single independent variable in one direction, said data generating means comprising scanning means for scanning said text (a) line by line in the same direction as said line, (b) in successive frames of a predetermined height and width independent of the script and language of said text, but a function of the font size of the characters, (c) so that the frames overlap, and (d) so that each frame is divided into a plurality of cells so that said cells are aligned along an axis perpendicular to the direction of scanning said text; and means for providing a probabilistic paradigm so as to determine each of said characters of said text from said data; wherein said data generating means includes means for generating the data as a function of the local slope and correlation across a window comprising a plurality of said cells.
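Claim 36 does not spell out how the local slope and correlation are computed. One plausible interpretation, offered only as a sketch, treats each cell in a small window as a point weighted by its ink content and computes the weighted least-squares slope and the weighted Pearson correlation of the cell coordinates. The function name, the window layout (rows by columns of cell weights), and the zero fallbacks for degenerate windows are assumptions.

```python
import math
from typing import List, Tuple


def local_slope_and_correlation(window: List[List[float]]) -> Tuple[float, float]:
    """window[r][c] is the ink weight of the cell at row r (top = 0), column c.

    Returns (slope, correlation) of the ink-weighted cell coordinates,
    with y increasing upward. Degenerate windows yield 0.0.
    """
    total = sum(sum(row) for row in window)
    if total == 0:
        return 0.0, 0.0
    n_rows = len(window)
    # (x, y, weight) for every cell; flip rows so y grows upward.
    pts = [(c, n_rows - 1 - r, w)
           for r, row in enumerate(window)
           for c, w in enumerate(row)]
    mx = sum(x * w for x, _, w in pts) / total
    my = sum(y * w for _, y, w in pts) / total
    sxx = sum(w * (x - mx) ** 2 for x, _, w in pts) / total
    syy = sum(w * (y - my) ** 2 for _, y, w in pts) / total
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in pts) / total
    slope = sxy / sxx if sxx > 0 else 0.0
    corr = sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0
    return slope, corr


# Ink running from bottom-left to top-right: a rising diagonal stroke.
print(local_slope_and_correlation([[0, 1],
                                   [1, 0]]))  # (1.0, 1.0)
```

A falling diagonal gives negative values, and a uniformly filled window gives zeros, so the pair roughly encodes the local stroke angle and how line-like the ink is.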
-
Specification