Method and apparatus for text detection
Abstract
A text detection technique comprises local ramp detection, identification of intensity troughs (candidate text strokes), determination of stroke width, preliminary detection of text based on contrast and stroke width, and a consistency check.
22 Claims
1. A method having scanned intensity information as input for detecting text in a scanned page by observing a very strong contrast in a localized region between a dark side and a light side, the method comprising:
detecting a local ramp;
identifying an intensity trough using a finite state machine algorithm, the algorithm having a sweeping procedure that sweeps the scanned page from left to right for detecting vertical troughs, and from top to bottom for detecting horizontal troughs;
determining a stroke width; and
contrast-based text detection processing;
wherein the localized region comprises a substantially sharp edge between the dark side and the light side; and
whereby any of black text on white background, black text on color background, and white or light text on a dark background are detected.
Dependent claims: 2, 3, 4, 5, 6, 7

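The trough-identification step of claim 1 can be illustrated with a small sketch. This is only one possible reading of the claim, not the patented algorithm: the two-state machine, the state names, the `ramp` threshold of 40 intensity levels, and the function names `find_troughs` and `sweep` are all assumptions made for illustration. It treats a sharp drop between adjacent pixels as a falling ramp and a sharp rise as the matching rising ramp, and it handles dark strokes on a lighter background only.

```python
def find_troughs(line, ramp=40):
    """Return (start, end) index pairs of intensity troughs on one scan line.

    Hypothetical two-state machine: FLAT until a falling ramp is seen,
    then IN_TROUGH until a rising ramp closes the trough.
    """
    troughs = []
    state = "FLAT"
    start = None
    for i in range(1, len(line)):
        step = line[i] - line[i - 1]
        if state == "FLAT":
            if step <= -ramp:          # falling ramp: light -> dark
                state, start = "IN_TROUGH", i
        else:  # IN_TROUGH
            if step <= -ramp:          # drops further: restart at the deeper level
                start = i
            elif step >= ramp:         # rising ramp: dark -> light
                troughs.append((start, i - 1))
                state = "FLAT"
    return troughs


def sweep(page, ramp=40):
    """Sweep rows left to right and columns top to bottom.

    Returns (vertical, horizontal) lists of (line_index, start, end).
    """
    vertical = [(r, s, e)
                for r, row in enumerate(page)
                for s, e in find_troughs(row, ramp)]
    columns = [list(col) for col in zip(*page)]
    horizontal = [(c, s, e)
                  for c, col in enumerate(columns)
                  for s, e in find_troughs(col, ramp)]
    return vertical, horizontal
```

Light text on a dark background could be handled under the same assumptions by running the sweep on an inverted copy of the page (for 8-bit data, `255 - value`).
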
8. A method having scanned intensity information as input for detecting text in a scanned page by observing a very strong contrast in a localized region between a dark side and a light side, the method comprising:
detecting a local ramp;
identifying an intensity trough;
determining a stroke width by:
  determining a width and a skeleton, wherein the width is a distance value and the skeleton is a skeletal line; and
  detecting closely touching text strokes; and
contrast-based text detection processing;
wherein the localized region comprises a substantially sharp edge between the dark side and the light side; and
whereby any of black text on white background, black text on color background, and white or light text on a dark background are detected.
Dependent claims: 9, 10

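As a rough illustration of the width and skeleton determination in claim 8, the sketch below works on a single scan line. It assumes a per-pixel trough mask is already available, takes the width to be the length of each run of trough pixels, and takes the skeleton to be the midpoint of each run. These choices and the names `stroke_runs` and `width_and_skeleton` are assumptions; the claim does not define how the distance value is computed, and the detection of closely touching strokes is not covered here.

```python
def stroke_runs(mask):
    """Yield (start, end) for each maximal run of truthy values in mask."""
    start = None
    for i, value in enumerate(mask):
        if value and start is None:
            start = i
        elif not value and start is not None:
            yield (start, i - 1)
            start = None
    if start is not None:
        yield (start, len(mask) - 1)


def width_and_skeleton(mask):
    """Return (widths, skeleton) for one scan line.

    widths[i] is the run length of the stroke covering pixel i (0 if none);
    skeleton lists the midpoint index of each run.
    """
    widths = [0] * len(mask)
    skeleton = []
    for start, end in stroke_runs(mask):
        width = end - start + 1
        for i in range(start, end + 1):
            widths[i] = width
        skeleton.append((start + end) // 2)
    return widths, skeleton
```
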
11. A method having scanned intensity information as input for detecting text in a scanned page by observing a very strong contrast in a localized region between a dark side and a light side, the method comprising:
pre-processing for stroke width determination; and
contrast-based text detection processing by:
  preliminarily detecting text based on local contrast and stroke width;
  deciding whether a current pixel is a text pixel by using the local contrast present in an N×N window having a center over a set of pixels and centered at the current pixel, and stroke width at the current pixel; and
  consistency checking;
wherein the localized region comprises a substantially sharp edge between the dark side and the light side, and
whereby any of black text on white background, black text on color background, and white or light text on a dark background are detected.
Dependent claims: 12, 13, 14, 15, 16, 17, 18

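The per-pixel decision in claim 11 combines the local contrast in an N×N window with the stroke width at the current pixel. A minimal sketch follows, assuming that local contrast means the difference between the largest and smallest intensity in the window, that the window is clipped at page borders, and that a pixel qualifies when its stroke width is non-zero and no larger than a limit. The thresholds `min_contrast=100` and `max_width=8` and the function names are hypothetical and do not come from the patent.

```python
def local_contrast(page, r, c, n=5):
    """Max minus min intensity in the n x n window centered at (r, c)."""
    half = n // 2
    rows = range(max(0, r - half), min(len(page), r + half + 1))
    cols = range(max(0, c - half), min(len(page[0]), c + half + 1))
    window = [page[i][j] for i in rows for j in cols]
    return max(window) - min(window)


def is_text_pixel(page, widths, r, c, n=5, min_contrast=100, max_width=8):
    """Preliminary text decision from local contrast and stroke width.

    widths is a 2-D grid of stroke widths (0 where no stroke was found).
    """
    width = widths[r][c]
    thin_stroke = 0 < width <= max_width
    return thin_stroke and local_contrast(page, r, c, n) >= min_contrast
```
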
19. A method having scanned intensity information as input for detecting text in a scanned page by observing a very strong contrast in a localized region between a dark side and a light side, the method comprising:
pre-processing for stroke width determination; and
contrast-based text detection processing by:
  detecting text preliminarily based on local contrast and stroke width; and
  consistency checking by:
    accumulating a set of statistics using an N×N window of text tags and a set of thresholds; and
    deciding by using the set of statistics if each of the text tags is any of:
      Text Outline;
      Text Body;
      Background; and
      Non-text;
wherein the localized region comprises a substantially sharp edge between the dark side and the light side; and
whereby any of black text on white background, black text on color background, and white or light text on a dark background are detected.
Dependent claims: 20, 21, 22

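The consistency check in claim 19 accumulates statistics over an N×N window of text tags and compares them with thresholds. The claim does not state which statistics or thresholds are used, so the sketch below substitutes a simple stand-in rule: count the tags in the window, and relabel a text-tagged pixel as Non-text when too few text-tagged pixels surround it. The tag strings, the `min_text=3` threshold, and the function names are all assumptions.

```python
from collections import Counter

# The four tag classes named in the claim; the string values are arbitrary.
TEXT_OUTLINE, TEXT_BODY, BACKGROUND, NON_TEXT = (
    "outline", "body", "background", "non-text")


def window_stats(tags, r, c, n=3):
    """Count each tag in the n x n window centered at (r, c)."""
    half = n // 2
    rows = range(max(0, r - half), min(len(tags), r + half + 1))
    cols = range(max(0, c - half), min(len(tags[0]), c + half + 1))
    return Counter(tags[i][j] for i in rows for j in cols)


def check_consistency(tags, n=3, min_text=3):
    """Return a new tag grid with weakly supported text tags demoted.

    Hypothetical rule: a pixel tagged as text keeps its tag only if the
    window holds at least min_text text-tagged pixels (itself included).
    """
    result = []
    for r, row in enumerate(tags):
        new_row = []
        for c, tag in enumerate(row):
            stats = window_stats(tags, r, c, n)
            text_count = stats[TEXT_OUTLINE] + stats[TEXT_BODY]
            if tag in (TEXT_OUTLINE, TEXT_BODY) and text_count < min_text:
                tag = NON_TEXT
            new_row.append(tag)
        result.append(new_row)
    return result
```

Because windows are clipped at the page border in this sketch, edge pixels see fewer neighbours and are demoted more readily; a fuller treatment would need a border policy, which the claim does not specify.
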
Specification