Methods and systems for text detection in mixed-context documents using local geometric signatures
Abstract
Embodiments of the present invention relate to methods and systems for detection and delineation of text characters in images which may contain combinations of text and graphical content. Embodiments of the present invention employ intensity contrast edge detection methods and intensity gradient direction determination methods in conjunction with analyses of intensity curve geometry to determine the presence of text and verify text edge identification. These methods may be used to identify text in mixed-content images, to determine text character edges and to achieve other image processing purposes.
7 Claims
1. A method for detecting text in a mixed-content image comprising:
processing said image to identify edge pixels associated with significant intensity changes;
processing said image to identify an intensity gradient direction for each of said edge pixels;
processing said image to identify one of a ridge or a valley pixel having coincident curvature wherein a maximum curvature of an intensity map, centered on a subject pixel occurs at the same location as a minimum curvature of said intensity map;
when said coincident curvature position exists, identifying said subject pixel as one of a ridge or a valley pixel;
measuring the proximity of said one of a ridge or a valley pixel to said edge pixel; and
identifying said edge pixel as a text edge pixel when said proximity conforms to specified proximity criteria.
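The first two steps of claim 1 (edge pixels and their gradient directions) can be illustrated with a minimal sketch. The claims do not name a particular edge operator, so the central-difference gradient, the helper names `gradient` and `edge_pixels`, and the `min_magnitude` threshold below are all assumptions made for illustration only.

```python
import math

def gradient(img, y, x):
    """Central-difference intensity gradient (gx, gy) at an interior pixel.
    This operator is an assumption; the claims do not specify one."""
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return gx, gy

def edge_pixels(img, min_magnitude):
    """Map (y, x) -> gradient direction in radians for every interior pixel
    whose gradient magnitude is at least min_magnitude (hypothetical threshold)."""
    edges = {}
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            gx, gy = gradient(img, y, x)
            if math.hypot(gx, gy) >= min_magnitude:
                edges[(y, x)] = math.atan2(gy, gx)
    return edges

# Toy image: a dark vertical stroke (columns 3-4) on a bright background.
image = [[200, 200, 200, 20, 20, 200, 200, 200] for _ in range(5)]
edges = edge_pixels(image, min_magnitude=50.0)
```

In this toy image the gradient points away from the dark stroke on both sides: the direction is pi on the left flank and 0 on the right flank.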
2. A method for detecting text in a mixed-content image, said method comprising:
identifying an edge associated with a high-contrast intensity change;
identifying an intensity gradient direction for said edge;
identifying a character stroke axis, wherein said axis is an element in the group consisting of a stroke valley or a stroke ridge;
wherein said identifying comprises an analysis of image components until the change in curvature of the intensity curve between two successive image components in a direction substantially parallel to the intensity gradient direction reaches a maximum absolute value at the same position that the change in curvature of the intensity curve in a direction substantially perpendicular to the intensity gradient direction is near zero;
wherein said curvature of the intensity curve is calculated by solving for the eigenvalues of a Hessian matrix;
measuring a distance, in the intensity gradient direction, between said axis and said edge; and
identifying said edge as a text edge when said distance is less than a threshold value.
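Claim 2 recites computing curvature from the eigenvalues of a Hessian matrix. The following is a minimal sketch of that calculation under stated assumptions: the second derivatives are approximated with simple finite differences, and the names `hessian` and `principal_curvatures` are hypothetical. It is not presented as the patented implementation.

```python
import math

def hessian(img, y, x):
    """Finite-difference second derivatives (Ixx, Ixy, Iyy) at an interior pixel.
    The discretization is an assumption, not taken from the claims."""
    ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return ixx, ixy, iyy

def principal_curvatures(ixx, ixy, iyy):
    """Eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]],
    returned as (k1, k2) with |k1| >= |k2|."""
    mean = (ixx + iyy) / 2.0
    dev = math.hypot((ixx - iyy) / 2.0, ixy)
    lo, hi = mean - dev, mean + dev
    return (hi, lo) if abs(hi) >= abs(lo) else (lo, hi)

# Toy image: a dark vertical stroke with a smooth profile, centered on column 3.
image = [[200, 200, 120, 40, 120, 200, 200] for _ in range(5)]
k1, k2 = principal_curvatures(*hessian(image, 2, 3))
# k1 = 160.0 (strong curvature across the stroke), k2 = 0.0 (flat along it)
```

At the stroke center the large eigenvalue and the near-zero eigenvalue occur at the same pixel, which is the coincidence the claim describes.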
3. A method for detecting text in a mixed-content image, said method comprising:
identifying an edge associated with a high-contrast intensity change;
identifying an intensity gradient direction for said change;
identifying a character stroke axis, wherein said axis is selected from the group consisting of a stroke valley and a stroke ridge, comprising the acts of:
(1) analyzing successive pixels to identify a coincident curvature position wherein a substantial curvature of an intensity map occurs at the same location as a minimal curvature of said intensity map; and
(2) measuring a substantially transverse distance between said axis and said edge; and
identifying said edge as a text edge when said substantially transverse distance is less than a threshold value.
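The distance test in claim 3 can be sketched as a short walk from an edge pixel along its gradient direction until a stroke-axis pixel is reached. The helper names, the unit step size, and the choice to search in both senses of the gradient direction (so that dark-on-light and light-on-dark strokes are both covered) are assumptions for illustration and are not stated in the claims.

```python
import math

def distance_to_axis(edge, direction, axis_pixels, max_steps):
    """Walk from edge (y, x) in unit steps along +direction and -direction
    (radians). Return the step count to the first axis pixel, or None if no
    axis pixel is found within max_steps."""
    y0, x0 = edge
    dy, dx = math.sin(direction), math.cos(direction)
    for step in range(1, max_steps + 1):
        for sign in (1, -1):
            p = (round(y0 + sign * step * dy), round(x0 + sign * step * dx))
            if p in axis_pixels:
                return step
    return None

def is_text_edge(edge, direction, axis_pixels, threshold):
    """An edge counts as a text edge when an axis lies closer than threshold."""
    d = distance_to_axis(edge, direction, axis_pixels, max_steps=threshold)
    return d is not None and d < threshold

# Hypothetical data: one stroke-axis pixel at (2, 3); an edge pixel at (2, 1)
# whose gradient direction is pi (pointing away from the dark stroke).
axis_pixels = {(2, 3)}
near = is_text_edge((2, 1), math.pi, axis_pixels, threshold=3)   # True
far = is_text_edge((2, 9), math.pi, axis_pixels, threshold=3)    # False
```

The threshold would in practice relate to the expected stroke width; the value 3 here is arbitrary.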
4. A method for detecting text in a mixed-content image comprising:
processing said image to identify edge components associated with significant intensity changes;
processing said image to identify an intensity gradient direction for each of said edge components;
processing said image to identify character stroke axes, wherein said stroke axes are one of a stroke valley or a stroke ridge, comprising the step of analyzing successive pixels to identify a coincident curvature position wherein a maximum curvature of an intensity map, said maximum curvature being greater than a threshold value, occurs at the same location as a minimal curvature of said intensity map, said minimal curvature being lower than a specified value;
measuring the proximity of said axes to said edge component; and
identifying said edge component as a text edge component when said proximity conforms to specified proximity criteria.
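The two-threshold test in claim 4 can be expressed as a small classifier over a pixel's two curvature values. This is a sketch under assumptions: the function name `classify_axis`, the parameter names `strong` and `weak`, and the sign convention (positive curvature marks an intensity minimum, hence a valley) are illustrative choices and are not taken from the claims.

```python
def classify_axis(k_max, k_min, strong, weak):
    """Classify a pixel from its two curvature values.

    k_max: curvature of larger magnitude; k_min: curvature of smaller magnitude.
    strong: the claim's "threshold value" for the maximum curvature (assumed name).
    weak: the claim's "specified value" for the minimal curvature (assumed name).
    Returns "valley", "ridge", or None when the coincidence test fails.
    """
    if abs(k_max) <= strong or abs(k_min) >= weak:
        return None
    # Assumed convention: positive second derivative = intensity minimum.
    return "valley" if k_max > 0 else "ridge"

print(classify_axis(160.0, 0.0, strong=50.0, weak=10.0))    # valley
print(classify_axis(-160.0, 5.0, strong=50.0, weak=10.0))   # ridge
print(classify_axis(160.0, 120.0, strong=50.0, weak=10.0))  # None
```

The third call fails because both curvatures are large, which is more typical of a blob or corner than of an elongated character stroke.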
5. A computer readable medium for detecting text in a mixed-content image, said method comprising the acts of:
identifying an image edge component of an edge associated with a high-contrast intensity change in an image;
identifying an intensity gradient direction for said edge component;
identifying a geometric intensity curvature feature consisting of a ridge or a valley, where said identifying a geometric intensity curvature comprises an analysis of image components until the change in curvature of the intensity curve between two successive image components in a direction substantially parallel to the intensity gradient direction reaches a maximum absolute value at the same position that the change in curvature of the intensity curve in a direction substantially perpendicular to the intensity gradient direction is near zero;
measuring the proximity of said feature to said edge; and
identifying said edge component as a text edge component when said proximity conforms to specific proximity criteria.
6. A method for detecting text in a mixed-content image, said method comprising:
identifying an edge associated with a high-contrast intensity change;
identifying an intensity gradient direction for said edge;
identifying a character stroke axis, wherein said axis is selected from the group consisting of a stroke valley or a stroke ridge;
measuring a substantially transverse distance between said axis and said edge;
identifying said edge as a text edge when said substantially transverse distance is less than a threshold value; and
analyzing successive pixels to identify a coincident curvature position wherein a substantial curvature of an intensity map occurs at the same location as a minimal curvature of said intensity map in another direction.
7. A method for detecting text in a mixed-content image comprising:
processing said image to identify edge components associated with significant intensity changes;
processing said image to identify an intensity gradient direction for each of said edge components;
processing said image to identify character stroke axes, wherein said axes are one of a stroke valley or a stroke ridge;
measuring the proximity of said axes to said edge component;
identifying said edge component as a text edge component when said proximity conforms to specified proximity criteria; and
analyzing successive pixels to identify a coincident curvature position wherein a maximum curvature of an intensity map, said maximum curvature being greater than a threshold value, occurs at the same location as a minimal curvature of said intensity map, said minimal curvature being lower than a specified value and being in a direction approximately perpendicular to said maximum curvature.
Specification