Methods and Systems for Detecting Numerals in a Digital Image
First Claim
1. A method for detecting a numeral connected component in a digital image, said method comprising:
a) receiving a text-line component, wherein said text-line component comprises a plurality of connected components in a digital image;
b) determining a component bounding box for each of said plurality of connected components, wherein each component bounding box comprises a first-side coordinate, a second-side coordinate, a third-side coordinate and a fourth-side coordinate, wherein said first-side coordinate and said second-side coordinate are associated with a first axis of said bounding box and said third-side coordinate and said fourth-side coordinate are associated with a second axis of said bounding box;
c) determining a first variability measure associated with said first-side coordinates;
d) determining a second variability measure associated with said second-side coordinates;
e) determining a third variability measure associated with said third-side coordinates;
f) determining a fourth variability measure associated with said fourth-side coordinates;
g) determining a first accumulation of said first variability measure and said second variability measure;
h) determining a second accumulation of said third variability measure and said fourth variability measure; and
i) when said first accumulation and said second accumulation meet a first criterion:
   i) classifying said text-line component as a numeral component when said first variability measure meets a first threshold criterion and said second variability measure meets a second threshold criterion; and
   ii) classifying said text-line component as a non-numeral component when either said first variability measure does not meet said first threshold criterion or said second variability measure does not meet said second threshold criterion; and
j) when said first accumulation and said second accumulation do not meet said first criterion:
   i) classifying said text-line component as a numeral component when said third variability measure meets a third threshold criterion and said fourth variability measure meets a fourth threshold criterion; and
   ii) classifying said text-line component as a non-numeral component when either said third variability measure does not meet said third threshold criterion or said fourth variability measure does not meet said fourth threshold criterion.
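The decision flow of steps (a) through (j) can be sketched in Python. The claim does not say what the variability measure, the accumulation, the first criterion, or the threshold criteria are, so everything concrete here is an assumption made for illustration: variability is taken as the population standard deviation, accumulation as a sum, the first criterion as "first accumulation is smaller than second accumulation", and each threshold criterion as "measure is at or below a threshold". The names `BoundingBox` and `classify_text_line` and the default thresholds are hypothetical.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Sequence, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Bounding box of one connected component (step b)."""
    first: float   # first-side coordinate, first axis (e.g. top edge)
    second: float  # second-side coordinate, first axis (e.g. bottom edge)
    third: float   # third-side coordinate, second axis (e.g. left edge)
    fourth: float  # fourth-side coordinate, second axis (e.g. right edge)


def classify_text_line(
    boxes: Sequence[BoundingBox],
    thresholds: Tuple[float, float, float, float] = (2.0, 2.0, 2.0, 2.0),
) -> bool:
    """Return True for a numeral component, False for a non-numeral one."""
    if len(boxes) < 2:
        raise ValueError("a text-line component needs a plurality of boxes")

    # Steps c-f: one variability measure per side (assumed: population std dev).
    v1 = pstdev([b.first for b in boxes])
    v2 = pstdev([b.second for b in boxes])
    v3 = pstdev([b.third for b in boxes])
    v4 = pstdev([b.fourth for b in boxes])

    # Steps g-h: accumulate per axis (assumed: plain sum).
    first_accumulation = v1 + v2
    second_accumulation = v3 + v4

    t1, t2, t3, t4 = thresholds
    # Step i: assumed first criterion, the first-axis edges vary less overall.
    if first_accumulation < second_accumulation:
        return v1 <= t1 and v2 <= t2
    # Step j: otherwise test the second-axis edges.
    return v3 <= t3 and v4 <= t4
```

Under these assumptions, a horizontal run of digits with aligned tops and bottoms is classified as a numeral component, while a line whose components have ascenders or descenders is not. The accumulation comparison selects which pair of edges to test, so the same function also handles a line laid out along the other axis.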
Abstract
Aspects of the present invention are related to systems and methods for determining the location of numerals in an electronic document image.
21 Claims
1. A method for detecting a numeral connected component in a digital image (independent claim, recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A method for detecting a numeral connected component in a digital image, said method comprising:
a) receiving a text-line component, wherein said text-line component comprises a plurality of connected components in a digital image;
b) calculating an aspect ratio for each of said connected components in said plurality of connected components, thereby producing a plurality of aspect ratios;
c) calculating a first characteristic of said plurality of aspect ratios;
d) classifying said text-line component as a numeral component when said first characteristic meets a first criterion; and
e) classifying said text-line component as a non-numeral component when said first characteristic does not meet said first criterion.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20.
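A minimal Python sketch of the aspect-ratio method of claim 11 follows. The claim leaves both the "first characteristic" and the "first criterion" open, so this example assumes the characteristic is the mean of width divided by height and the criterion is that the mean falls inside a band. The function name `classify_by_aspect_ratio` and the band limits `low` and `high` are hypothetical values chosen only for illustration.

```python
from statistics import mean
from typing import Sequence, Tuple


def classify_by_aspect_ratio(
    sizes: Sequence[Tuple[float, float]],
    low: float = 0.4,
    high: float = 0.8,
) -> bool:
    """Return True for a numeral component, False for a non-numeral one.

    `sizes` holds one (width, height) pair per connected component.
    """
    if len(sizes) < 2:
        raise ValueError("a text-line component needs a plurality of components")

    # Step b: one aspect ratio per connected component.
    ratios = []
    for width, height in sizes:
        if height <= 0:
            raise ValueError("component height must be positive")
        ratios.append(width / height)

    # Step c: assumed characteristic, the mean aspect ratio.
    characteristic = mean(ratios)

    # Steps d-e: assumed criterion, the mean lies inside [low, high].
    return low <= characteristic <= high
```

Other characteristics, such as the spread of the ratios, would fit the same structure by swapping the line that computes `characteristic`.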
21. A method for detecting a numeral connected component in a digital image, said method comprising:
a) receiving a text-line component, wherein said text-line component comprises a plurality of connected components in a digital image;
b) determining the number of connected components in said plurality of connected components;
c) classifying said text-line component as a numeral component when said number of connected components meets a quantity criterion; and
d) classifying said text-line component as a non-numeral component when said number of connected components does not meet said quantity criterion.
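The count-based method of claim 21 reduces to a single comparison, sketched below in Python. The claim does not define the quantity criterion; this example assumes it is an upper bound on the number of connected components, on the guess that short lines are more likely to be numerals. The name `classify_by_component_count` and the default `max_count` are hypothetical.

```python
from typing import Sequence


def classify_by_component_count(components: Sequence[object], max_count: int = 6) -> bool:
    """Return True for a numeral component, False for a non-numeral one."""
    # Step b: determine the number of connected components.
    count = len(components)
    if count < 2:
        raise ValueError("a text-line component needs a plurality of components")

    # Steps c-d: assumed quantity criterion, at most max_count components.
    return count <= max_count
```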
Specification