Detecting and utilizing add-on information from a scanned document image
First Claim
1. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating at least one projection histogram for the scanned document image, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document image in order to generate at least one merged text line, wherein each of the merged text lines correlate to at least one of the handwritten annotations and the printed text lines; and
discriminating the printed text lines from the handwritten annotations by comparing the merged text lines to the regular pattern of the projection histograms.
Abstract
A scanned document image, including add-on information such as handwritten annotations in addition to printed text lines, is processed by a handwriting detection method. First, at least one projection histogram is generated from the scanned document image. A regular pattern that correlates to the printed text lines is determined from the projection histogram. Second, connected component analysis is applied to the scanned document image to generate at least one merged text line. Each merged text line relates to at least one of the handwritten annotation and the printed text line. By comparing the merged text lines to the regular pattern of the projection histograms, the printed text lines are discriminated from the handwritten annotations.
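The projection histograms named in the abstract can be illustrated with a minimal sketch. It assumes the scanned page has already been binarized into a list of rows with 1 for a dark pixel and 0 for background; the function name `projection_histograms` and the sample page are hypothetical and do not come from the patent.

```python
def projection_histograms(image):
    """Return (row_counts, col_counts) of dark pixels for a binary image.

    row_counts[y] is the number of dark pixels in row y;
    col_counts[x] is the number of dark pixels in column x.
    """
    row_counts = [sum(row) for row in image]
    col_counts = [sum(col) for col in zip(*image)]
    return row_counts, col_counts


# Two "printed" rows of equal width plus one stray dark pixel at the far right.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1],
]
rows, cols = projection_histograms(page)
print(rows)  # [0, 4, 0, 4, 1]
print(cols)  # [0, 2, 2, 2, 2, 1]
```

In this toy page the row counts alternate between dense and empty rows, which is the kind of regular pattern that printed text lines would be expected to produce, while the stray pixel falls outside it.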
17 Claims
1. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating at least one projection histogram for the scanned document image, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document image in order to generate at least one merged text line, wherein each of the merged text lines correlate to at least one of the handwritten annotations and the printed text lines; and
discriminating the printed text lines from the handwritten annotations by comparing the merged text lines to the regular pattern of the projection histograms.
3. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating at least one projection histogram for the scanned document image, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document image in order to generate at least one merged text line, wherein each of the merged text lines correlate to at least one of the handwritten annotations and the printed text lines; and
discriminating the printed text lines from the handwritten annotations by comparing the merged text lines to the regular pattern of the projection histograms;
wherein the regular pattern is margins of the printed text lines.
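One way the margins of the printed text lines might be read off a column-wise projection histogram is sketched below. The thresholding rule (a column counts as part of the printed block when it holds at least a fraction of the peak count) and the name `estimate_margins` are assumptions made for illustration; the claim does not specify how the margins are determined.

```python
def estimate_margins(col_counts, fraction=0.5):
    """Return (left, right) column indices bounding the dense printed block.

    A column is treated as part of the block when its dark-pixel count is
    at least `fraction` of the largest column count (an assumed heuristic).
    Returns None for a blank page.
    """
    peak = max(col_counts)
    if peak == 0:
        return None
    threshold = fraction * peak
    dense = [x for x, count in enumerate(col_counts) if count >= threshold]
    return dense[0], dense[-1]


# Sparse ink at columns 1 and 9 stands in for marginal handwriting.
col_counts = [0, 1, 0, 9, 10, 10, 9, 10, 0, 2, 0]
left, right = estimate_margins(col_counts)
print(left, right)  # 3 7
```

Sparse ink outside the dense block, such as a short note in the margin, stays below the threshold and so does not shift the estimated margins in this sketch.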
4. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating at least one projection histogram for the scanned document image, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document image in order to generate at least one merged text line, wherein each of the merged text lines correlate to at least one of the handwritten annotations and the printed text lines;
comparing the merged text lines to the regular pattern of the projection histograms, thereby discriminating the printed text lines from the handwritten annotations; and
detecting text line peaks from the projection histogram of the scanned document image.
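Detecting text line peaks from a row-wise projection histogram could look like the following sketch. It assumes a simple rule: each run of consecutive rows whose count reaches `min_count` is one text line band, and the peak is the densest row in that band. The function name and the `min_count` parameter are hypothetical; the claim does not state a particular peak-detection rule.

```python
def detect_line_peaks(row_counts, min_count):
    """Return (top, bottom, peak_row) for each band of rows with count >= min_count.

    min_count is assumed to be positive.
    """
    peaks = []
    start = None
    # A trailing 0 acts as a sentinel so a band touching the last row is closed.
    for y, count in enumerate(row_counts + [0]):
        if count >= min_count and start is None:
            start = y
        elif count < min_count and start is not None:
            peak_row = max(range(start, y), key=lambda r: row_counts[r])
            peaks.append((start, y - 1, peak_row))
            start = None
    return peaks


row_counts = [0, 5, 9, 6, 0, 0, 4, 8, 8, 0, 1, 0]
peaks = detect_line_peaks(row_counts, min_count=3)
print(peaks)  # [(1, 3, 2), (6, 8, 7)]
```

The faint row at index 10 falls below `min_count` and is not reported as a text line, which is how sparse marks could be kept out of the set of printed line positions under this assumption.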
5. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating at least one projection histogram for the scanned document image, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document image in order to generate at least one merged text line, wherein each of the merged text lines correlate to at least one of the handwritten annotations and the printed text lines; and
comparing the merged text lines to the regular pattern of the projection histograms, thereby discriminating the printed text lines from the handwritten annotations;
wherein the step of applying connected component analysis further comprising steps of:
generating connected components onto the scanned document image by connecting dark pixels that are in relation to each other;
generating bounding boxes of the connected components; and
line merging the bounding boxes that are within a same text line in order to generate the merged text line.
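The three sub-steps of claim 5 (connected components, bounding boxes, line merging) can be sketched as follows. The sketch assumes 8-connectivity between dark pixels and a merging rule in which a box joins an existing line when the two overlap vertically and the box's left edge is at most `max_gap` columns past the line's right edge. Both choices, and the names `connected_component_boxes` and `merge_into_lines`, are illustrative assumptions and are not taken from the patent.

```python
from collections import deque


def connected_component_boxes(image):
    """Bounding boxes (top, left, bottom, right) of 8-connected dark regions."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_into_lines(boxes, max_gap):
    """Merge boxes that overlap vertically and sit close together horizontally."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):  # process left to right
        for i, line in enumerate(lines):
            overlaps = box[0] <= line[2] and line[0] <= box[2]
            close = box[1] - line[3] <= max_gap
            if overlaps and close:
                lines[i] = (min(line[0], box[0]), min(line[1], box[1]),
                            max(line[2], box[2]), max(line[3], box[3]))
                break
        else:
            lines.append(box)
    return sorted(lines)


page = [
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
]
boxes = connected_component_boxes(page)
lines = merge_into_lines(boxes, max_gap=2)
print(lines)  # [(0, 0, 1, 4), (0, 9, 1, 9), (3, 0, 3, 2)]
```

In the sample page the two neighbouring blobs on the top rows merge into one line, while the isolated blob at column 9 stays separate because it lies too far from the line's right edge.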
9. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating at least one projection histogram for the scanned document image, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document image in order to generate at least one merged text line, wherein each of the merged text lines correlate to at least one of the handwritten annotations and the printed text lines;
comparing the merged text lines to the regular pattern of the projection histograms, thereby discriminating the printed text lines from the handwritten annotations; and
separating the printed text lines and the handwritten annotations by comparing the size of the bounding box to the size of the printed text lines.
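The size comparison in the last step of claim 9 might be sketched as below. The sketch assumes that the size of the printed text lines can be approximated by the median height of all merged lines, and that a line whose height deviates from that median by more than a relative `tolerance` is treated as handwriting. The median rule, the tolerance value, and the name `split_by_size` are assumptions for illustration only.

```python
from statistics import median


def split_by_size(lines, tolerance=0.5):
    """Split (top, left, bottom, right) boxes into (printed, handwritten).

    The typical printed line height is assumed to be the median box height.
    """
    heights = [bottom - top + 1 for top, _, bottom, _ in lines]
    typical = median(heights)
    printed, handwritten = [], []
    for line, height in zip(lines, heights):
        if abs(height - typical) <= tolerance * typical:
            printed.append(line)
        else:
            handwritten.append(line)
    return printed, handwritten


merged = [
    (10, 50, 21, 400),   # height 12
    (30, 50, 41, 400),   # height 12
    (50, 50, 61, 380),   # height 12
    (25, 420, 60, 470),  # height 36, much taller than the others
]
printed, handwritten = split_by_size(merged)
print(handwritten)  # [(25, 420, 60, 470)]
```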
10. A method for detecting handwritten annotations from a scanned document image, the scanned document image having a handwritten annotation and at least one printed text line, the method comprising the steps of:
generating vertical and horizontal projection histograms for the scanned document image, wherein the projection histogram includes margins that correlate to the printed text lines;
generating connected components by connecting dark pixels that are in association with the others on the scanned document image;
generating bounding boxes that encapsulates all of the connected components that are in relation with each other;
line merging the bounding boxes that are within a same text line in order to generate at least one merged text line, wherein each of the merged text lines correlates to at least one of the handwritten annotations and the printed text lines;
detecting text line peaks from the horizontal projection histogram of the scanned document image; and
comparing the merged text lines to the margins determined from the vertical and horizontal projection histograms, thereby discriminating the printed text lines from the handwritten annotation.
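The final comparing step of claim 10 can be sketched under stated assumptions: the margins are given as a `(left, right)` pair of column indices, the detected text line peaks are given as `(top, bottom)` row bands, and a merged line is labelled printed only if it lies within the margins and inside one of the bands. This decision rule and the name `classify_lines` are hypothetical; the claim requires only that the merged lines be compared to the margins.

```python
def classify_lines(merged_lines, margins, line_bands):
    """Label each (top, left, bottom, right) box as printed or handwritten.

    margins:    (left, right) columns of the printed text block.
    line_bands: list of (top, bottom) row ranges of detected text lines.
    """
    left, right = margins
    printed, handwritten = [], []
    for box in merged_lines:
        top, box_left, bottom, box_right = box
        inside_margins = left <= box_left and box_right <= right
        on_text_line = any(band_top <= top and bottom <= band_bottom
                           for band_top, band_bottom in line_bands)
        if inside_margins and on_text_line:
            printed.append(box)
        else:
            handwritten.append(box)
    return printed, handwritten


merged = [
    (10, 50, 21, 400),   # fits the first band and the margins
    (30, 50, 41, 390),   # fits the second band and the margins
    (22, 120, 29, 200),  # sits between two printed lines
    (12, 410, 40, 470),  # extends past the right margin
]
printed, handwritten = classify_lines(merged, margins=(50, 400),
                                      line_bands=[(10, 21), (30, 41)])
print(handwritten)  # [(22, 120, 29, 200), (12, 410, 40, 470)]
```

Both an interlinear note and a marginal note are caught in this example, the first by the row bands and the second by the margins.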
11. A method for recording a history of a document comprising the steps of:
scanning a document having printed text lines and handwritten annotations;
separating the handwritten annotations from the printed text lines of the scanned document;
wherein the step of separating further comprises:
generating at least one projection histogram for the scanned document, wherein the projection histogram includes a regular pattern that correlates to the printed text lines;
applying connected component analysis to the scanned document in order to generate at least one merged text line, wherein each merged text line relates to at least one of the handwritten annotations and the printed text lines; and
comparing the merged text lines to the regular pattern of the projection histograms, thereby discriminating the printed text lines from the handwritten annotations;
comparing the scanned document with an original document wherein the original document includes only the printed text lines;
determining an existence of previous versions of the scanned document; and
recording a history of the scanned document based on the separated handwritten annotations.
15. A method for securing transmission of an original version of a document, the document having the original version and at least one secondary version, wherein the original version includes only printed text lines and the secondary version includes the printed text lines and handwritten annotations, the method comprising the steps of:
separating the printed text lines from the handwritten annotations in at least one secondary version of the document by using a histogram to identify regions containing handwritten annotations;
wherein said separating step further comprising the steps of:
generating vertical and horizontal projection histograms for the scanned document image, wherein the projection histogram includes margins that correlate to the printed text lines;
generating connected components by connecting dark pixels that are in association with the others on the scanned document image;
generating bounding boxes that encapsulates all of the connected components that are in relation with each other;
line merging the bounding boxes that are within a same text line in order to generate at least one merged text line, wherein each of the merged text lines correlates to at least one of the handwritten annotations and the printed text lines;
detecting text line peaks from the horizontal projection histogram of the scanned document image; and
discriminating the printed text lines from the handwritten annotations by comparing the merged text lines to the regular pattern of the projection histograms;
determining the original version of the document in relation with the printed text lines separated from the handwritten annotations; and
transmitting the printed text lines after the step of determining the original version of the document.
17. An efficient compression method wherein an original version of a document is stored in a database, the document having at least one secondary version which includes printed text lines and handwritten annotations, the method comprising the steps of:
separating the printed text lines from the handwritten annotations in at least one secondary version of the document by using a histogram to identify regions containing handwritten annotations, wherein said step further comprising of the steps of:
generating vertical and horizontal projection histograms for the scanned document image, wherein the projection histogram includes margins that correlate to the printed text lines;
generating connected components by connecting dark pixels that are in association with the others on the scanned document image;
generating bounding boxes that encapsulates all of the connected components that are in relation with each other;
line merging the bounding boxes that are within a same text line in order to generate at least one merged text line, wherein each of the merged text lines correlates to at least one of the handwritten annotations and the printed text lines;
detecting text line peaks from the horizontal projection histogram of the scanned document image; and
comparing the merged text lines to the margins determined from the vertical and horizontal projection histograms, thereby discriminating the printed text lines from the handwritten annotations;
comparing the printed text lines with the original documents; and
storing only the handwritten annotations in association with the original document.
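The storing step of claim 17 could be sketched as follows, assuming the database is an in-memory dictionary keyed by a document identifier and that each handwritten annotation is kept as a cropped pixel region together with its bounding box. The record layout, the identifiers `doc-1` and `v2`, and the function names are all hypothetical; the claim does not describe a storage format.

```python
def crop(image, box):
    """Cut the (top, left, bottom, right) region out of a binary image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def store_annotations(database, doc_id, version_id, image, handwritten_boxes):
    """Attach only the handwritten regions of one version to the stored original."""
    record = database[doc_id]
    record.setdefault("annotations", {})[version_id] = [
        {"box": box, "pixels": crop(image, box)} for box in handwritten_boxes
    ]


# The original is stored once; the annotated scan contributes only its add-ons.
database = {"doc-1": {"original": "printed text only"}}
scan = [
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 0, 1],
]
store_annotations(database, "doc-1", "v2", scan, [(1, 4, 2, 4)])
print(database["doc-1"]["annotations"]["v2"])
# [{'box': (1, 4, 2, 4), 'pixels': [[1], [1]]}]
```

Because the printed content is already held with the original, each further version costs only the pixels of its annotations in this sketch.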
Specification