Method and system for document image feature extraction
Abstract
A set of image feature extraction techniques to locate and group documents based upon appearance in a database management system. The system automatically determines visual characteristics of document images and collects documents together according to the relative similarity of their document images. The system is operable on both binary and grayscale images.
27 Claims
1. A method for searching a database containing a plurality of document images for a particular document image, said method comprising:
compressing each document image in said plurality of document images to obtain a compressed representation having a low-pass component and a high-pass component;
extracting image feature information from said compressed representation, wherein said extracting further comprises extracting statistical moments from said low-pass component and extracting connected component information from said high-pass component; and
matching said image feature information with image feature information from said particular document image.
Dependent claims: 2, 3, 4, 5, 6
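The compressing and moment-extraction steps of claim 1 can be illustrated with a small sketch. The claim does not name a compression scheme, so the one-level 2x2 Haar-style split below is an assumption chosen only because it yields a low-pass and a high-pass component; the function names `haar_split` and `low_pass_moments`, and the choice of mean, variance, and third central moment as the "statistical moments", are likewise hypothetical.

```python
from statistics import mean, pvariance


def haar_split(image):
    """Split a grayscale image (list of equal-length rows) into two components.

    Assumed stand-in for the claimed compression step: each 2x2 block
    contributes its average to the low-pass component and a combined
    detail magnitude to the high-pass component. Odd trailing rows or
    columns are ignored.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    low, high = [], []
    for r in range(rows):
        low_row, high_row = [], []
        for c in range(cols):
            a, b = image[2 * r][2 * c], image[2 * r][2 * c + 1]
            d, e = image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1]
            low_row.append((a + b + d + e) / 4)
            # Horizontal, vertical, and diagonal differences of the block.
            detail = abs(a + d - b - e) + abs(a + b - d - e) + abs(a + e - b - d)
            high_row.append(detail / 4)
        low.append(low_row)
        high.append(high_row)
    return low, high


def low_pass_moments(low):
    """Return (mean, variance, third central moment) of the low-pass values."""
    values = [v for row in low for v in row]
    mu = mean(values)
    third = sum((v - mu) ** 3 for v in values) / len(values)
    return mu, pvariance(values, mu), third
```

The moments form a short, fixed-length description of the overall intensity distribution, which is one plausible reason to take them from the smoothed component instead of the raw pixels.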
7. A method for searching a database containing a plurality of document images in binary representation for a particular document image, comprising the steps of:
converting each said document image in binary representation to produce a grayscale representation of said document image;
compressing each said grayscale representation of each document image in said plurality of document images to obtain a compressed representation having a low-pass component and a high-pass component;
extracting image feature information from said compressed representation, wherein said extracting further comprises extracting statistical moments from said low-pass component and extracting connected components from said high-pass component; and
matching said image feature information with image feature information from said particular document image.
Dependent claims: 8, 9, 10, 11, 12, 13
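Claim 7 adds a conversion from binary to grayscale before compression but does not say how the conversion is done. As a minimal sketch under that gap, the hypothetical function below averages each `block` x `block` tile of 0/1 pixels and scales the result to 0..255; tile averaging, the tile size, and the output range are all assumptions.

```python
def binary_to_grayscale(binary, block=2):
    """Convert a 0/1 image to 0..255 values by averaging block x block tiles.

    Each output pixel is the fraction of set pixels in its tile, scaled
    to 255. Trailing rows or columns that do not fill a tile are ignored.
    """
    rows, cols = len(binary) // block, len(binary[0]) // block
    gray = []
    for r in range(rows):
        gray_row = []
        for c in range(cols):
            total = sum(
                binary[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            )
            gray_row.append(round(255 * total / (block * block)))
        gray.append(gray_row)
    return gray
```

The output is smaller than the input by a factor of `block` in each dimension, so this particular choice also reduces the amount of data passed to the later steps.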
14. A document image database organizing system comprising:
an electronic storage unit that stores a document image database;
a display that displays document images; and
a processor unit coupled to said electronic storage device and said display, said processor unit operative to;
compress document images to obtain a compressed representation having a low-pass component and a high-pass component;
extract image feature information about example document images, said image feature information comprising statistical moments extracted from said low-pass component and connected components extracted from said high-pass component; and
compare said image feature information from a particular document image to said image feature information from said plurality of document images in said database.
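The comparing operation in claim 14 can be pictured as ranking stored feature vectors by their distance to the query's feature vector. The claim does not specify a distance measure, so Euclidean distance is an assumption here, and `feature_distance`, `rank_matches`, and the dictionary layout of the database are hypothetical names used only for illustration.

```python
import math


def feature_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_matches(query, database):
    """Return document names ordered from most to least similar to the query.

    `database` maps a document name to its feature vector (an assumed
    layout, not one taken from the claims).
    """
    return sorted(database, key=lambda name: feature_distance(query, database[name]))
```

With unnormalized features, the component with the largest numeric range dominates the distance, so a practical version would likely rescale each feature first.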
15. A document image database organizing system comprising:
an electronic storage unit that stores a document image database;
a display that displays document images; and
a processor unit coupled to said electronic storage device and said display, said processor unit operative to;
convert document image formats;
compress document images to obtain a compressed representation having a low-pass component and a high-pass component;
extract image feature information about example document images, said image feature information comprising statistical moments extracted from said low-pass component and connected components extracted from said high-pass component; and
compare said image feature information from a particular document image to said image feature information from said plurality of document images in said database.
16. A computer program product comprising:
code that compresses document images to obtain a compressed representation having a low-pass component and a high-pass component;
code that extracts image feature information from said compressed representation, said code further comprising code that extracts statistical moments from said low-pass component and code that extracts connected components from said high-pass component;
code that compares image feature information from a particular document image with image feature information from other document images; and
a computer readable storage medium for storing the codes.
Dependent claims: 17, 18, 19, 20, 21
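Claim 16 recites code that extracts connected components from the high-pass component. One way to picture this, offered as a sketch and not as the patented procedure, is to threshold the high-pass values and label 4-connected regions with a breadth-first search. The threshold, the 4-connectivity, and the per-component summary (pixel count and bounding box) are assumptions, and `connected_components` is a hypothetical name.

```python
from collections import deque


def connected_components(high, threshold):
    """Label 4-connected regions of high-pass values above `threshold`.

    Returns a list of dicts, one per component in raster-scan order, with
    the pixel count and the bounding box (top, left, bottom, right).
    """
    rows, cols = len(high), len(high[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or high[r][c] <= threshold:
                continue
            # Breadth-first search over the region containing (r, c).
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and high[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            components.append(
                {"size": len(pixels), "bbox": (min(ys), min(xs), max(ys), max(xs))}
            )
    return components
```

Summaries such as the number of components and their typical sizes could then be appended to the feature vector used for comparison.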
22. A computer program product comprising:
code that converts binary represented document images to grayscale document images;
code that compresses document images to obtain a compressed representation having a low-pass component and a high-pass component;
code that extracts image feature information from said compressed representation, wherein said code that extracts further comprises code that extracts image feature information further comprises code that extracts statistical moments from said low-pass component and code that extracts connected components from said high-pass component;
code that compares image feature information from a particular document image with image feature information from other document images; and
a computer readable storage medium for storing the codes.
Dependent claims: 23, 24, 25, 26, 27
Specification