SEARCHING A REPOSITORY OF DOCUMENTS USING A SOURCE IMAGE AS A QUERY
First Claim
1. A method for searching a repository of documents using an image as a search query, the method comprising:
receiving a query image I* to be used to search a repository of documents;
hashing said query image to obtain a hash value h(I*) for said query image;
setting an initial threshold level τ0 for said search;
performing a no-reference quality assessment on said query image to estimate a measure of quality Q(I*) for said query image;
adjusting said initial threshold level by said measure of quality, said adjusted threshold defining a possible match for said search;
for each current document in said repository:
for each current image in said document:
extracting said current image I(k) from said current document;
hashing said extracted current image to obtain a hash value h(I(k)) for said extracted current image;
computing a distance between said hash value h(I*) of said query image and said hash value h(I(k)) of said extracted current image;
comparing said computed distance against said adjusted threshold; and
flagging said current document as a possible match in response to said computed distance being less than said adjusted threshold; and
providing any of said flagged documents to a user in response to said search.
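The steps of the claim can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the claim does not name a hash function, a distance, or an adjustment rule, so the toy average hash, the Hamming distance, and the rule `tau0 * (2.0 - quality)` below are all hypothetical choices, as are the function names. The quality score Q(I*) is taken as an input in the range (0, 1].

```python
from typing import Dict, List, Sequence

Image = Sequence[Sequence[int]]  # grayscale pixels as rows of 0-255 values


def average_hash(image: Image) -> int:
    """Toy perceptual hash (assumed): one bit per pixel, set when the pixel exceeds the mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(h1: int, h2: int) -> int:
    """Number of bit positions in which two hash values differ (assumed distance)."""
    return bin(h1 ^ h2).count("1")


def adjust_threshold(tau0: float, quality: float) -> float:
    """Hypothetical rule: loosen the threshold as the quality score drops below 1."""
    return tau0 * (2.0 - quality)


def search(query: Image, quality: float,
           repository: Dict[str, List[Image]], tau0: float) -> List[str]:
    """Return the ids of documents holding an image whose hash is close to the query's."""
    h_query = average_hash(query)
    tau = adjust_threshold(tau0, quality)
    flagged = []
    for doc_id, images in repository.items():      # for each current document
        for image in images:                       # for each current image
            if hamming_distance(h_query, average_hash(image)) < tau:
                flagged.append(doc_id)
                break  # one matching image is enough to flag the document
    return flagged


# Usage: doc_a holds a near-copy of the query, doc_b an inverted pattern.
query = [[10, 200], [220, 30]]
repository = {
    "doc_a": [[[12, 198], [225, 28]]],
    "doc_b": [[[200, 10], [30, 220]]],
}
print(search(query, quality=1.0, repository=repository, tau0=1.0))  # flags doc_a only
```

Keeping the adjustment rule in its own function makes it easy to swap in whatever relationship between Q(I*) and τ0 an actual embodiment uses.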
Abstract
What is disclosed is a system and method for searching a repository of documents containing images using an image as a query. The present method enables the adjustment of a threshold level through a no-reference quality assessment of the query image which produces an estimated measure of quality for the image. For each image in each document in the repository, a distance is computed between a hash value of each image extracted from the document and the hash value of the query image. Documents are flagged as possible matches if the computed distance is less than the adjusted threshold. Documents flagged as a result of the search are retrieved and provided to the user. The present method can be used alone or as an adjunct to text-based search techniques. Other embodiments are provided.
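The abstract does not say how the no-reference quality measure is computed. As one hypothetical illustration, the sketch below scores an image by its mean absolute gradient, a simple sharpness proxy that needs no reference image. The function name, the choice of metric, and the normalisation to the range [0, 1] are assumptions made for the example, not details taken from the source.

```python
from typing import Sequence

Image = Sequence[Sequence[int]]  # grayscale pixels as rows of 0-255 values


def sharpness_quality(image: Image) -> float:
    """Hypothetical no-reference score in [0, 1].

    Averages the absolute difference between each pixel and its right and
    lower neighbours, then divides by 255. Flat or blurred images score
    near 0; images with strong edges score higher.
    """
    rows, cols = len(image), len(image[0])
    total = 0
    count = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                total += abs(image[r][c + 1] - image[r][c])
                count += 1
            if r + 1 < rows:  # vertical neighbour
                total += abs(image[r + 1][c] - image[r][c])
                count += 1
    return total / (255 * count) if count else 0.0


flat = [[5, 5], [5, 5]]
checker = [[0, 255], [255, 0]]
print(sharpness_quality(flat), sharpness_quality(checker))
```

A score of this kind could then feed the threshold adjustment; a production system would more likely rely on an established no-reference image quality metric.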
20 Claims
1. A method for searching a repository of documents using an image as a search query, the method comprising:
receiving a query image I* to be used to search a repository of documents;
hashing said query image to obtain a hash value h(I*) for said query image;
setting an initial threshold level τ0 for said search;
performing a no-reference quality assessment on said query image to estimate a measure of quality Q(I*) for said query image;
adjusting said initial threshold level by said measure of quality, said adjusted threshold defining a possible match for said search;
for each current document in said repository:
for each current image in said document:
extracting said current image I(k) from said current document;
hashing said extracted current image to obtain a hash value h(I(k)) for said extracted current image;
computing a distance between said hash value h(I*) of said query image and said hash value h(I(k)) of said extracted current image;
comparing said computed distance against said adjusted threshold; and
flagging said current document as a possible match in response to said computed distance being less than said adjusted threshold; and
providing any of said flagged documents to a user in response to said search.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer system for searching a repository of documents using an image as a search query, the system comprising:
a display device and a user interface for entering an input;
a memory for storing machine executable instructions;
a storage device containing a repository of searchable documents, at least a portion of said documents containing extractable images; and
a processor in communication with said display and user interface, said memory and said storage device, said processor executing said machine readable instructions for performing:
receiving a query image I* to be used to search a repository of documents;
hashing said query image to obtain a hash value h(I*) for said query image;
setting an initial threshold level τ0 for said search;
performing a no-reference quality assessment on said query image to estimate a measure of quality Q(I*) for said query image;
adjusting said initial threshold level by said measure of quality, said adjusted threshold defining a possible match for said search;
for each current document in said repository:
for each current image in said document:
extracting said current image I(k) from said current document;
hashing said extracted current image to obtain a hash value h(I(k)) for said extracted current image;
computing a distance between said hash value h(I*) of said query image and said hash value h(I(k)) of said extracted current image;
comparing said computed distance against said adjusted threshold; and
flagging said current document as a possible match in response to said computed distance being less than said adjusted threshold; and
providing any of said flagged documents to a user in response to said search.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for decoding data embedded in a color barcode, the computer program product comprising:
a computer-usable data carrier storing instructions that, when executed on a computer, cause the computer to perform a method comprising:
receiving a query image I* to be used to search a repository of documents;
hashing said query image to obtain a hash value h(I*) for said query image;
setting an initial threshold level τ0 for said search;
performing a no-reference quality assessment on said query image to estimate a measure of quality Q(I*) for said query image;
adjusting said initial threshold level by said measure of quality, said adjusted threshold defining a possible match for said search;
for each current document in said repository:
for each current image in said document:
extracting said current image I(k) from said current document;
hashing said extracted current image to obtain a hash value h(I(k)) for said extracted current image;
computing a distance between said hash value h(I*) of said query image and said hash value h(I(k)) of said extracted current image;
comparing said computed distance against said adjusted threshold; and
flagging said current document as a possible match in response to said computed distance being less than said adjusted threshold; and
providing any of said flagged documents to a user in response to said search.
Dependent claims: 16, 17, 18, 19, 20
Specification