Image-based document indexing and retrieval
Abstract
A system that facilitates document retrieval and/or indexing is provided. A component receives an image of a document, and a search component searches data store(s) for a match to the document image. The match is performed over word-level topological properties of images of documents stored in the data store(s).
42 Claims
1. A system for document retrieval and/or indexing comprising:
a component that receives a captured image of at least a portion of a physical document; and
a search component that locates a match to the document, the search is performed over word-level topological properties of generated images, the generated images being images of at least a portion of one or more electronic documents.
Dependent claims: 2-23, 41
24. A method that facilitates indexing and/or retrieval of a document, comprising:
generating a plurality of images of electronic documents, at least one of the images of electronic documents corresponding to a printed document;
capturing an image of a printed document after such document has been printed;
receiving a query requesting retrieval of an electronic document corresponding to the image of the printed document;
generating one or more signatures corresponding to at least a portion of one or more of the generated images, the signatures generated at least in part upon word-layout within the image(s);
generating a signature corresponding to at least a portion of the captured image, the signature generated at least in part upon word-layout within the captured image; and
comparing the one or more signatures corresponding to the one or more generated images to the signature corresponding to the captured image.
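The signature steps of claim 24 can be pictured with a short sketch. The claim does not define what a signature is, so everything below is an illustrative assumption rather than the patented method: word bounding boxes are taken as already extracted, words are grouped into lines, each word width is quantized relative to the page width, and the signature is the set of width n-grams on each line. The names group_into_lines, layout_signature, and similarity are hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical word box: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def group_into_lines(boxes: List[Box], tol: float = 0.5) -> List[List[Box]]:
    """Group word boxes into text lines by comparing vertical positions."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        # Same line if y differs by at most tol * word height (assumed rule).
        if lines and abs(lines[-1][-1][1] - box[1]) <= tol * box[3]:
            lines[-1].append(box)
        else:
            lines.append([box])
    return [sorted(line, key=lambda b: b[0]) for line in lines]


def layout_signature(boxes: List[Box], page_width: float,
                     n: int = 3, levels: int = 64) -> Set[Tuple[int, ...]]:
    """Build a set of quantized word-width n-grams, one n-gram per window."""
    signature: Set[Tuple[int, ...]] = set()
    for line in group_into_lines(boxes):
        # Dividing by page width makes the signature independent of scale.
        quantized = [round(b[2] / page_width * levels) for b in line]
        for i in range(len(quantized) - n + 1):
            signature.add(tuple(quantized[i:i + n]))
    return signature


def similarity(a: Set[Tuple[int, ...]], b: Set[Tuple[int, ...]]) -> float:
    """Jaccard similarity between two signatures (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

In this sketch a generated image and a captured image of the same page at a different resolution yield the same signature, because widths are expressed as fractions of the page width. A real capture would also need deskewing and noise tolerance, which the sketch leaves out.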
25. A method that facilitates indexing and/or retrieval of a document, comprising:
receiving a captured image of at least a portion of a document; and
searching data store(s) for an electronic document corresponding to the captured image, the search performed via comparing topological word properties within the captured image with topological word properties of generated images corresponding to a plurality of electronic documents.
Dependent claims: 26-32
33. A system for indexing and/or retrieval of a document, comprising:
means for generating an image of an electronic document when the electronic document is printed;
means for capturing an image of the document after the document has been printed;
means for retrieving the electronic document, the means based at least in part upon comparing location and width of words within the captured image to the location and width of words within the generated image.
Dependent claims: 34-37
38. A system that facilitates indexing and/or retrieval of a document, comprising:
a query component that receives an image of a printed document;
a caching component that generates and stores an image corresponding to the image of the document prior to the query component receiving the image of the printed document; and
a comparison component that retrieves the stored image via comparing at least one of location and width of words within the stored image to location and width of words within the image of the printed document.
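The arrangement in claim 38 (a cache filled before the query arrives, then a comparison on word location and width) might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: words are represented as hypothetical (x, y, width) triples normalized to the page width, two words match when all three values agree within a tolerance, and the names LayoutCache and match_score are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical word record: (x, y, width), each divided by the page width.
Word = Tuple[float, float, float]


def match_score(query: List[Word], stored: List[Word], tol: float = 0.01) -> float:
    """Fraction of query words with a stored word at a similar location and width."""
    if not query:
        return 0.0
    hits = 0
    for qx, qy, qw in query:
        if any(abs(qx - sx) <= tol and abs(qy - sy) <= tol and abs(qw - sw) <= tol
               for sx, sy, sw in stored):
            hits += 1
    return hits / len(query)


@dataclass
class LayoutCache:
    """Stores word layouts ahead of time, keyed by a document identifier."""
    layouts: Dict[str, List[Word]] = field(default_factory=dict)

    def store(self, doc_id: str, words: List[Word]) -> None:
        # Called before any query is received (for example, at print time).
        self.layouts[doc_id] = list(words)

    def retrieve(self, query: List[Word], min_score: float = 0.5) -> Optional[str]:
        """Return the best-matching document id, or None if nothing scores high enough."""
        best_id, best_score = None, min_score
        for doc_id, words in self.layouts.items():
            score = match_score(query, words)
            if score >= best_score:
                best_id, best_score = doc_id, score
        return best_id
```

The linear scan over every cached layout keeps the sketch short; a store holding many documents would presumably need an index over the layouts instead.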
39. A computer readable medium having computer executable instructions stored thereon to return stored image(s) of an electronic document to a user based at least in part upon topological word properties of captured image(s) corresponding to the printed document.
40. A computer readable medium having a data structure thereon, the data structure comprising:
a component that receives image(s) of at least a portion of a printed document; and
a search component that facilitates retrieval of an electronic document, the electronic document corresponding to the image(s) of the printed document, the retrieval based at least in part upon similar word-level topological properties when comparing the image(s) of the printed document and generated image(s) of the electronic document.
42. A signal having one or more data packets that facilitate indexing and/or retrieval of a document, comprising:
a request for retrieval of a stored image of at least a portion of an electronic document;
a signature of an electronic image of a printed document corresponding to a signature of the images of the requested stored electronic document, the signatures based at least in part upon word layout of the images; and
a component that facilitates comparison of the signature of the image of the printed document with the signature of the image of the requested stored document.
Specification