Retrieving electronic documents by converting them to synthetic text
Abstract
The present invention relies on the two-dimensional information in documents and encodes two-dimensional structures into a one-dimensional synthetic language so that two-dimensional documents can be searched at text-search speed. The system comprises an indexing module, a retrieval module, an encoder, a quantization module, a retrieval engine, and a control module coupled by a bus. Electronic documents are first indexed by the indexing module and stored as a synthetic text library. The retrieval module then converts an input image to synthetic text and searches the synthetic text library for matches. The matches can in turn be used to retrieve the corresponding electronic documents. In one or more embodiments, the present invention includes a method for comparing the synthetic text of the input image against documents that have been converted to synthetic text to find a match.
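The index-then-search flow described in the abstract can be sketched as an inverted index over synthetic words. This is only an illustration under stated assumptions, not the patented implementation: the function names `build_library` and `search`, the word format, and the vote-counting similarity measure are all hypothetical, and the documents are assumed to have already been converted to lists of synthetic words.

```python
from collections import Counter, defaultdict


def build_library(docs):
    """Build a synthetic text library as an inverted index.

    docs maps a document id to the list of synthetic words produced for
    that document (the conversion itself is assumed to happen elsewhere).
    """
    index = defaultdict(set)
    for doc_id, words in docs.items():
        for word in words:
            index[word].add(doc_id)
    return index


def search(index, query_words, top_k=3):
    """Rank documents by how many distinct query words they share.

    Counting shared words is an assumed, simple similarity measure.
    """
    votes = Counter()
    for word in set(query_words):
        for doc_id in index.get(word, ()):
            votes[doc_id] += 1
    return votes.most_common(top_k)


if __name__ == "__main__":
    library = build_library({
        "doc_a": ["x1y2a0b1", "x3y2a6b2"],
        "doc_b": ["x1y2a0b1", "x9y9a4b4"],
    })
    # Synthetic words derived from an image patch (hypothetical values).
    print(search(library, ["x3y2a6b2", "x1y2a0b1"]))
```

Because the lookup is an ordinary text-index query, its cost depends on the number of query words rather than on image size, which is the sense in which the search runs at text-search speed.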
20 Claims
1. A method for retrieving an electronic document in response to receipt of an image patch, the method comprising:
receiving the image patch;
identifying a first two-dimensional structure in the image patch;
identifying a second two-dimensional structure in the image patch;
identifying a third two-dimensional structure in the image patch;
generating a first text string that encodes the first two-dimensional structure into a first one-dimensional structure by specifying a location of the first two-dimensional structure, a first quantized angle measured between a line originating from the first two-dimensional structure and a path joining a center of the first two-dimensional structure to a center of the second two-dimensional structure and a second quantized angle defining a path joining a center of the second two-dimensional structure to a center of the third two-dimensional structure;
searching, from a library, for a second one-dimensional structure that is similar to the first one-dimensional structure of the image patch; and
retrieving the electronic document corresponding to the second one-dimensional structure similar to the first one-dimensional structure of the image patch from a document storage.
Dependent claims: 2, 3, 4, 5, 6, 7
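The text-string generation step recited in claim 1 can be illustrated with a minimal sketch. Several details here are assumptions that the claim does not fix: the reference line originating from the first structure is taken to be horizontal, the second angle is measured against the same horizontal reference, the location is quantized onto a coarse grid, and the names `quantize_angle` and `encode_structure`, the bin count, the cell size, and the output word format are all hypothetical.

```python
import math


def quantize_angle(angle_rad, bins=8):
    """Map an angle in radians to one of `bins` equal sectors (0..bins-1)."""
    two_pi = 2 * math.pi
    # Normalize to [0, 2*pi); the final modulo guards against rounding up to `bins`.
    return int((angle_rad % two_pi) / two_pi * bins) % bins


def encode_structure(c1, c2, c3, bins=8, cell=10):
    """Encode the structure centered at c1 as a one-dimensional text string.

    c1, c2, c3 are (x, y) centers of the first, second, and third
    two-dimensional structures (for example, word bounding boxes).
    """
    # Location of the first structure, quantized onto an assumed grid.
    gx, gy = int(c1[0] // cell), int(c1[1] // cell)
    # First angle: between an assumed horizontal line from c1 and the path c1 -> c2.
    a1 = math.atan2(c2[1] - c1[1], c2[0] - c1[0])
    # Second angle: direction of the path c2 -> c3, against the same reference.
    a2 = math.atan2(c3[1] - c2[1], c3[0] - c2[0])
    return f"x{gx}y{gy}a{quantize_angle(a1, bins)}b{quantize_angle(a2, bins)}"


if __name__ == "__main__":
    print(encode_structure((12, 25), (22, 28), (25, 38)))
```

Quantizing both the location and the angles means that small perturbations in the detected centers usually yield the same string, so an exact text match can stand in for an approximate geometric match.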
8. A system for retrieving an electronic document in response to receipt of an image patch, the system comprising:
a processor;
an indexing module stored on a memory and executable by the processor, the indexing module receiving the image patch, identifying a first two-dimensional structure in the image patch, identifying a second two-dimensional structure in the image patch and identifying a third two-dimensional structure in the image patch;
an encoder coupled to the indexing module, the encoder generating a first text string that encodes the first two-dimensional structure into a first one-dimensional structure by specifying a location of the first two-dimensional structure, a first quantized angle measured between a line originating from the first two-dimensional structure and a path joining a center of the first two-dimensional structure to a center of the second two-dimensional structure and a second quantized angle defining a path joining a center of the second two-dimensional structure to a center of the third two-dimensional structure;
a retrieval engine coupled to the indexing module, the retrieval engine searching, from a library, for a second one-dimensional structure that is similar to the first one-dimensional structure of the image patch; and
a retrieval module coupled to the indexing module, the retrieval module retrieving the electronic document corresponding to the second one-dimensional structure similar to the first one-dimensional structure of the image patch from a document storage.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for retrieving an electronic document in response to receipt of an image patch comprising a non-transitory computer usable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to:
receive the image patch;
identify a first two-dimensional structure in the image patch;
identify a second two-dimensional structure in the image patch;
identify a third two-dimensional structure in the image patch;
generate a first text string that encodes the first two-dimensional structure into a first one-dimensional structure by specifying a location of the first two-dimensional structure, a first quantized angle measured between a line originating from the first two-dimensional structure and a path joining a center of the first two-dimensional structure to a center of the second two-dimensional structure and a second quantized angle defining a path joining a center of the second two-dimensional structure to a center of the third two-dimensional structure;
search, from a library, for a second one-dimensional structure that is similar to the first one-dimensional structure of the image patch; and
retrieve the electronic document corresponding to the second one-dimensional structure similar to the first one-dimensional structure of the image patch from a document storage.
Dependent claims: 16, 17, 18, 19, 20
Specification