Mixed media reality indexing and retrieval for repeated content
Abstract
A system and method are described for indexing and retrieving document images in an MMR system that contains repeated content. The system provides one or more hierarchical shared content indices that produce faster and/or more accurate search results. The system is also advantageous because the number and configuration of the hierarchical shared content indices are set automatically, and the approach is scalable and efficient for processing documents with partially repeated content. In particular, the MMR matching unit includes a hierarchical shared content index (HSCI) and associated methods of use for processing images where the MMR system includes repeated content. The present invention also includes a number of novel methods: a method for adding an image to a hierarchical shared content index; a method for deleting an image from the hierarchical shared content index; a method for using the hierarchical shared content index for image recognition; and a method for combining multiple MMR indexes into a hierarchical MMR index.
20 Claims
1. An apparatus for use in recognizing documents from an image patch, the apparatus comprising:
- an image recognition unit having an input and an output, the input of the image recognition unit coupled to receive the image patch, the image recognition unit extracting features from the image patch;
- a hierarchical mixed media reality (MMR) index coupled to and communicating with the image recognition unit, the hierarchical MMR index being a tree with a first set of nodes, wherein each node is a flat MMR index, the hierarchical MMR index used to determine document corresponding to the extracted features from a database of documents; and
- a hierarchical shared content index coupled to the image recognition unit, the hierarchical shared content index having a hierarchical description of document pages, the hierarchical shared content index used to produce matching documents from the database of documents by traversing the hierarchical shared content index.
Dependent claims: 2, 3, 4, 5, 6, 7
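As an illustration of the structure recited in claim 1, the following is a minimal sketch of a tree in which every node is itself a flat index. It is not the patented implementation: the names `FlatIndex`, `IndexNode`, and `best_match`, the representation of features as hashable values, and the vote-counting match rule are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FlatIndex:
    """Hypothetical flat index: maps each feature to the documents containing it."""
    postings: dict = field(default_factory=dict)

    def add(self, doc_id, features):
        for feature in features:
            self.postings.setdefault(feature, set()).add(doc_id)

    def vote(self, features):
        # One vote per (query feature, document) hit in this index.
        votes = Counter()
        for feature in features:
            for doc_id in self.postings.get(feature, ()):
                votes[doc_id] += 1
        return votes


@dataclass
class IndexNode:
    """One tree node; following the claim, each node is a flat index."""
    index: FlatIndex = field(default_factory=FlatIndex)
    children: list = field(default_factory=list)

    def query(self, features):
        # Assumed strategy: search this node, then merge votes from the subtree.
        votes = self.index.vote(features)
        for child in self.children:
            votes.update(child.query(features))
        return votes


def best_match(root, features):
    """Return the document with the most feature votes, or None if nothing matched."""
    votes = root.query(features)
    return votes.most_common(1)[0][0] if votes else None
```

In this sketch a query visits every node. A practical system would likely prune subtrees, but the claim does not say how, so no pruning rule is assumed here.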
8. A computer implemented method for recognizing documents from an image patch, the method comprising:
- receiving, with one or more processors, the image patch;
- determining, with the one or more processors, one or more features of the image patch;
- determining, with the one or more processors, the document from a database of documents corresponding to the one or more features;
- determining, with the one or more processors, an internal node in a hierarchical index corresponding to the document, the hierarchical index being a directed acyclic graph in which leaf nodes represent actual document pages that are indexed while internal nodes represent shared content among the actual document pages; and
- producing, with the one or more processors, one or more matching documents from the database of documents by traversing the hierarchical index.
Dependent claims: 9, 10, 11, 12, 13
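The traversal step of claim 8 can be sketched as a walk over a directed acyclic graph whose leaves are indexed pages and whose internal nodes stand for shared content. This is only an illustrative sketch: the `Node` class, the `is_page` flag, the `pages_sharing` function, and the choice to point edges from shared content toward the pages that contain it are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical DAG node: a page (leaf) or a piece of shared content (internal)."""
    name: str
    is_page: bool = False
    children: list = field(default_factory=list)  # nodes that contain this content


def pages_sharing(start):
    """Collect every page reachable from `start` with an iterative depth-first walk."""
    seen = set()
    pages = []
    stack = [start]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue  # a DAG node can be reached by more than one path
        seen.add(id(node))
        if node.is_page:
            pages.append(node.name)
        stack.extend(node.children)
    return sorted(pages)


# Example: a logo appears in a shared header and directly on page3;
# a footer is shared by page2 and page3, so page2 and page3 have two parents.
page1 = Node("page1", is_page=True)
page2 = Node("page2", is_page=True)
page3 = Node("page3", is_page=True)
header = Node("header", children=[page1, page2])
logo = Node("logo", children=[header, page3])
footer = Node("footer", children=[page2, page3])
```

Starting from the internal node matched for a query (for example `logo`), `pages_sharing` returns all pages that contain that shared content, which corresponds to producing the matching documents by traversing the index.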
14. A computer program product comprising a computer usable medium including a non-transitory computer readable program, wherein the computer readable program when executed on a computer causes the computer to:
- receive an image patch;
- determine one or more features of the image patch;
- determine a document from a database of documents corresponding to the one or more features;
- determine an internal node in a hierarchical index corresponding to the document, the hierarchical index being a directed acyclic graph in which leaf nodes represent actual document pages that are indexed while internal nodes represent shared content among the actual document pages; and
- produce one or more matching documents from the database of documents by traversing the hierarchical index.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification