Database for mixed media document system
First Claim
1. A database system for providing mixed media documents, comprising:
one or more processors;
an index table, stored on a memory and accessible by the one or more processors, that stores electronic descriptions of features extracted from paper documents, wherein the features include word bounding boxes, feature location information for the features, and association information for each of the paper documents and locations with a mixed media document that combines printed and digital media;
a feature extraction module, stored on the memory and executable by the one or more processors to:
receive an image patch;
determine word bounding boxes from the image patch by aligning the image patch with a horizontal axis, detecting text lines in the image patch based on the aligned image patch, locating an area within each text line that is above a threshold as a word, and identifying the bounding boxes for words within the text lines;
generate a query from the image patch, at least one query term of the query comprising a two-dimensional geometric relationship between the word bounding boxes determined from the image patch, the two-dimensional geometric relationship specifying one or more of a direction, an angle, a distance between the word bounding boxes determined from the image patch, and geometric shape and contour of the word bounding boxes; and
an accumulator module, stored on the memory and executable by the one or more processors to:
locate at least one mixed media document that contains the word bounding boxes determined from the image patch; and
determine that the at least one mixed media document is a potential match to the query based on determining a two-dimensional geometric relationship between the features stored in the index table, comparing the two-dimensional geometric relationship between the word bounding boxes determined from the image patch with the two-dimensional geometric relationship between the features stored in the index table, computing a matching score for the at least one mixed media document, and returning the at least one mixed media document as a match to the query if the matching score is above a threshold.
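The word bounding box step recited above can be illustrated with a minimal sketch. It assumes the image patch is already aligned with the horizontal axis and binarized into rows of 0/1 pixels (1 = ink), and it uses projection profiles as one plausible reading of "an area within each text line that is above a threshold". The names `runs_above`, `word_bounding_boxes`, `max_gap`, and `example_patch` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive pixel indices.
Box = Tuple[int, int, int, int]


def runs_above(profile: List[int], threshold: int, max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return [start, end) runs where profile > threshold, bridging gaps of up to max_gap."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    # Merge runs separated by small gaps (assumed to be inter-character spacing).
    merged: List[Tuple[int, int]] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)
    return merged


def word_bounding_boxes(patch: List[List[int]], threshold: int = 0, max_gap: int = 1) -> List[Box]:
    """Detect text lines from the row profile, then words from each line's column profile."""
    boxes: List[Box] = []
    row_profile = [sum(row) for row in patch]
    for top, bottom in runs_above(row_profile, threshold):
        line = patch[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs_above(col_profile, threshold, max_gap):
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical 5x10 patch: two words on the first text line, one on the second.
example_patch = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
]
example_boxes = word_bounding_boxes(example_patch)
```

In this sketch the one-pixel gap inside the first word is bridged by `max_gap`, while the wider gap separates it from the second word. A real system would need deskewing and noise handling before this step.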
Abstract
A Mixed Media Reality (MMR) system and associated techniques are disclosed. The MMR system provides mechanisms for forming a mixed media document that includes media of at least two types (e.g., printed paper as a first medium and digital content and/or web link as a second medium). In one particular embodiment, the MMR system includes a content-based retrieval database configured with an index table to represent two-dimensional geometric relationships between objects extracted from a printed document in a way that allows look-up using a text-based index. A ranked set of document, page and location hypotheses can be computed given data from the index table. The techniques effectively transform features detected in an image patch into textual terms (or other searchable features) that represent both the features themselves and the geometric relationship between them. A storage facility can be used to store additional characteristics about each document image patch.
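One way to picture the text-based index described in the abstract is the sketch below. It assumes each word box is summarized by a quantized width, and that a term such as `h:3-5` or `v:3-5` encodes a box, its nearest neighbor to the right or below, and their relative direction. The term format, the class `IndexTable`, and the sample pages are hypothetical illustrations, not the encoding used in the patent.

```python
from collections import Counter, defaultdict
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def width_code(box: Box, bucket: int) -> int:
    """Quantize a box width so that small measurement noise maps to the same code."""
    return (box[2] - box[0]) // bucket


def geometric_terms(boxes: List[Box], bucket: int = 10) -> List[Tuple[str, Box]]:
    """Emit textual terms that capture horizontal and vertical neighbor relationships."""
    terms: List[Tuple[str, Box]] = []
    for a in boxes:
        # Nearest box to the right on the same text line.
        same_line = [b for b in boxes if b[1] == a[1] and b[0] >= a[2]]
        if same_line:
            b = min(same_line, key=lambda box: box[0])
            terms.append(("h:%d-%d" % (width_code(a, bucket), width_code(b, bucket)), a))
        # Nearest box below that overlaps horizontally.
        below = [b for b in boxes if b[1] >= a[3] and b[0] < a[2] and a[0] < b[2]]
        if below:
            b = min(below, key=lambda box: (box[1], box[0]))
            terms.append(("v:%d-%d" % (width_code(a, bucket), width_code(b, bucket)), a))
    return terms


class IndexTable:
    """Inverted index from textual term to (document, page, x, y) postings."""

    def __init__(self) -> None:
        self.postings = defaultdict(list)

    def add_page(self, doc_id: str, page: int, boxes: List[Box]) -> None:
        for term, box in geometric_terms(boxes):
            self.postings[term].append((doc_id, page, box[0], box[1]))

    def rank(self, patch_boxes: List[Box]) -> List[Tuple[Tuple[str, int], int]]:
        """Accumulate one vote per matching posting and rank (document, page) hypotheses."""
        votes: Counter = Counter()
        for term, _ in geometric_terms(patch_boxes):
            for doc_id, page, _x, _y in self.postings.get(term, []):
                votes[(doc_id, page)] += 1
        return votes.most_common()


page_a = [(0, 0, 30, 10), (40, 0, 90, 10), (0, 20, 50, 30)]
page_b = [(0, 0, 20, 10), (30, 0, 50, 10), (0, 20, 70, 30)]
table = IndexTable()
table.add_page("doc-a", 1, page_a)
table.add_page("doc-b", 1, page_b)

# Query patch: the layout of page_a shifted by (5, 3) pixels.
query_boxes = [(5, 3, 35, 13), (45, 3, 95, 13), (5, 23, 55, 33)]
ranked = table.rank(query_boxes)
```

Because the terms depend only on relative layout, the shifted query produces the same terms as `page_a`. The stored x and y positions are kept in the postings so that a fuller version could also vote on location hypotheses.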
544 Citations
30 Claims
1. A database system for providing mixed media documents, comprising:
one or more processors;
an index table, stored on a memory and accessible by the one or more processors, that stores electronic descriptions of features extracted from paper documents, wherein the features include word bounding boxes, feature location information for the features, and association information for each of the paper documents and locations with a mixed media document that combines printed and digital media;
a feature extraction module, stored on the memory and executable by the one or more processors to:
receive an image patch;
determine word bounding boxes from the image patch by aligning the image patch with a horizontal axis, detecting text lines in the image patch based on the aligned image patch, locating an area within each text line that is above a threshold as a word, and identifying the bounding boxes for words within the text lines;
generate a query from the image patch, at least one query term of the query comprising a two-dimensional geometric relationship between the word bounding boxes determined from the image patch, the two-dimensional geometric relationship specifying one or more of a direction, an angle, a distance between the word bounding boxes determined from the image patch, and geometric shape and contour of the word bounding boxes; and
an accumulator module, stored on the memory and executable by the one or more processors to:
locate at least one mixed media document that contains the word bounding boxes determined from the image patch; and
determine that the at least one mixed media document is a potential match to the query based on determining a two-dimensional geometric relationship between the features stored in the index table, comparing the two-dimensional geometric relationship between the word bounding boxes determined from the image patch with the two-dimensional geometric relationship between the features stored in the index table, computing a matching score for the at least one mixed media document, and returning the at least one mixed media document as a match to the query if the matching score is above a threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A computer-implemented method for providing mixed media documents, comprising:
storing, with one or more processors, at an index table, electronic descriptions of features extracted from paper documents and feature location information for the features, wherein the features include word bounding boxes, the index table associating each of the paper documents and locations with a mixed media document that combines printed and digital media;
receiving an image patch;
determining word bounding boxes from the image patch by aligning the image patch with a horizontal axis, detecting text lines in the image patch based on the aligned image patch, locating an area within each text line that is above a threshold as a word, and identifying the bounding boxes for words within the text lines;
generating, with the one or more processors, a query from the image patch, at least one query term of the query comprising a two-dimensional geometric relationship between the word bounding boxes determined from the image patch, the two-dimensional geometric relationship specifying one or more of a direction, an angle, a distance between the word bounding boxes determined from the image patch, and geometric shape and contour of the word bounding boxes;
locating, with the one or more processors, at least one mixed media document that contains the word bounding boxes determined from the image patch; and
determining that the at least one mixed media document is a potential match to the query based on determining a two-dimensional geometric relationship between the features stored in the index table, comparing the two-dimensional geometric relationship between the word bounding boxes determined from the image patch with the two-dimensional geometric relationship between the features stored in the index table, computing a matching score for the at least one mixed media document, and returning the at least one mixed media document as a match to the query if the matching score is above a threshold.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23
24. A non-transitory machine-readable medium encoded with instructions, that when executed by one or more processors, cause the one or more processors to carry out a process for providing mixed media documents, the process comprising:
storing, at an index table, electronic descriptions of features extracted from paper documents and feature location information for the features, wherein the features include word bounding boxes, the index table associating each of the paper documents and locations with a mixed media document that combines printed and digital media;
receiving an image patch;
determining word bounding boxes from the image patch by aligning the image patch with a horizontal axis, detecting text lines in the image patch based on the aligned image patch, locating an area within each text line that is above a threshold as a word, and identifying the bounding boxes for words within the text lines;
generating a query from the image patch, at least one query term of the query comprising a two-dimensional geometric relationship between the word bounding boxes determined from the image patch, the two-dimensional geometric relationship specifying one or more of a direction, an angle, a distance between the word bounding boxes determined from the image patch, and geometric shape and contour of the word bounding boxes;
locating at least one mixed media document that contains the word bounding boxes determined from the image patch; and
determining that the at least one mixed media document is a potential match to the query based on determining a two-dimensional geometric relationship between the features stored in the index table, comparing the two-dimensional geometric relationship between the word bounding boxes determined from the image patch with the two-dimensional geometric relationship between the features stored in the index table, computing a matching score for the at least one mixed media document, and returning the at least one mixed media document as a match to the query if the matching score is above a threshold.
Dependent claims: 25, 26, 27, 28, 29, 30
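The comparison and scoring step shared by the independent claims can be sketched as follows. The sketch assumes a geometric relationship is the (distance, angle) pair between the centers of consecutive word boxes in reading order, and that the matching score is the fraction of query relationships found among the indexed ones within a tolerance. The functions `relations`, `matching_score`, and `is_match`, the tolerances, and the 0.8 threshold are hypothetical choices for illustration only.

```python
import math
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom)
Relation = Tuple[float, float]    # (distance in pixels, angle in degrees)


def center(box: Box) -> Tuple[float, float]:
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def relations(boxes: List[Box]) -> List[Relation]:
    """Distance and angle between centers of consecutive boxes in reading order."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    result: List[Relation] = []
    for a, b in zip(ordered, ordered[1:]):
        (ax, ay), (bx, by) = center(a), center(b)
        dx, dy = bx - ax, by - ay
        # Image coordinates: y grows downward, so positive angles point down the page.
        result.append((math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))))
    return result


def angle_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def matching_score(query_boxes: List[Box], indexed_boxes: List[Box],
                   distance_tol: float = 2.0, angle_tol: float = 5.0) -> float:
    """Fraction of query relationships that have a close counterpart in the index."""
    query_rel = relations(query_boxes)
    indexed_rel = relations(indexed_boxes)
    if not query_rel:
        return 0.0
    matched = sum(
        1 for qd, qa in query_rel
        if any(abs(qd - d) <= distance_tol and angle_difference(qa, a) <= angle_tol
               for d, a in indexed_rel)
    )
    return matched / len(query_rel)


def is_match(score: float, threshold: float = 0.8) -> bool:
    """Return a document as a match only if its score is above the threshold."""
    return score > threshold


indexed_page = [(0, 0, 30, 10), (40, 0, 90, 10), (0, 20, 50, 30)]
other_page = [(0, 0, 20, 10), (30, 0, 50, 10), (0, 20, 70, 30)]
# Query patch: the indexed layout shifted by (5, 3) pixels.
query_patch = [(5, 3, 35, 13), (45, 3, 95, 13), (5, 23, 55, 33)]

score_same = matching_score(query_patch, indexed_page)
score_other = matching_score(query_patch, other_page)
```

Distances and angles between centers are unchanged by translation, so the shifted patch scores fully against `indexed_page` and not at all against `other_page`. Handling scale or rotation differences would need normalization that this sketch leaves out.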
Specification