Locally optimized feature space encoding of digital data and retrieval using such encoding
Abstract
A digital document is represented as a set of codes comprising indices into a feature space comprising a number of subspaces, each code corresponding to one subspace and identifying a cell within that subspace. Each digital document can be represented by a code set, and the code set can be used as selection criteria for identifying a number of digital documents using each digital document's corresponding code set. By way of some non-limiting examples, digital document code sets can be used to identify similar or different digital images, to identify duplicate or nearly-duplicate digital images, to identify similar and/or different digital images for inclusion in a recommendation, and to identify and rank digital images in a set of search results.
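The encoding described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes equal-length contiguous segments, squared Euclidean distance as the similarity measure, and a hand-written set of cell centroids. The names `split_into_segments`, `squared_distance`, `encode`, and `CODEBOOKS` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence

Vector = Sequence[float]


def split_into_segments(features: Vector, num_subspaces: int) -> List[List[float]]:
    """Partition a feature vector into equal-length contiguous segments (an assumption)."""
    if len(features) % num_subspaces != 0:
        raise ValueError("feature length must be divisible by num_subspaces")
    size = len(features) // num_subspaces
    return [list(features[i * size:(i + 1) * size]) for i in range(num_subspaces)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; the similarity measure is assumed, not taken from the source."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(features: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Return one code per subspace: the index of the nearest cell in that subspace."""
    segments = split_into_segments(features, len(codebooks))
    codes = []
    for segment, cells in zip(segments, codebooks):
        nearest = min(range(len(cells)), key=lambda j: squared_distance(segment, cells[j]))
        codes.append(nearest)
    return codes


# Hypothetical codebooks: 2 subspaces, each with 3 two-dimensional cells.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)],
    [(0.0, 1.0), (2.0, 2.0), (9.0, 9.0)],
]

code_set = encode([0.9, 1.2, 8.0, 8.5], CODEBOOKS)
print(code_set)  # [1, 2]
```

Each entry of `code_set` is an index into one subspace, so a document with a four-dimensional feature vector is stored here as just two small integers.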
24 Claims
1. A method comprising:
generating, by a computing device, a feature vector for a digital document, the generated feature vector comprising a plurality of features of the digital document;
partitioning, by the computing device, the digital document's feature vector into a plurality of feature vector segments, each feature vector segment corresponding to a subspace of a plurality of subspaces of a feature space, each subspace comprising a plurality of cells;
generating, by the computing device, a code vector for the digital document, each code in the digital document's code vector corresponding to a respective subspace of the plurality of subspaces of the feature space and identifying, using a corresponding feature vector segment, a cell of the respective subspace's plurality of cells with which the digital document is most similar relative to other cells of the plurality in the respective subspace; and
making a determination, by the computing device, whether to select the digital document, in response to a digital document request, the determination comprising determining a score for the digital document by comparing the digital document's code vector and a code vector associated with the digital document request to determine a number of matches and making the determination using the determined score.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
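The final step of claim 1, scoring a document by the number of matching codes, can be sketched as follows. This is an illustration under stated assumptions rather than the claimed method itself: the claim says only that the determination uses the score, so the `min_matches` threshold, the best-first ordering, and the names `match_score`, `select_documents`, `INDEX`, and `REQUEST` are all hypothetical.

```python
from typing import Dict, List, Sequence


def match_score(doc_codes: Sequence[int], request_codes: Sequence[int]) -> int:
    """Count the subspaces in which both code vectors identify the same cell."""
    if len(doc_codes) != len(request_codes):
        raise ValueError("code vectors must cover the same subspaces")
    return sum(1 for d, r in zip(doc_codes, request_codes) if d == r)


def select_documents(
    index: Dict[str, Sequence[int]],
    request_codes: Sequence[int],
    min_matches: int,
) -> List[str]:
    """Return ids whose score meets the assumed threshold, highest score first."""
    scored = [(match_score(codes, request_codes), doc_id) for doc_id, codes in index.items()]
    kept = [(score, doc_id) for score, doc_id in scored if score >= min_matches]
    # Sort by descending score, then by id so ties are ordered deterministically.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in kept]


# Hypothetical index of precomputed code vectors (4 subspaces per document).
INDEX = {
    "img_a": [1, 2, 0, 3],
    "img_b": [1, 2, 0, 1],
    "img_c": [0, 1, 2, 2],
}
REQUEST = [1, 2, 0, 3]

print(select_documents(INDEX, REQUEST, min_matches=3))  # ['img_a', 'img_b']
```

Because the comparison is an equality test per subspace, the score needs no floating-point distance computation at query time; a stricter threshold would suit duplicate detection, a looser one similarity search.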
9. A system comprising:
at least one computing device, each computing device comprising a processor and a storage medium for tangibly storing thereon program logic for execution by the processor, the stored program logic comprising:
generating logic executed by the processor for generating a feature vector for a digital document, the generated feature vector comprising a plurality of features of the digital document;
partitioning logic executed by the processor for partitioning the digital document's feature vector into a plurality of feature vector segments, each feature vector segment corresponding to a subspace of a plurality of subspaces of a feature space, each subspace comprising a plurality of cells;
generating logic executed by the processor for generating a code vector for the digital document, each code in the digital document's code vector corresponding to a respective subspace of the plurality of subspaces of the feature space and identifying, using a corresponding feature vector segment, a cell of the respective subspace's plurality of cells with which the digital document is most similar relative to other cells of the plurality in the respective subspace; and
making logic executed by the processor for making a determination whether to select the digital document, in response to a digital document request, the determination comprising determining a score for the digital document by comparing the digital document's code vector and a code vector associated with the digital document request to determine a number of matches and making the determination using the determined score.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer readable non-transitory storage medium for tangibly storing thereon computer readable instructions that when executed cause at least one processor to:
generate a feature vector for a digital document, the generated feature vector comprising a plurality of features of the digital document;
partition the digital document's feature vector into a plurality of feature vector segments, each feature vector segment corresponding to a subspace of a plurality of subspaces of a feature space, each subspace comprising a plurality of cells;
generate a code vector for the digital document, each code in the digital document's code vector corresponding to a respective subspace of the plurality of subspaces of the feature space and identifying, using a corresponding feature vector segment, a cell of the respective subspace's plurality of cells with which the digital document is most similar relative to other cells of the plurality in the respective subspace; and
make a determination whether to select the digital document, in response to a digital document request, the determination comprising determining a score for the digital document by comparing the digital document's code vector and a code vector associated with the digital document request to determine a number of matches and making the determination using the determined score.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
Specification