Representation and retrieval of images using context vectors derived from image information elements
Abstract
Image features are generated by performing wavelet transformations at sample points on images stored in electronic form. Multiple wavelet transformations at a point are combined to form an image feature vector. A prototypical set of feature vectors, or atoms, is derived from the set of feature vectors to form an “atomic vocabulary.” The prototypical feature vectors are derived using a vector quantization method (e.g., using neural network self-organization techniques) in which a vector quantization network is also generated. The atomic vocabulary is used to define new images. Meaning is established between atoms in the atomic vocabulary. High-dimensional context vectors are assigned to each atom. The context vectors are then trained as a function of the proximity and co-occurrence of each atom to other atoms in the image. After training, the context vectors associated with the atoms that comprise an image are combined to form a summary vector for the image. Images are retrieved using a number of query methods (e.g., images, image portions, vocabulary atoms, index terms). The user's query is converted into a query context vector. A dot product is calculated between the query vector and the summary vectors to locate images having the closest meaning. The invention is also applicable to video or temporally related images, and can also be used in conjunction with other context vector data domains such as text or audio, thereby linking images to such data domains.
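The step from feature vectors to a summary vector can be illustrated with a minimal sketch. It assumes that an image feature is mapped to its nearest atom by squared Euclidean distance and that the atom context vectors are combined by summation followed by normalization; the abstract says only that the vectors are "combined", so both choices are assumptions. The function names and toy data are hypothetical.

```python
import math

def nearest_atom(feature, atoms):
    """Return the index of the prototype (atom) closest to `feature`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(atoms)), key=lambda i: sq_dist(feature, atoms[i]))

def normalize(v):
    """Scale `v` to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def summary_vector(features, atoms, context_vectors):
    """Sum the context vectors of the atoms an image maps to, then normalize.

    Summation is an assumption; the source only says the vectors are combined.
    """
    total = [0.0] * len(context_vectors[0])
    for f in features:
        cv = context_vectors[nearest_atom(f, atoms)]
        total = [t + c for t, c in zip(total, cv)]
    return normalize(total)

# Hypothetical toy data: 3 atoms in a 2-D feature space, 3-D context vectors.
atoms = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
context_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_features = [[0.9, 0.1], [0.1, 0.8], [1.1, -0.1]]
summary = summary_vector(image_features, atoms, context_vectors)
```

In this toy case two features fall on atom 1 and one on atom 2, so the summary vector points mostly along the second axis. Real context vectors would be high-dimensional and trained rather than axis-aligned.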
22 Claims
-
1. A computer-implemented method of retrieving images, the method comprising:
-
receiving a text query;
generating a query context vector derived from a context vector of at least one word included in the text query;
comparing the query context vector with a plurality of summary vectors, each summary vector associated with image features derived from image data of one of the plurality of images; and
retrieving at least one image having a summary vector similar to the query context vector;
wherein comparing the query context vector comprises;
computing a dot product between the query context vector and the summary vector of each of a plurality of images and sorting said each of a plurality of images by said computed dot product whereby images associated with summary vectors producing high dot products are retrieved.
retrieving at least one summary vector with an orientation in a vector space similar to an orientation of the query context vector in the vector space.
-
-
4. The computer-implemented method of claim 3, wherein the orientation of a vector in the vector space is indicative of the meaning of the word or image with which the vector is associated, such that words or images having similar meaning are associated with vectors having similar orientations in the vector space.
-
5. The computer-implemented method of claim 3, wherein the orientation of a vector in the vector space is determined by the frequency of proximal co-occurrences of words or image features in a corpus of records, such that words or image features that frequently proximally co-occur are associated with vectors having similar orientations in the vector space.
-
6. The computer-implemented method of claim 1, wherein the image features are derived from the images by wavelet transformations.
-
7. The computer-implemented method of claim 1, wherein the image features are coefficients of wavelet transformations on the images.
-
8. The computer-implemented method of claim 1, wherein the summary vectors are oriented in a vector space, the axes of the vector space not associated with selected words or image features.
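As an illustration of the comparison step recited in claim 1, the following sketch builds a query context vector from word context vectors, computes a dot product against each image's summary vector, and sorts the images by that score. Summing the word vectors is an assumption (the claim says only "derived from"), and the word vectors, image identifiers, and function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def query_context_vector(query, word_vectors):
    """Sum the context vectors of the query words that have one (assumed rule)."""
    known = [word_vectors[w] for w in query.lower().split() if w in word_vectors]
    if not known:
        raise ValueError("no query word has a context vector")
    return [sum(col) for col in zip(*known)]

def rank_images(query_vec, summaries):
    """Return (image_id, score) pairs sorted by descending dot product."""
    scored = [(img, dot(query_vec, vec)) for img, vec in summaries.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical word context vectors and image summary vectors.
word_vectors = {"beach": [0.9, 0.1, 0.0], "sunset": [0.6, 0.0, 0.8]}
summaries = {
    "img_forest": [0.0, 1.0, 0.0],
    "img_shore": [1.0, 0.0, 0.0],
    "img_dusk": [0.6, 0.0, 0.8],
}
ranking = rank_images(query_context_vector("beach sunset", word_vectors), summaries)
```

With unit-length summary vectors the dot product depends only on the angle between query and image, which matches the "similar orientation" language of the dependent claims.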
-
9. A computer-implemented process of retrieving images, the process comprising:
-
receiving a query image;
deriving at least one feature vector from image data of the query image;
generating a query context vector from one or more context vectors associated with the at least one feature vector;
comparing the query context vector with a plurality of summary vectors, each summary vector derived from context vectors associated with feature vectors derived from image data of one of the plurality of images; and
retrieving at least one image having a summary vector similar to the query context vector;
wherein comparing the query context vector comprises;
computing a dot product between the query context vector and the summary vector of each of a plurality of images and sorting said each of a plurality of images by said computed dot product whereby images associated with summary vectors producing high dot products are retrieved.
retrieving at least one summary vector with an orientation in a vector space similar to an orientation of the query context vector in the vector space.
-
-
12. The computer-implemented method of claim 11, wherein the orientation of a vector in the vector space is indicative of the meaning of the word or image with which the vector is associated, such that words or images having similar meaning are associated with vectors having similar orientations in the vector space.
-
13. The computer-implemented method of claim 11, wherein the orientation of a vector in the vector space is determined by the frequency of proximal co-occurrences of words or image features in a corpus of records, such that words or image features that frequently proximally co-occur are associated with vectors having similar orientations in the vector space.
-
14. The computer-implemented method of claim 9, wherein the image features are derived from the images by wavelet transformations.
-
15. The computer-implemented method of claim 9, wherein the image features are coefficients of wavelet transformations on the images.
-
16. The computer-implemented method of claim 9, wherein the summary vectors are oriented in a vector space, the axes of the vector space not associated with selected image features.
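To make "coefficients of wavelet transformations" concrete, here is a small sketch that turns rows of pixel values into a feature vector. The text above does not name a wavelet family, so the Haar wavelet is used purely as a stand-in, with unnormalized averages and half-differences; the function names and the sample patch are hypothetical.

```python
def haar_1d(signal):
    """Full Haar decomposition of a list whose length is a power of two.

    Each pass replaces pairs of values with their average and half-difference,
    then repeats on the averages. Haar is an illustrative choice only.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(x) for x in signal]
    while n > 1:
        half = n // 2
        avgs = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        diffs = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = avgs + diffs
        n = half
    return coeffs

def feature_vector(pixel_rows):
    """Concatenate the Haar coefficients of each pixel row into one vector."""
    feature = []
    for row in pixel_rows:
        feature.extend(haar_1d(row))
    return feature

# Hypothetical 2x4 patch of grey levels around a sample point.
patch = [[4, 2, 6, 8], [1, 1, 1, 1]]
feature = feature_vector(patch)
```

The first coefficient of each row is its mean and the remaining ones capture detail at successively finer scales, so a flat row such as the second one yields zeros everywhere except the mean.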
-
17. A computer-implemented process of training context vectors for images within documents, the process comprising:
-
for each of a plurality of images, generating a plurality of feature vectors from image data of the image;
for each image, associating each of the image's feature vectors with a context vector; and
for each image, aligning each of the context vectors of the image using a context vector of at least one word included in a document containing the image;
wherein aligning each of the context vectors of the image comprises;
adjusting the image context vector to be more similar to the summary vector of the at least one word included in the document containing the image; and
comparing the image context vector with the summary vector by computing a dot product between the query context vector and the summary vector.
initializing the context vectors to be substantially orthogonal to each other.
-
-
19. The computer-implemented process of claim 17, wherein the context vectors are oriented in a vector space, the axes of the vector space not associated with selected words or image features.
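The alignment recited in claim 17 can be sketched as follows. The sketch assumes random Gaussian initialization (random directions in a high-dimensional space are nearly orthogonal, which is one way to obtain "substantially orthogonal" vectors) and a simple interpolate-then-renormalize update; neither the update rule nor the learning rate comes from the claims, and all names are hypothetical. The dot product serves as the similarity measure before and after training.

```python
import math
import random

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale `v` to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def random_unit_vector(dim, rng):
    """Random Gaussian direction, nearly orthogonal to others when dim is large."""
    return normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])

def align(image_vec, target_vec, rate=0.1):
    """Move image_vec a fraction `rate` toward target_vec, then renormalize.

    This update rule is an assumed illustration, not the claimed procedure.
    """
    moved = [(1 - rate) * i + rate * t for i, t in zip(image_vec, target_vec)]
    return normalize(moved)

rng = random.Random(0)
dim = 512
image_vec = random_unit_vector(dim, rng)
word_summary = random_unit_vector(dim, rng)  # stands in for the word summary vector

before = dot(image_vec, word_summary)
for _ in range(20):
    image_vec = align(image_vec, word_summary)
after = dot(image_vec, word_summary)
```

Here `before` is close to zero because the two vectors start out nearly orthogonal, and `after` approaches one as the image context vector rotates toward the summary vector of the surrounding words.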
-
20. A computer-implemented process of training context vectors for images within documents, the process comprising:
-
providing a plurality of word context vectors, each context vector having an orientation in a vector space, such that words having similar meaning have context vectors with similar orientations in the vector space;
providing a plurality of image context vectors, each image context vector associated with a feature vector, each feature vector derived from image data of at least one image, each image context vector having an orientation in the vector space; and
for each document containing an image, aligning the image context vectors associated with the feature vectors derived from the image, with a summary vector derived from context vectors of selected words contained in the document;
wherein aligning the image context vectors with a summary vector derived from context vectors of selected words contained in the document comprises;
adjusting the image context vector to be more similar to the summary vector of the selected words included in the document; and
comparing the image context vectors with the summary vector by computing a dot product between the query context vectors and the summary vector.
-
-
22. A computer-implemented method of retrieving records having different media types, the method comprising:
providing a plurality of first records, each first record having a first media type;
for each of the first record having the first media type, deriving from elements of the first record a context vector, the context vector having an orientation in a vector space;
providing a plurality of second records, each second record having a second media type;
for each of the second record having the second media type, deriving from elements of the second record a context vector, the context vector having an orientation in the vector space;
receiving a query, comprising at least one element of the first media type;
deriving a query context vector from the query; and
retrieving at least one second record having a context vector similar to the query context vector;
wherein comparing the query vector with said context vector comprises computing a dot product between the query context vectors and the summary vector.
Specification