Generating search requests from multimodal queries
First Claim
1. A method in a device for generating a search request for a multimodal query with a query image and query text, the query image being stored in electronic form, the method comprising:
providing a collection of images and associated words;
generating a word-to-image index that maps words to associated images and an image-to-related-information index that maps images to associated keywords;
receiving a multimodal query that includes a query image and query text;
identifying images of the collection based on textual relatedness between an associated word and the query text;
wherein the images are identified by searching the word-to-image index to locate images with associated words that are related to the query text;
selecting images of the identified images based on visual relatedness between an identified image and the query image, wherein the selecting comprising extracting a feature vector for the identified image, determining the distance between the extracted feature vector and the feature vector of each image of the collection, and selecting the images based on the determined distance;
generating a search request based on keywords associated with the selected images, wherein the search request is generated from keywords of the image-to-related-information index;
submitting the generated search request to a search engine for identifying documents related to the multimodal query; and
providing an indication of the identified documents as a search result for the multimodal query.
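The query-time steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the toy collection, the function names, the Euclidean distance, the `max_distance` threshold, and the frequency-based keyword choice are all assumptions made for illustration. The sketch compares the query image's feature vector with the vector of each textually identified image, and it stops at building the search request string (submission to a search engine is omitted).

```python
import math
from collections import Counter

# Hypothetical toy collection: each image has associated words,
# keywords, and a precomputed feature vector (all values invented).
COLLECTION = {
    "img1": {"words": {"red", "tulip"}, "keywords": ["tulip", "flower", "bulb"],
             "features": [0.9, 0.1, 0.0]},
    "img2": {"words": {"red", "car"}, "keywords": ["car", "sedan"],
             "features": [0.1, 0.8, 0.3]},
    "img3": {"words": {"tulip", "garden"}, "keywords": ["tulip", "garden", "flower"],
             "features": [0.8, 0.2, 0.1]},
    "img4": {"words": {"yellow", "rose"}, "keywords": ["rose", "flower"],
             "features": [0.85, 0.15, 0.05]},
}


def build_indexes(collection):
    """Build the word-to-image and image-to-keywords mappings."""
    word_to_image = {}
    image_to_keywords = {}
    for image_id, entry in collection.items():
        for word in entry["words"]:
            word_to_image.setdefault(word, set()).add(image_id)
        image_to_keywords[image_id] = list(entry["keywords"])
    return word_to_image, image_to_keywords


def identify_by_text(word_to_image, query_text):
    """Textual step: images whose associated words match a query word."""
    identified = set()
    for word in query_text.lower().split():
        identified |= word_to_image.get(word, set())
    return identified


def select_by_visual(collection, identified, query_features, max_distance):
    """Visual step: keep identified images close to the query image."""
    selected = []
    for image_id in sorted(identified):
        distance = math.dist(collection[image_id]["features"], query_features)
        if distance <= max_distance:
            selected.append(image_id)
    return selected


def generate_search_request(image_to_keywords, selected, top_n=2):
    """Join the most frequent keywords of the selected images."""
    counts = Counter()
    for image_id in selected:
        counts.update(image_to_keywords[image_id])
    # Sort by descending count, then alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(keyword for keyword, _ in ranked[:top_n])


def multimodal_search_request(query_text, query_features, max_distance=0.2):
    word_to_image, image_to_keywords = build_indexes(COLLECTION)
    identified = identify_by_text(word_to_image, query_text)
    selected = select_by_visual(COLLECTION, identified, query_features, max_distance)
    return selected, generate_search_request(image_to_keywords, selected)
```

In this sketch, `img4` is visually close to a tulip-like query vector but is never considered for the query text "tulip", because the textual step narrows the candidates before any distance is computed.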
Abstract
A method and system for generating a search request from a multimodal query that includes a query image and query text is provided. The multimodal query system identifies images of a collection that are textually related to the query image based on similarity between words associated with each image and the query text. The multimodal query system then selects those images of the identified images that are visually related to the query image. The multimodal query system may formulate a search request based on keywords of web pages that contain the selected images and submit that search request to a search engine service.
77 Citations
11 Claims
9. A computer-readable storage medium containing instructions for controlling a computer system to find images related to a multimodal query, the instructions perform a method comprising:
providing web pages with images;
generating a word-to-image index that maps words to associated images and an image-to-related-information index that maps images to associated keywords from the web pages with images;
receiving a query image and query text of the multimodal query;
identifying images of the web pages based on textual relatedness between a web page and the query text;
wherein the images are identified by searching the word-to-image index to locate images with associated words that are related to the query text;
selecting images of the identified images of the web pages based on visual relatedness between an identified image and the query image, wherein the selecting comprising extracting a feature vector for the identified image, determining the distance between the extracted feature vector and the feature vector of each image of the collection, and selecting the images based on the determined distance;
generating a search request based on keywords associated with the selected images, wherein the search request is generated from keywords of the image-to-related-information index;
submitting the generated search request to a search engine for identifying documents related to the multimodal query; and
providing an indication of the identified documents as a search result for the multimodal query.
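Claim 9 derives both indexes from web pages that contain images. The following is a minimal sketch of that indexing step only, under stated assumptions: the claim does not say how words or keywords are extracted, so the use of page text and `alt` text as associated words, the stop-word list, the frequency-based keyword choice, and the names `PageImageParser` and `index_pages` are all hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumed, deliberately tiny stop-word list for the illustration.
STOP_WORDS = {"a", "the", "of", "in", "and", "is", "are"}


class PageImageParser(HTMLParser):
    """Collect image sources, their alt text, and the page's visible text."""

    def __init__(self):
        super().__init__()
        self.images = []      # list of (src, alt) pairs
        self.text_parts = []  # text nodes in document order

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if attributes.get("src"):
                self.images.append((attributes["src"], attributes.get("alt") or ""))

    def handle_data(self, data):
        self.text_parts.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def index_pages(pages, keywords_per_image=3):
    """Build word-to-image and image-to-keywords indexes from HTML strings."""
    word_to_image = {}
    image_to_keywords = {}
    for html in pages:
        parser = PageImageParser()
        parser.feed(html)
        page_words = [w for w in tokenize(" ".join(parser.text_parts))
                      if w not in STOP_WORDS]
        # Assumption: a page's keywords are its most frequent words.
        ranked = sorted(Counter(page_words).items(), key=lambda kv: (-kv[1], kv[0]))
        keywords = [word for word, _ in ranked[:keywords_per_image]]
        for src, alt in parser.images:
            # Assumption: associated words are the alt text plus page text.
            for word in set(tokenize(alt)) | set(page_words):
                word_to_image.setdefault(word, set()).add(src)
            image_to_keywords[src] = list(keywords)
    return word_to_image, image_to_keywords


# Hypothetical sample page used only to exercise the sketch.
SAMPLE_PAGES = [
    '<html><body><h1>Tulip bulbs</h1>'
    '<img src="tulip.jpg" alt="red tulip">'
    '<p>Plant tulip bulbs in autumn.</p></body></html>'
]
```

Calling `index_pages(SAMPLE_PAGES)` returns the two mappings; a query-time component would then search the first for words related to the query text and read keywords for the selected images from the second.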
Specification