SCORING RELEVANCE OF A DOCUMENT BASED ON IMAGE TEXT
Abstract
A method and system for determining relevance of a document having text and images to a text string are provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, document classification, image search, and image classification.
20 Claims
1. A computing device for determining relevance of a document to a text string, the document containing text and images, the text and the text string having terms, comprising:
    a memory storing computer-executable instructions implementing
        a component that, for each of a plurality of images contained in the document, identifies text associated with the image by extracting text from the document that is determined to be adjacent to the image, the extracted text being the identified text that is associated with the image;
        a component that, for each of the plurality of images, determines relevance of the identified text associated with the image to the text string based on comparison of terms of the identified text to terms of the text string; and
        a component that determines relevance of the document to the text string based on the determined relevance of the identified text associated with each of the plurality of the images to the text string; and
    a processor for executing the instructions stored in the memory.
    Dependent claims: 2, 3, 4, 5, 6, 7, 8
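The three components recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the document is HTML, treats "adjacent" as the nearest text node before and after each `<img>` tag in document order, scores an image by the fraction of text-string terms found in its adjacent text, and averages the per-image scores to obtain the document score. The names `AdjacentTextExtractor`, `image_relevance`, and `document_relevance` are hypothetical.

```python
import re
from html.parser import HTMLParser


def terms(text):
    """Return the set of lowercased alphanumeric terms in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class AdjacentTextExtractor(HTMLParser):
    """Collect, for each <img>, the text node just before and just after it.

    Assumption: adjacency means nearest text in document order, not layout.
    """

    def __init__(self):
        super().__init__()
        self.image_texts = []   # one list of text chunks per image
        self._last_text = ""    # most recent non-empty text node seen
        self._awaiting = []     # images still waiting for following text

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            chunks = [self._last_text] if self._last_text else []
            self.image_texts.append(chunks)
            self._awaiting.append(len(self.image_texts) - 1)

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # This text node follows every image that has not yet seen one.
        for index in self._awaiting:
            self.image_texts[index].append(text)
        self._awaiting.clear()
        self._last_text = text


def image_relevance(image_text, text_string):
    """Fraction of text-string terms that appear in the image text."""
    query_terms = terms(text_string)
    if not query_terms:
        return 0.0
    return len(query_terms & terms(image_text)) / len(query_terms)


def document_relevance(html, text_string):
    """Average the per-image relevance scores (the mean is an assumption)."""
    parser = AdjacentTextExtractor()
    parser.feed(html)
    parser.close()
    scores = [image_relevance(" ".join(chunks), text_string)
              for chunks in parser.image_texts]
    return sum(scores) / len(scores) if scores else 0.0
```

The claim does not fix how adjacency is determined or how per-image scores are combined, so a real system could substitute layout-based adjacency or a weighted combination without changing the overall structure.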
9. A computer-readable storage device containing computer-executable instructions for controlling a computing device to determine relevance of a document to a text string, the document containing text and an image, the text and the text string having terms, by a method comprising:
    identifying image text of the document, the image text of the document being a subset of text of the document that is determined to be associated with the image;
    a component that, determines relevance of the image text to the text string based on comparison of terms of the image text to terms of the text string; and
    a component that determines relevance of the document to the text string based on the determined relevance of the image text to the text string.
    Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A method in a computing device for determining relevance of web pages to queries, the web pages having text and images, the method comprising:
    for images of web pages, identifying image text associated with the image by extracting from the web page having the image text that is determined to be associated with the image;
    receiving a query;
    generating a search result of web pages for the received query;
    for each web page in the generated search result, determining relevance of image text associated with an image of the web page to the query; and
    ranking the web pages of the generated search result based on the determined relevance of the image text of the web pages to the query.
    Dependent claims: 18, 19, 20
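The ranking step of claim 17 can be sketched as follows. This is an illustration under stated assumptions, not the method of the patent. It assumes the image text of each web page has already been extracted and the search result has already been generated, scores a page by its best-matching image, and uses term overlap as the relevance measure. The `WebPage` type and the functions `image_text_relevance` and `rank_by_image_text` are hypothetical names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class WebPage:
    url: str
    # Pre-extracted image text, one string per image (assumed available).
    image_texts: list[str] = field(default_factory=list)


def terms(text):
    """Return the set of lowercased alphanumeric terms in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def image_text_relevance(page, query):
    """Best fraction of query terms matched by any one image's text."""
    query_terms = terms(query)
    if not query_terms or not page.image_texts:
        return 0.0
    return max(len(query_terms & terms(text)) / len(query_terms)
               for text in page.image_texts)


def rank_by_image_text(search_result, query):
    """Order pages by descending image-text relevance.

    sorted() is stable, so pages with equal scores keep their
    original order from the generated search result.
    """
    return sorted(search_result,
                  key=lambda page: image_text_relevance(page, query),
                  reverse=True)
```

Because the sort is stable, this kind of re-ranking preserves the original ordering of the search result among pages whose image text is equally relevant, or that have no images at all.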
Specification