Identifying key images in a document in correspondence to document text
Abstract
A computer-implemented method and system for identifying key images in a document are provided. The operations include extracting from the document one or more document keywords considered important in describing the document; collecting one or more images associated with the document, including information describing each image; generating, for each collected image and each document keyword, a proximity factor that reflects the degree of correlation between the image and the document keyword; and determining the importance of each image according to an image metric that combines the proximity factors for each document keyword and image pair. The operations may also include ordering the document keywords according to an ordering criterion and weighting the proximity factor associated with each document keyword and image pair based on the order of the document keyword.
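The operations summarized in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: keywords are chosen by simple term frequency, an image's location is a character offset into the text, the proximity factor is taken to be 1 / (1 + distance to the nearest occurrence of the keyword), the ordering criterion is frequency rank, and the image metric is a rank-weighted sum. The function names (extract_keywords, proximity_factor, rank_images) and the stopword list are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"}


def extract_keywords(text, k=3):
    """Return the k most frequent non-stopword terms, most frequent first.

    Assumption: term frequency stands in for "importance in describing
    the document"; the source does not fix an extraction method.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(k)]


def proximity_factor(text, keyword, image_offset):
    """Assumed proximity measure: 1 / (1 + characters to nearest occurrence)."""
    pattern = rf"\b{re.escape(keyword)}\b"
    positions = [m.start() for m in re.finditer(pattern, text.lower())]
    if not positions:
        return 0.0  # keyword never appears, so no correlation is assumed
    nearest = min(abs(p - image_offset) for p in positions)
    return 1.0 / (1.0 + nearest)


def rank_images(text, images, k=3):
    """Score each image and return (name, score) pairs, highest score first.

    images maps an image name to its character offset in the text.
    The image metric here is a sum of proximity factors, each weighted by
    1 / (rank + 1) so that higher-ordered keywords count for more.
    """
    keywords = extract_keywords(text, k)
    scores = {}
    for name, offset in images.items():
        scores[name] = sum(
            proximity_factor(text, kw, offset) / (rank + 1)
            for rank, kw in enumerate(keywords)
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = (
        "The solar panel converts light. A solar panel array powers "
        "the pump. The pump moves water."
    )
    for name, score in rank_images(sample, {"near": 4, "far": 10000}):
        print(name, round(score, 4))
```

In this sketch an image placed next to frequent keywords scores higher than one placed far from them. Other proximity measures (for example, caption text overlap or section membership) and other combining functions would fit the same structure.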
26 Claims
1. A computer-implemented method, comprising:
receiving a document comprising text and a plurality of images, each of the plurality of images having a location in the document;
extracting one or more document keywords from the document;
generating a proximity factor for each pair of one of the plurality of images and one of the document keywords, the proximity factor reflecting a degree of correlation between the image and the document keyword of the pair; and
determining the importance of each of the plurality of images according to an image metric that combines the proximity factors for each document keyword and image pair.
13. An apparatus comprising:
means for receiving a document comprising text and a plurality of images, each of the plurality of images having a location in the document;
means for extracting one or more document keywords from the document;
means for generating a proximity factor for each pair of one of the plurality of images and one of the document keywords, the proximity factor reflecting a degree of correlation between the image and the document keyword of the pair; and
means for determining the importance of each of the plurality of images according to an image metric that combines the proximity factors for each document keyword and image pair.
15. A computer program product, tangibly embodied in a machine-readable storage device, the product comprising instructions operable to cause a computer to:
receive a document comprising text and a plurality of images, each of the plurality of images having a location in the document;
extract one or more document keywords from the document;
generate a proximity factor for each pair of one of the plurality of images and one of the document keywords, the proximity factor reflecting a degree of correlation between the image and the document keyword of the pair; and
determine the importance of each of the plurality of images according to an image metric that combines the proximity factors for each document keyword and image pair.
Specification