Systems and methods for identifying semantically and visually related content
First Claim
1. A method, performed by a computing system having a memory and a processor, the method comprising:
receiving an indication of a plurality of content items;
for each of the plurality of content items, identifying at least one content element of the content item, and for each identified content element of the content item, extracting information for the content element, computing a plurality of semantic feature values for the content element based at least in part on the extracted information, computing a plurality of visual feature values for the content element based at least in part on the extracted information, and storing the feature values computed, for the content element, based at least in part on the extracted information;
receiving an indication of a first set of content items, wherein each content item of the first set of content items is a member of the plurality of content items;
for each pair of content items from among the first set of content items, applying a similarity function to the pair of content items to generate a similarity value, and storing, in association with each content item of the pair of content items, the generated similarity value;
for each content item of the first set of content items, identifying similar content items based at least in part on the generated similarity values, and storing, in association with the content item, references to at least one of the identified similar content items.
1 Assignment
0 Petitions
Abstract
Systems and methods for identifying semantically and/or visually related information among a set of content items, such as content items that include similar concepts or that have similar visual aspects, are disclosed. The disclosed techniques provide tools for identifying related information among various content items, such as text pages and documents, presentation slides and slide decks, etc. The disclosed techniques provide improved methods for searching among content items, organizing content items into categories, and pruning redundant content. Furthermore, the disclosed techniques provide improvements to computation of various metrics, including usage, performance, and impact metrics.
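As an illustration of the "pruning redundant content" use mentioned in the abstract, the following is a minimal sketch and not the disclosed method. It assumes that a similarity value in the range 0 to 1 has already been stored for each pair of content items. The function name `prune_redundant`, the 0.9 threshold, and the item identifiers are all hypothetical.

```python
def prune_redundant(item_ids, similarity, threshold=0.9):
    """Greedy pruning sketch: keep an item unless it is too similar
    to an item that has already been kept.

    `similarity` maps frozenset({id_a, id_b}) to a stored similarity value.
    Pairs missing from the mapping are treated as dissimilar (0.0).
    """
    kept = []
    for item_id in item_ids:
        is_redundant = any(
            similarity.get(frozenset((item_id, other)), 0.0) >= threshold
            for other in kept
        )
        if not is_redundant:
            kept.append(item_id)
    return kept


# Hypothetical stored similarity values for three content items.
similarity = {
    frozenset(("deck_v1", "deck_v2")): 0.97,
    frozenset(("deck_v1", "notes")): 0.12,
    frozenset(("deck_v2", "notes")): 0.10,
}

kept = prune_redundant(["deck_v1", "deck_v2", "notes"], similarity)
```

With these assumed values, `deck_v2` is dropped because it is nearly identical to `deck_v1`, while `notes` is kept. The greedy order matters: whichever near-duplicate appears first in `item_ids` is the one retained.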
70 Citations
23 Claims
1. A method, performed by a computing system having a memory and a processor, the method comprising:
receiving an indication of a plurality of content items;
for each of the plurality of content items, identifying at least one content element of the content item, and for each identified content element of the content item, extracting information for the content element, computing a plurality of semantic feature values for the content element based at least in part on the extracted information, computing a plurality of visual feature values for the content element based at least in part on the extracted information, and storing the feature values computed, for the content element, based at least in part on the extracted information;
receiving an indication of a first set of content items, wherein each content item of the first set of content items is a member of the plurality of content items;
for each pair of content items from among the first set of content items, applying a similarity function to the pair of content items to generate a similarity value, and storing, in association with each content item of the pair of content items, the generated similarity value;
for each content item of the first set of content items, identifying similar content items based at least in part on the generated similarity values, and storing, in association with the content item, references to at least one of the identified similar content items.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
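The steps recited in claim 1 can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the claim does not specify which features or which similarity function are used. Here, term counts stand in for semantic feature values, dominant-color counts stand in for visual feature values, and the similarity function is an assumed weighted mean of two cosine similarities. All function names, the 0.8 threshold, and the sample items are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def compute_features(element):
    """Compute assumed semantic and visual feature values for one content element."""
    semantic = Counter(element["text"].lower().split())  # term counts
    visual = Counter(element["colors"])                  # dominant-color counts
    return {"semantic": semantic, "visual": visual}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pooled(features, kind):
    """Sum one kind of feature over all elements of a content item."""
    total = Counter()
    for f in features:
        total.update(f[kind])
    return total


def item_similarity(feats_a, feats_b, semantic_weight=0.5):
    """Assumed similarity function: weighted mean of semantic and visual cosine."""
    sem = cosine(pooled(feats_a, "semantic"), pooled(feats_b, "semantic"))
    vis = cosine(pooled(feats_a, "visual"), pooled(feats_b, "visual"))
    return semantic_weight * sem + (1 - semantic_weight) * vis


def find_similar(items, threshold=0.8):
    # Extract and store feature values for each element of each content item.
    feature_store = {
        item_id: [compute_features(el) for el in elements]
        for item_id, elements in items.items()
    }
    # Apply the similarity function to each pair and store the value with both items.
    similarity_store = {item_id: {} for item_id in items}
    for a, b in combinations(items, 2):
        value = item_similarity(feature_store[a], feature_store[b])
        similarity_store[a][b] = value
        similarity_store[b][a] = value
    # For each item, store references to the items identified as similar.
    similar_refs = {
        item_id: sorted(other for other, v in scores.items() if v >= threshold)
        for item_id, scores in similarity_store.items()
    }
    return feature_store, similarity_store, similar_refs


# Hypothetical content items, each with a single content element.
items = {
    "slide_a": [{"text": "Quarterly revenue growth", "colors": ["blue", "white"]}],
    "slide_b": [{"text": "quarterly revenue growth", "colors": ["blue", "white"]}],
    "slide_c": [{"text": "Team offsite agenda", "colors": ["green", "black"]}],
}

feature_store, similarity_store, similar_refs = find_similar(items)
```

In this sketch, `slide_a` and `slide_b` share the same terms and colors, so each is stored as a similar item of the other, while `slide_c` shares neither and gets an empty reference list. Comparing every pair grows quadratically with the number of items, which is one reason the claim's pairwise step applies only to the "first set" of content items.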
13. A system for analyzing content items, the system comprising:
at least one processor;
a component configured to receive an indication of a plurality of content items;
a component configured to, for each of the plurality of content items, identify at least one content element of the content item, and for each identified content element of the content item, extract information for the content element, compute a plurality of semantic feature values for the content element based at least in part on the extracted information, compute a plurality of visual feature values for the content element based at least in part on the extracted information, and store the feature values computed, for the content element, based at least in part on the extracted information;
a component configured to receive an indication of a first set of content items, wherein each content item of the first set of content items is a member of the plurality of content items;
a component configured to, for each pair of content items from among the first set of content items, apply a similarity function to the pair of content items to generate a similarity value for the pair of content items, and store, in association with each content item of the pair of content items, the generated similarity value; and
a component configured to, for each content item of the first set of content items, identify a set of similar content items based at least in part on the generated similarity values, and store, in association with the content item, references to at least one content item of the identified set of similar content items,
wherein each of the components comprises computer-executable instructions stored in a memory for execution by the at least one processor.
Dependent claims: 14, 15, 16, 17, 21, 22, 23
18. A computer-readable storage medium, that is not a transitory, propagating signal, storing instructions that, when executed by a computing system having a processor, cause the computing system to perform a method comprising:
receiving an indication of a plurality of content items; and
for each of the plurality of content items, identifying at least one content element of the content item, and for each identified content element of the content item, extracting information for the content element, computing a plurality of feature values for the content element based at least in part on the extracted information, wherein computing the plurality of feature values for the content element based at least in part on the extracted information comprises computing a plurality of visual feature values for the content element based at least in part on the extracted information, and wherein computing the plurality of feature values for the content element based at least in part on the extracted information comprises computing a plurality of semantic feature values for the content element based at least in part on the extracted information, and storing the feature values computed for the content element based at least in part on the extracted information; and
receiving an indication of a first set of content items, wherein each content item of the first set of content items is a member of the plurality of content items;
for each pair of content items from among the first set of content items, applying a similarity function to the pair of content items to generate a similarity value, and storing, in association with each content item of the pair of content items, the generated similarity value;
for each content item of the first set of content items, identifying similar content items based at least in part on the generated similarity values, and storing, in association with the content item, references to at least one of the identified similar content items.
Dependent claims: 19, 20
Specification