Identifying central entities
First Claim
1. A method implemented by a data processing apparatus, the method comprising:
identifying multiple candidate entities that are associated with a first web resource, each candidate entity being a word or a phrase;
obtaining a first entity graph representing relationships between entities associated with resources in a collection of resources, wherein the first entity graph includes multiple nodes, each node representing a different entity associated with a respective resource in the collection of resources, each entity being a word or a phrase, wherein the first entity graph includes edges connecting pairs of nodes, and wherein each of the edges represents that two nodes connected by an edge represent two entities that are frequently associated with a same resource in the collection of resources;
filtering the first entity graph to remove nodes that do not represent any of the candidate entities associated with the first web resource;
generating, from the filtered first entity graph, a second entity graph for the first resource, including removing nodes from the filtered first entity graph that are not connected by an edge to at least one other node in the filtered first entity graph;
identifying candidate entities that are represented by respective nodes in the second entity graph as being central entities for the first resource;
generating respective search queries for each of the identified central entities;
obtaining search results responsive to the search queries from a search engine;
selecting a web resource referenced by a particular search result of the obtained search results as relevant additional content for the first web resource; and
associating the relevant additional content with the first web resource for presentation to a user requesting content from the first web resource.
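The graph steps of the claim (filtering the first entity graph down to the candidate entities, then removing nodes left without an edge to another node) can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the adjacency-set representation, the function names, and the sample entities are all assumptions made for the example.

```python
from typing import Dict, Set

# Undirected entity graph as adjacency sets. The claim does not prescribe
# a data structure; this representation is an assumption.
EntityGraph = Dict[str, Set[str]]


def filter_to_candidates(graph: EntityGraph, candidates: Set[str]) -> EntityGraph:
    """Keep only nodes that represent candidate entities, and only the
    edges that connect two surviving nodes (self-loops are ignored)."""
    return {
        node: (neighbors & candidates) - {node}
        for node, neighbors in graph.items()
        if node in candidates
    }


def drop_isolated_nodes(graph: EntityGraph) -> EntityGraph:
    """Remove nodes that are not connected by an edge to any other node."""
    return {node: set(neighbors) for node, neighbors in graph.items() if neighbors}


def central_entities(first_graph: EntityGraph, candidates: Set[str]) -> Set[str]:
    """Candidates still represented by a node in the second entity graph."""
    filtered = filter_to_candidates(first_graph, candidates)
    second_graph = drop_isolated_nodes(filtered)
    return set(second_graph)


# Hypothetical first entity graph built from a collection of resources.
first_graph: EntityGraph = {
    "eiffel tower": {"paris", "gustave eiffel"},
    "paris": {"eiffel tower", "france"},
    "gustave eiffel": {"eiffel tower"},
    "france": {"paris"},
    "tickets": {"museum"},
    "museum": {"tickets"},
}
# Hypothetical candidate entities identified for the first web resource.
candidates = {"eiffel tower", "paris", "france", "tickets"}

central = central_entities(first_graph, candidates)
print(sorted(central))
```

In this sample, "tickets" is a candidate, but its only neighbor in the first entity graph ("museum") is not, so it becomes isolated after filtering and is excluded from the central entities.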
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying central entities. In one aspect, a method includes obtaining candidate entities for a first resource; filtering a first entity graph whose nodes represent different entities found in a plurality of resources to remove nodes that do not correspond to a candidate entity, wherein pairs of nodes in the filtered first entity graph that are connected by an edge correspond to pairs of candidate entities that are associated with the same resource; generating a second entity graph for the first resource from the filtered first entity graph, wherein the second entity graph does not include nodes from the filtered first entity graph that are not connected to other nodes in the filtered first graph; and identifying candidate entities that are represented by nodes in the second entity graph as being central entities for the first resource.
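As a rough illustration of how a first entity graph of this kind might be assembled, the sketch below connects two entities when they appear together in enough resources. The co-occurrence threshold `min_cooccurrences`, the function name, and the sample collection are assumptions for the example; the source says only that connected entities are frequently associated with the same resource and does not define "frequently".

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set


def build_entity_graph(
    resource_entities: Iterable[Set[str]], min_cooccurrences: int = 2
) -> Dict[str, Set[str]]:
    """Build an undirected graph with one node per entity and an edge
    between entities that co-occur in at least `min_cooccurrences` resources."""
    pair_counts: Counter = Counter()
    graph: Dict[str, Set[str]] = {}
    for entities in resource_entities:
        # Every entity gets a node, even if it never gains an edge.
        for entity in entities:
            graph.setdefault(entity, set())
        # Sort so each unordered pair is always counted under the same key.
        for a, b in combinations(sorted(entities), 2):
            pair_counts[(a, b)] += 1
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrences:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical collection: the set of entities associated with each resource.
collection = [
    {"eiffel tower", "paris", "france"},
    {"eiffel tower", "paris"},
    {"paris", "france", "louvre"},
]
entity_graph = build_entity_graph(collection, min_cooccurrences=2)
for entity in sorted(entity_graph):
    print(entity, "->", sorted(entity_graph[entity]))
```

Here "louvre" keeps a node but no edges, because it co-occurs with each other entity in only one resource.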
33 Claims
1. A method implemented by a data processing apparatus, the method comprising:
identifying multiple candidate entities that are associated with a first web resource, each candidate entity being a word or a phrase;
obtaining a first entity graph representing relationships between entities associated with resources in a collection of resources, wherein the first entity graph includes multiple nodes, each node representing a different entity associated with a respective resource in the collection of resources, each entity being a word or a phrase, wherein the first entity graph includes edges connecting pairs of nodes, and wherein each of the edges represents that two nodes connected by an edge represent two entities that are frequently associated with a same resource in the collection of resources;
filtering the first entity graph to remove nodes that do not represent any of the candidate entities associated with the first web resource;
generating, from the filtered first entity graph, a second entity graph for the first resource, including removing nodes from the filtered first entity graph that are not connected by an edge to at least one other node in the filtered first entity graph;
identifying candidate entities that are represented by respective nodes in the second entity graph as being central entities for the first resource;
generating respective search queries for each of the identified central entities;
obtaining search results responsive to the search queries from a search engine;
selecting a web resource referenced by a particular search result of the obtained search results as relevant additional content for the first web resource; and
associating the relevant additional content with the first web resource for presentation to a user requesting content from the first web resource.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
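The final steps of claim 1 (generating a query per central entity, obtaining search results, selecting one referenced resource, and associating it with the first web resource) might look like the sketch below. Everything concrete here is an assumption: the `SearchResult` shape, the quoted-phrase query format, the stand-in `fake_search` function, the example URLs, and the rule of picking the highest-scoring result that is not the first resource itself. The claim does not specify how the particular search result is chosen.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple, Optional


class SearchResult(NamedTuple):
    """Hypothetical result shape: referenced resource and a relevance score."""
    url: str
    score: float


SearchFn = Callable[[str], List[SearchResult]]


def select_additional_content(
    central: Iterable[str], search: SearchFn, first_resource_url: str
) -> Optional[str]:
    """Issue one query per central entity and return the best-scoring
    referenced resource other than the first resource (assumed rule)."""
    best_url: Optional[str] = None
    best_score = float("-inf")
    for entity in central:
        query = f'"{entity}"'  # assumed query format: the entity as a quoted phrase
        for result in search(query):
            if result.url != first_resource_url and result.score > best_score:
                best_url, best_score = result.url, result.score
    return best_url


def fake_search(query: str) -> List[SearchResult]:
    """Stand-in for a real search engine, returning canned results."""
    canned = {
        '"eiffel tower"': [
            SearchResult("https://example.com/tower-history", 0.90),
            SearchResult("https://example.com/article", 0.95),
        ],
        '"paris"': [SearchResult("https://example.com/paris-guide", 0.70)],
    }
    return canned.get(query, [])


# Association store: first resource URL -> relevant additional content URL.
associations: Dict[str, str] = {}
first_url = "https://example.com/article"
picked = select_additional_content(["eiffel tower", "paris"], fake_search, first_url)
if picked is not None:
    associations[first_url] = picked
print(associations)
```

Passing the search engine in as a callable keeps the sketch independent of any particular search API and makes the selection rule easy to swap out.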
12. A computer-readable storage device having stored thereon instructions, which, when executed by data processing apparatus, cause the data processing apparatus to perform operations comprising:
identifying multiple candidate entities that are associated with a first web resource, each candidate entity being a word or a phrase;
obtaining a first entity graph representing relationships between entities associated with resources in a collection of resources, wherein the first entity graph includes multiple nodes, each node representing a different entity associated with a respective resource in the collection of resources, each entity being a word or a phrase, wherein the first entity graph includes edges connecting pairs of nodes, and wherein each of the edges represents that two nodes connected by an edge represent two entities that are frequently associated with a same resource in the collection of resources;
filtering the first entity graph to remove nodes that do not represent any of the candidate entities associated with the first web resource;
generating, from the filtered first entity graph, a second entity graph for the first resource, including removing nodes from the filtered first entity graph that are not connected by an edge to at least one other node in the filtered first entity graph;
identifying candidate entities that are represented by respective nodes in the second entity graph as being central entities for the first resource;
generating respective search queries for each of the identified central entities;
obtaining search results responsive to the search queries from a search engine;
selecting a web resource referenced by a particular search result of the obtained search results as relevant additional content for the first web resource; and
associating the relevant additional content with the first web resource for presentation to a user requesting content from the first web resource.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A system comprising:
one or more data processing apparatus; and
a computer-readable storage device having stored thereon instructions that, when executed by the one or more data processing apparatus, cause the one or more data processing apparatus to perform operations comprising:
identifying multiple candidate entities that are associated with a first web resource, each candidate entity being a word or a phrase;
obtaining a first entity graph representing relationships between entities associated with resources in a collection of resources, wherein the first entity graph includes multiple nodes, each node representing a different entity associated with a respective resource in the collection of resources, each entity being a word or a phrase, wherein the first entity graph includes edges connecting pairs of nodes, and wherein each of the edges represents that two nodes connected by an edge represent two entities that are frequently associated with a same resource in the collection of resources;
filtering the first entity graph to remove nodes that do not represent any of the candidate entities associated with the first web resource;
generating, from the filtered first entity graph, a second entity graph for the first resource, including removing nodes from the filtered first entity graph that are not connected by an edge to at least one other node in the filtered first entity graph;
identifying candidate entities that are represented by respective nodes in the second entity graph as being central entities for the first resource;
generating respective search queries for each of the identified central entities;
obtaining search results responsive to the search queries from a search engine;
selecting a web resource referenced by a particular search result of the obtained search results as relevant additional content for the first web resource; and
associating the relevant additional content with the first web resource for presentation to a user requesting content from the first web resource.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
Specification