CONCEPT-AWARE RANKING OF ELECTRONIC DOCUMENTS WITHIN A COMPUTER NETWORK
Abstract
Techniques are described for ranking the relevance of electronic documents, such as web pages. An algorithm extracts keywords and recurring phrases from the anchor tag data in electronic documents to define a set of concepts. The algorithm then uses (link, concept) pairs to create nodes in a graph. In this graph, edges can represent both explicit and implicit conceptual links between nodes. By including conceptual data, the algorithm may model and use inter-concept relationships when applying graph ranking algorithms. This may improve result accuracy not only by retrieving links that are more authoritative given a user's context, but also by drawing on a larger pool of web pages that is limited by concept-space rather than keyword-space.
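As a rough illustration of the extraction step described in the abstract, the following Python sketch collects anchor tag text from HTML pages, treats words and two-word phrases that recur across anchors as concepts, and emits (link, concept) pairs. The stopword list, the `min_count` threshold, the tokenization, and all function names are assumptions made for this example; the abstract does not specify how keywords and recurring phrases are selected.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Illustrative stopword list (an assumption, not taken from the source).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in", "for", "on", "here", "click"}


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, " ".join(self._text).strip()))
            self._href = None


def anchors_in(html):
    """Parse one page and return its (href, anchor text) pairs."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.anchors


def candidate_terms(text):
    """Lowercased non-stopword words plus adjacent pairs of those words."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams


def link_concept_pairs(html_pages, min_count=2):
    """Return (concepts, pairs); a concept is a term seen in >= min_count anchors."""
    anchors = [a for page in html_pages for a in anchors_in(page)]
    counts = Counter(t for _, text in anchors for t in set(candidate_terms(text)))
    concepts = {t for t, n in counts.items() if n >= min_count}
    pairs = {
        (href, term)
        for href, text in anchors
        for term in candidate_terms(text)
        if term in concepts
    }
    return concepts, pairs
```

Each resulting pair corresponds to one candidate node in the graph: the same linked page can appear under several concepts, which is what lets the ranking differ from concept to concept.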
15 Claims
1. A computer-implemented method comprising:
extracting a set of concepts from a set of electronic documents from a computer network;
constructing a graph having nodes interconnected by edges, wherein each of the nodes in the graph represents an electronic document in the set of documents and a concept extracted from that electronic document, and further wherein each of the edges in the graph represents a link from a first one of the electronic documents to a second one of the electronic documents for a corresponding one of the concepts;
assigning a rank to each node in the graph based on a number of incoming edges connecting to the node; and
responding to a query with a list containing a subset of the nodes, wherein the list is sorted according to the rank assigned to the nodes.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computing device comprising:
a concept extraction software module executing on the computing device to extract a set of concepts from a set of electronic documents;
a graphing software module executing on the computing device to construct a graph, wherein each node in the graph refers to an electronic document in the set of documents and a concept extracted from that electronic document, and wherein each edge in the graph represents a conceptual link from a first one of the electronic documents to a second one of the electronic documents along a concept;
a ranking software module executing on the computing device to assign a rank to each node in the graph based on a number of incoming edges connecting to the node; and
a query engine software module executing on the computing device to respond to a query with a list containing a subset of the nodes, wherein the list is sorted according to the rank assigned to the nodes.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-readable medium comprising instructions, the instructions causing a programmable processor to:
extract a set of concepts from a set of electronic documents;
construct a graph, wherein each node in the graph refers to an electronic document in the set of documents and a concept extracted from that electronic document, and wherein each edge in the graph represents a conceptual link from a first one of the electronic documents to a second one of the electronic documents along a concept;
assign a rank to each node in the graph based on a number of incoming edges connecting to the node; and
respond to a query with a list containing a subset of the nodes, wherein the list is sorted according to the rank assigned to the nodes.
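The graph construction, ranking, and query steps recited in the independent claims can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: links are supplied as hypothetical `(source_doc, target_doc, concept)` triples, the rank is taken to be the raw count of incoming edges (the simplest reading of "based on a number of incoming edges"), and a query is assumed to match a concept exactly. The function names are invented for this example.

```python
from collections import defaultdict


def build_graph(links):
    """Build edges between (document, concept) nodes.

    `links` is an iterable of (source_doc, target_doc, concept) triples
    (a hypothetical input format). Each triple becomes one directed edge
    from the node (source_doc, concept) to the node (target_doc, concept).
    """
    edges = set()
    for src, dst, concept in links:
        edges.add(((src, concept), (dst, concept)))
    return edges


def rank_nodes(edges):
    """Assign each node a rank equal to its number of incoming edges."""
    rank = defaultdict(int)
    for src_node, dst_node in edges:
        rank[src_node] += 0  # make sure source-only nodes appear with rank 0
        rank[dst_node] += 1
    return dict(rank)


def answer_query(rank, concept):
    """Return the nodes for `concept`, highest rank first (ties by document id)."""
    hits = [node for node in rank if node[1] == concept]
    return sorted(hits, key=lambda node: (-rank[node], node[0]))


# Usage: the same document "c" is ranked separately under each concept.
links = [
    ("a", "c", "jaguar"),
    ("b", "c", "jaguar"),
    ("a", "b", "jaguar"),
    ("a", "c", "cars"),
]
rank = rank_nodes(build_graph(links))
results = answer_query(rank, "jaguar")
```

Because a node is a (document, concept) pair and not a bare document, a page that is heavily linked for one concept does not automatically rank highly for another. An iterative graph ranking algorithm could replace the in-degree count in `rank_nodes` without changing the other two functions.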
Specification