System and method for ranking search results using click distance
First Claim
1. A system for ranking search results, comprising a search engine on a computing device, the search engine configured to execute computer-executable instructions, which when executed by the computing system cause the computing system to perform a method comprising:
- discovering a plurality of documents on a network;
recording document and link information for each of the plurality of documents on the network;
generating a representation of the network from the document and link information, wherein the representation of the network includes a plurality of nodes, each document being represented by one of the plurality of nodes;
receiving manually input click distances designating a subset of the plurality of nodes as highest authority nodes, the subset of the plurality of nodes including at least a first highest authority node and a second highest authority node, wherein the manually input click distances indicate a relative importance of each highest authority node with respect to other highest authority nodes;
initializing click distances for a second subset of the plurality of nodes to a maximum value, the second subset of the plurality of nodes not including highest authority nodes and comprising at least a first non-highest authority node and a second non-highest authority node;
computing click distances for the first and second non-highest authority nodes by;
determining a first click distance for the first non-highest authority node, the first click distance being a first number of branches traversed on a first shortest path between the first non-highest authority node and the first highest authority node;
determining a second click distance for the first non-highest authority node, the second click distance being a second number of branches traversed on a second shortest path between the first non-highest authority node and the second highest authority node;
determining a third click distance for the second non-highest authority node, the third click distance being a third number of branches traversed on a third shortest path between the second non-highest authority node and the first highest authority node; and
determining a fourth click distance for the second non-highest authority node, the fourth click distance being a fourth number of branches traversed on a fourth shortest path between the second non-highest authority node and the second highest authority node;
storing the first, second, third, and fourth click distances in memory, such that the first and second click distances are associated with a first document, and the third and fourth click distances are associated with a second document;
calculating search rank results using at least one of the first, second, third, and fourth click distances associated with each of the first and second documents as a query-independent relevance measure in ranking the plurality of documents; and
storing search rank results in memory, wherein the search rank results comprise a list of documents arranged in a descending order of relevance.
2 Assignments
0 Petitions
Accused Products
Abstract
Search results of a search query on a network are ranked according to an additional click distance property associated with each of the documents on the network. The click distance is measurement of the number clicks or user navigations from a page or pages on the network designated as highest authority or root pages on the network. The precision of the results is increased by the addition of the click distance term when the site or intranet where the search query takes place is hierarchically structured.
-
Citations
20 Claims
-
1. A system for ranking search results, comprising a search engine on a computing device, the search engine configured to execute computer-executable instructions, which when executed by the computing system cause the computing system to perform a method comprising:
-
discovering a plurality of documents on a network; recording document and link information for each of the plurality of documents on the network; generating a representation of the network from the document and link information, wherein the representation of the network includes a plurality of nodes, each document being represented by one of the plurality of nodes; receiving manually input click distances designating a subset of the plurality of nodes as highest authority nodes, the subset of the plurality of nodes including at least a first highest authority node and a second highest authority node, wherein the manually input click distances indicate a relative importance of each highest authority node with respect to other highest authority nodes; initializing click distances for a second subset of the plurality of nodes to a maximum value, the second subset of the plurality of nodes not including highest authority nodes and comprising at least a first non-highest authority node and a second non-highest authority node; computing click distances for the first and second non-highest authority nodes by; determining a first click distance for the first non-highest authority node, the first click distance being a first number of branches traversed on a first shortest path between the first non-highest authority node and the first highest authority node; determining a second click distance for the first non-highest authority node, the second click distance being a second number of branches traversed on a second shortest path between the first non-highest authority node and the second highest authority node; determining a third click distance for the second non-highest authority node, the third click distance being a third number of branches traversed on a third shortest path between the second non-highest authority node and the first highest authority node; and determining a fourth click distance for the second non-highest authority node, the fourth click distance being a fourth number of branches traversed on a fourth shortest path between the second non-highest authority node and the second highest authority node; storing the first, second, third, and fourth click distances in memory, such that the first and second click distances are associated with a first document, and the third and fourth click distances are associated with a second document; calculating search rank results using at least one of the first, second, third, and fourth click distances associated with each of the first and second documents as a query-independent relevance measure in ranking the plurality of documents; and storing search rank results in memory, wherein the search rank results comprise a list of documents arranged in a descending order of relevance. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A method for ranking search results, the method comprising:
-
discovering a plurality of documents stored on one or more computing devices in a network of computing devices; recording document and link information for each of the plurality of documents; generating a representation of the network from the document and link information, wherein the representation of the network includes a plurality of nodes, each document being represented by one of the plurality of nodes; receiving manually input click distances designating a subset of the plurality of nodes as highest authority nodes, the subset of the plurality of nodes including at least a first highest authority node and a second highest authority node, wherein the manually input click distances indicate a relative importance of each highest authority node with respect to other highest authority nodes; initializing click distances for a second subset of the plurality of nodes to a maximum value, the second subset of the plurality of nodes not including the highest authority nodes and comprising at least a first non-highest authority node and a second non-highest authority node; computing click distances for the first and second non-highest authority nodes by; determining a first click distance for the first non-highest authority node, the first click distance being a first number of branches traversed on a first shortest path between the first non-highest authority node and the first highest authority node; determining a second click distance for the first non-highest authority node, the second click distance being a second number of branches traversed on a second shortest path between the first non-highest authority node and the second highest authority node; determining a third click distance for the second non-highest authority node, the third click distance being a third number of branches traversed on a third shortest path between the second non-highest authority node and the first highest authority node; and determining a fourth distance for the second non-highest authority node, the fourth click distance being a fourth number of branches traversed on a fourth shortest path between the second non-highest authority node and the second highest authority node; storing the first, second, third, and fourth click distances in memory on a computing device in the network, such that the first and second click distances are associated with a first document, and the third and fourth click distances are associated with a second document; calculating search rank results using at least one of the first, second, third, and fourth click distances associated with each of the first and second documents as a query-independent relevance measure in ranking the plurality of documents; and storing the search rank results in the memory on the computing device. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
-
Specification