Searching digital information and databases
First Claim
1. A machine-implemented method comprising:
- analyzing digital information viewed as a labeled graph, including nodes and edges, based on a flow of authority among the nodes along the edges, the flow of authority being derived at least in part from different authority transfer rates assigned to the edges based on edge type schema information; and
generating a keyword-specific ranking of the nodes in response to a query, including at least one keyword, based on a result of the analyzing;
wherein the keyword-specific ranking is generated in memory by one or more computers configured to process keyword queries of the digital information comprising words;
the method further comprising;
receiving the query, the query including multiple keywords;
wherein the analyzing the digital information comprises generating multiple initial rankings corresponding to the multiple keywords, each of the multiple initial rankings indicating authority of the nodes with respect to each respective keyword; and
wherein the generating the keyword-specific ranking comprises combining the multiple initial rankings;
wherein the generating the multiple initial rankings comprises topologically sorting the labeled graph, and wherein the generating the multiple initial rankings comprisesidentifying and removing cycles in the labeled graph to reduce the labeled graph into a directed acyclic graph (DAG) and a set of backward edges before doing keyword-specific calculation of the multiple initial rankings;
identifying a set of backnodes, which are nodes of the labeled graph from which the backward edges start; and
calculating node rank information in a bifurcated fashion such that calculation of the node rank information is split between (1) calculating DAG node rank information while ignoring the backward edges and (2) calculating backedges node rank information, due to the backward edges, using the identified backnodes.
5 Assignments
0 Petitions
Accused Products
Abstract
This application describes methods for searching digital information such as digital documents (e.g., web pages) and computer databases, and specific search techniques such as authority ranking and information retrieval (IR) relevance ranking in keyword searches. In some implementations, the technique includes analyzing digital information viewed as a labeled graph, including nodes and edges, based on a flow of authority among the nodes along the edges, the flow of authority being derived at least in part from different authority transfer rates assigned to the edges based on edge type schema information. In some implementations, the system includes an object rank module configured to generate multiple initial rankings corresponding to multiple query keywords, each of the multiple initial rankings indicating authority of nodes in a graph with respect to each respective query keyword individually; and a query module configured to combine the multiple initial rankings in response to a query.
83 Citations
3 Claims
-
1. A machine-implemented method comprising:
-
analyzing digital information viewed as a labeled graph, including nodes and edges, based on a flow of authority among the nodes along the edges, the flow of authority being derived at least in part from different authority transfer rates assigned to the edges based on edge type schema information; and generating a keyword-specific ranking of the nodes in response to a query, including at least one keyword, based on a result of the analyzing; wherein the keyword-specific ranking is generated in memory by one or more computers configured to process keyword queries of the digital information comprising words; the method further comprising; receiving the query, the query including multiple keywords; wherein the analyzing the digital information comprises generating multiple initial rankings corresponding to the multiple keywords, each of the multiple initial rankings indicating authority of the nodes with respect to each respective keyword; and wherein the generating the keyword-specific ranking comprises combining the multiple initial rankings; wherein the generating the multiple initial rankings comprises topologically sorting the labeled graph, and wherein the generating the multiple initial rankings comprises identifying and removing cycles in the labeled graph to reduce the labeled graph into a directed acyclic graph (DAG) and a set of backward edges before doing keyword-specific calculation of the multiple initial rankings; identifying a set of backnodes, which are nodes of the labeled graph from which the backward edges start; and calculating node rank information in a bifurcated fashion such that calculation of the node rank information is split between (1) calculating DAG node rank information while ignoring the backward edges and (2) calculating backedges node rank information, due to the backward edges, using the identified backnodes.
-
-
2. A system comprising:
-
one or more processors; and a machine-readable medium storing a program operable to cause the one or more processors to perform operations, the program comprising; an object rank module configured to generate multiple initial rankings corresponding to multiple query keywords, each of the multiple initial rankings indicating authority of nodes in a graph with respect to each respective query keyword individually; and a query module configured to combine the multiple initial rankings in response to a query; wherein the object rank module is configured to generate the multiple initial rankings based on an analysis of a flow of authority among the nodes along edges in the graph, the flow of authority being derived at least in part from different authority transfer rates assigned to the edges based on edge type schema information of the graph; wherein the object rank module is configured to topologically sort the graph, and wherein the object rank module is configured to; identify and remove cycles in the graph to reduce the graph into a directed acyclic graph (DAG) and a set of backward edges before doing keyword-specific calculation of the multiple initial rankings; identify a set of backnodes; and calculate node rank information in a bifurcated fashion such that calculation of the node rank information is split between (1) calculating DAG node rank information while ignoring the backward edges and (2) calculating backedges node rank information, due to the backward edges, using the identified backnodes.
-
-
3. A program stored in a machine-readable medium and operable to cause one or more machines to perform operations comprising:
-
analyzing digital information viewed as a labeled graph, including nodes and edges, based on a flow of authority among the nodes along the edges, the flow of authority being derived in accordance with one or more factors including different authority transfer rates assigned to the edges based on edge type schema information; and generating a keyword-specific ranking of the nodes in response to a query, including at least one keyword, based on a result of the analyzing; the operations further comprising; receiving the query, the query including multiple keywords; wherein the analyzing the digital information comprises generating multiple initial rankings corresponding to the multiple keywords, each of the multiple initial rankings indicating authority of the nodes with respect to each respective keyword; and wherein the generating the keyword-specific ranking comprises combining the multiple initial rankings; wherein the generating the multiple initial rankings comprises topologically sorting the labeled graph, and wherein the generating the multiple initial rankings comprises identifying and removing cycles in the labeled graph to reduce the labeled graph into a directed acyclic graph (DAG) and a set of backward edges before doing keyword-specific calculation of the multiple initial rankings; identifying a set of backnodes, which are nodes of the labeled graph from which the backward edges start; and calculating node rank information in a bifurcated fashion such that calculation of the node rank information is split between (1) calculating DAG node rank information while ignoring the backward edges and (2) calculating backedges node rank information, due to the backward edges, using the identified backnodes.
-
Specification