Method for detecting link spam in hyperlinked databases
First Claim
1. A computer-implemented method for identifying nodes that are beneficiaries of node importance inflating links in a directed graph of linked nodes, wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database, the method comprising:
- at a computer system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of;
computing, for each of at least a subset of the nodes in the directed graph, a respective quantity corresponding to a mathematical derivative of a node importance function;
for each node in the subset, comparing the respective computed quantity with a threshold;
identifying, as nodes that are beneficiaries of node importance inflating links, at least a portion of the subset for which the comparison produces a predefined result; and
performing a remedial action on a plurality of the identified nodes.
1 Assignment
0 Petitions
Accused Products
Abstract
Methods for facilitating the identification of link spamming in a linked database include calculating a spam likelihood value for nodes in a directed graph of linked nodes are disclosed. The spam likelihood value is computed from an importance of the node and a derivative value of the importance function with respect to a coupling factor. The likelihood that the node'"'"'s importance is inflated by link spam is estimated by calculating the ratio of the magnitude of the derivative value for the node to the rank for the node. Alternatively, the spam likelihood may be computed directly from a component of the principal eigenvector of A evaluated at two values of the parameter c. The normalized derivative value can also be used to provide an order of importance in a list of nodes.
-
Citations
25 Claims
-
1. A computer-implemented method for identifying nodes that are beneficiaries of node importance inflating links in a directed graph of linked nodes, wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database, the method comprising:
-
at a computer system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of; computing, for each of at least a subset of the nodes in the directed graph, a respective quantity corresponding to a mathematical derivative of a node importance function; for each node in the subset, comparing the respective computed quantity with a threshold; identifying, as nodes that are beneficiaries of node importance inflating links, at least a portion of the subset for which the comparison produces a predefined result; and performing a remedial action on a plurality of the identified nodes. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A non-transitory computer readable storage medium storing one or more programs for execution by one or more processors of a computer system, the one or more programs comprising:
-
instructions for computing, for each of at least a subset of nodes in a directed graph of linked nodes, a respective quantity corresponding to a mathematical derivative of a node importance function; instructions for comparing the respective computed quantity with a threshold for each node in the subset; and instructions for identifying, as nodes that are beneficiaries of node importance inflating links, at least a portion of the subset for which the comparison produces a predefined result; and instructions for performing a remedial action on a plurality of the identified nodes; wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database. - View Dependent Claims (13, 14, 15, 16, 17, 18)
-
-
19. A system for identifying nodes that are beneficiaries of node importance inflating links in a directed graph of linked nodes, the system comprising:
-
memory; one or more processors; one or more programs stored in the memory and configured for execution by the one or more processors, the one or more programs including; instructions to compute, for each of at least a subset of the nodes in the directed graph, a respective quantity corresponding to a mathematical derivative of a node importance function; instructions to compare, for each node in the subset, the respective computed quantity with a threshold; and instructions to identify, as nodes that are beneficiaries of node importance inflating links, at least a portion of the subset for which the comparison produces a predefined result; instructions to perform a remedial action on a plurality of the identified nodes; wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database.
-
-
20. A computer-implemented method for performing search engine services using a directed graph of linked nodes, wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database, the method comprising:
-
at a computer server system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of; computing, for each node of at least a portion of the nodes in the directed graph, a respective quantity for the respective node corresponding to a mathematical derivative of a node importance function; and in response to a search query received from a client system, ordering a subset of the nodes in the directed graph in accordance with the respective quantities, and returning to the client system a search result in accordance with the ordered subset of the nodes. - View Dependent Claims (21)
-
-
22. A non-transitory computer readable storage medium storing one or more programs for execution by one or more processors of a computer server system, the one or more programs comprising:
-
instructions for computing, for each node of at least a portion of the nodes in a directed graph, a respective quantity for the respective node corresponding to a mathematical derivative of a node importance function; and instructions for responding to a search query received from a client system, including instructions for ordering a subset of the nodes in the directed graph in accordance with the respective quantities, and returning to the client system a search result in accordance with the ordered subset of the nodes; wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database. - View Dependent Claims (23)
-
-
24. A computer system for performing search engine services using a directed graph of linked nodes, wherein the directed graph of linked nodes corresponds to a linked database, and wherein the nodes correspond to documents within the linked database, the system comprising:
-
memory; one or more processors; one or more programs stored in the memory and configured for execution by the one or more processors, the one or more programs including; instructions to compute, for each of at least a subset of the nodes in the directed graph, a respective quantity corresponding to a mathematical derivative of a node importance function; and instructions for responding to a search query received from a client system, including instructions for ordering a subset of the nodes in the directed graph in accordance with the respective quantities, and returning to the client system a search result in accordance with the ordered subset of the nodes. - View Dependent Claims (25)
-
Specification