Using exceptional changes in webgraph snapshots over time for internet entity marking
Abstract
Techniques are provided through which “suspicious” web pages may be identified automatically. A “suspicious” web page possesses characteristics that indicate some manipulation to artificially inflate the position of the web page within ranked search results. Web pages may be represented as nodes within a graph. Links between web pages may be represented as directed edges between the nodes. “Snapshots” of the current state of a network of interlinked web pages may be automatically generated at different times. In the time interval between snapshots, the state of the network may change. By comparing an earlier snapshot to a later snapshot, such changes can be identified. Extreme changes, which are deemed to vary significantly from the normal range of expected changes, can be detected automatically. Web pages relative to which these extreme changes have occurred may be marked as suspicious web pages which may merit further investigation or action.
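The snapshot comparison described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: each snapshot is assumed to be a list of (source, target) link pairs, the measured attribute is assumed to be the incoming-link count, and "extreme" is assumed to mean more than k standard deviations away from the mean change. The function names and the default k are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def in_degree(edges):
    """Count incoming links per page from (source, target) pairs."""
    return Counter(target for _, target in edges)


def suspicious_pages(earlier_edges, later_edges, k=2.0):
    """Return pages whose change in incoming links is an outlier.

    Assumption: a change is 'extreme' when it lies more than k population
    standard deviations from the mean change across all pages.
    """
    before = in_degree(earlier_edges)
    after = in_degree(later_edges)
    # Every page that appears in either snapshot, as a source or a target.
    pages = sorted({p for edge in earlier_edges + later_edges for p in edge})
    deltas = {p: after[p] - before[p] for p in pages}  # Counter yields 0 if absent
    if not deltas:
        return []
    mu = mean(deltas.values())
    sigma = pstdev(deltas.values())
    if sigma == 0:
        return []  # every page changed by the same amount; nothing stands out
    return [p for p in pages if abs(deltas[p] - mu) > k * sigma]
```

A single global mean and standard deviation is the simplest possible notion of a "normal range"; a real system would probably model expected change more carefully.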
20 Claims
1. A method comprising performing a machine-executed operation involving instructions, wherein the machine-executed operation is at least one of:
A) sending said instructions over transmission media;
B) receiving said instructions over transmission media;
C) storing said instructions onto a machine-readable storage medium; and
D) executing the instructions;
wherein said instructions are instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
measuring, at a first time, values of one or more attributes of a particular entity that is linked to one or more other entities in a set of entities;
measuring values of the one or more attributes at a second time that differs from the first time; and
determining whether an extent of change between the values of the one or more attributes measured at the first time and the values of the one or more attributes measured at the second time exceeds a specified threshold; and
if the extent of change exceeds the specified threshold, then identifying the particular entity as a suspicious entity in the set of entities;
wherein the one or more attributes comprise at least one of (a) a number of links outgoing from the particular entity, (b) a number of links incoming to the particular entity, and (c) a number of sub-entities contained within the entity.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
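As a minimal sketch of the measuring and thresholding steps recited in claim 1, the snippet below compares two measurements of a single entity. The claim does not say how the "extent of change" is computed, so the metric used here (the largest absolute change across the three attributes) is an assumption, and the `Measurement` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Attribute values of one entity at one point in time."""
    out_links: int      # (a) links outgoing from the entity
    in_links: int       # (b) links incoming to the entity
    sub_entities: int   # (c) sub-entities contained within the entity


def extent_of_change(first, second):
    """Assumed metric: the largest absolute change among the attributes."""
    return max(
        abs(second.out_links - first.out_links),
        abs(second.in_links - first.in_links),
        abs(second.sub_entities - first.sub_entities),
    )


def is_suspicious(first, second, threshold):
    """True only when the extent of change strictly exceeds the threshold."""
    return extent_of_change(first, second) > threshold
```

For example, an entity measured as `Measurement(10, 12, 3)` and later as `Measurement(11, 500, 3)` has an extent of change of 488 under this metric, so it would be marked suspicious with a threshold of 100.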
10. A method comprising performing a machine-executed operation involving instructions, wherein the machine-executed operation is at least one of:
A) sending said instructions over transmission media;
B) receiving said instructions over transmission media;
C) storing said instructions onto a machine-readable storage medium; and
D) executing the instructions;
wherein said instructions are instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
measuring, at a first time, for each host of a plurality of hosts, a number of links associated with that host;
measuring, at a second time that differs from the first time, for each host of the plurality of hosts, a number of links associated with that host;
determining, for each host of the plurality of hosts, a rate of growth in a number of links associated with that host between the first time and the second time; and
identifying selected hosts, in the plurality of hosts, which are associated with rates of growth that exceed a specified threshold.
Dependent claims: 11, 12, 13, 14
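The per-host growth computation in claim 10 might look something like the sketch below. It assumes each measurement is a dictionary mapping a host name to its link count, and it assumes "rate of growth" means relative growth, (second - first) / first; the claim fixes neither choice. Treating a host that is missing from one measurement as having zero links, and growth from zero as infinite, are further assumptions. All names are hypothetical.

```python
def growth_rate(first_count, second_count):
    """Relative growth between two counts (assumed definition)."""
    if first_count == 0:
        # Growth from nothing has no finite relative rate.
        return float("inf") if second_count > 0 else 0.0
    return (second_count - first_count) / first_count


def fast_growing_hosts(links_at_t1, links_at_t2, threshold):
    """Return hosts, sorted by name, whose growth rate exceeds the threshold."""
    hosts = set(links_at_t1) | set(links_at_t2)
    return sorted(
        host
        for host in hosts
        if growth_rate(links_at_t1.get(host, 0), links_at_t2.get(host, 0)) > threshold
    )
```

With a threshold of 1.0, a host must more than double its link count between the two measurements to be selected.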
15. A method comprising performing a machine-executed operation involving instructions, wherein the machine-executed operation is at least one of:
A) sending said instructions over transmission media;
B) receiving said instructions over transmission media;
C) storing said instructions onto a machine-readable storage medium; and
D) executing the instructions;
wherein said instructions are instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
measuring, at a first time, for each domain of a plurality of domains, a number of hosts contained within that domain;
measuring, at a second time that differs from the first time, for each domain of the plurality of domains, a number of hosts contained within that domain;
determining, for each domain of the plurality of domains, a rate of growth in a number of hosts contained within that domain between the first time and the second time; and
identifying selected domains, in the plurality of domains, which are associated with rates of growth that exceed a specified threshold.
Dependent claims: 16, 17, 18, 19, 20
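For claim 15, one possible way to count hosts per domain and compare two measurements is sketched below. It assumes each measurement is simply a list of host names observed at that time, and it guesses a host's domain by taking the last two labels of the name. That guess is a deliberate simplification that mishandles multi-label suffixes such as co.uk, which a real system would resolve with a public suffix list. Function names are hypothetical.

```python
from collections import defaultdict


def domain_of(host):
    """Naive domain guess: the last two labels of the host name."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def hosts_per_domain(hosts):
    """Map each domain to the number of distinct hosts seen within it."""
    grouped = defaultdict(set)
    for host in hosts:
        grouped[domain_of(host)].add(host.lower().rstrip("."))
    return {domain: len(members) for domain, members in grouped.items()}


def fast_growing_domains(hosts_t1, hosts_t2, threshold):
    """Return domains whose relative growth in host count exceeds the threshold.

    Only domains present in the first measurement are considered, so the
    first count is always at least 1 and the division is safe.
    """
    before = hosts_per_domain(hosts_t1)
    after = hosts_per_domain(hosts_t2)
    flagged = []
    for domain in sorted(before):
        n1, n2 = before[domain], after.get(domain, 0)
        if (n2 - n1) / n1 > threshold:
            flagged.append(domain)
    return flagged
```

Restricting the comparison to domains seen at the first time avoids dividing by zero; whether brand-new domains should also be flagged is left open here.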
Specification