Identifying internet protocol addresses for internet hosting entities

  • US 8,688,681 B1
  • Filed: 06/17/2010
  • Issued: 04/01/2014
  • Est. Priority Date: 06/17/2009
  • Status: Active Grant
First Claim

1. A system comprising one or more computers programmed to perform operations comprising:

    maintaining an Internet Protocol (IP) address history for each hostname in a plurality of hostnames, where each IP address history is a time series of IP addresses;

    organizing the hostnames into a collection of groups so that each hostname of the plurality of hostnames is a member of exactly one group in the collection of groups, where each group has a kernel calculated from the IP address histories of the members of the group, and where the IP address history of each member of the group is within a threshold distance of the kernel of the group;

    providing to a crawler, for use in scheduling a crawl of the plurality of hostnames, data describing the collection of groups;

    receiving an update to an IP address history for a first hostname of the plurality of hostnames, the first hostname being a member of a first group of the collection of groups, and recalculating a first kernel of the first group using the updated IP address history of the first hostname;

    receiving an update to an IP address history for a second hostname, the second hostname being a member of a second group of the collection of groups, and recalculating a second kernel of the second group using the updated IP address history of the second hostname; and

    determining that the first kernel is within the threshold distance of the second kernel and, as a result, merging the first group and the second group into a single group in the collection of groups.
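
The grouping, kernel recalculation, and merge steps recited above can be illustrated with a minimal sketch. The claim does not define the distance measure or how a kernel is calculated, so everything concrete here is an assumption made for illustration: each time series is reduced to the set of IP addresses it contains, distance is Jaccard distance between sets, and a kernel is the set of addresses seen by at least half of a group's members. The class `HostGroups` and all of its method names are hypothetical. The sketch does not enforce that every member stays within the threshold of its group's kernel, and it omits the crawler hand-off.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Assumed metric: 0.0 for identical IP sets, 1.0 for disjoint ones."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


class HostGroups:
    """Hypothetical bookkeeping: each hostname belongs to exactly one group."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.history = {}    # hostname -> list of (timestamp, ip), the time series
        self.group_of = {}   # hostname -> group id
        self.members = {}    # group id -> set of hostnames
        self.kernel = {}     # group id -> frozenset of IPs
        self._next_id = 0

    def _ips(self, host):
        # Simplification: ignore timestamps and keep only the distinct addresses.
        return {ip for _, ip in self.history[host]}

    def _recalculate_kernel(self, gid):
        # Assumed kernel: IPs that appear in at least half of the members' histories.
        counts = Counter(ip for h in self.members[gid] for ip in self._ips(h))
        quorum = len(self.members[gid]) / 2
        self.kernel[gid] = frozenset(ip for ip, n in counts.items() if n >= quorum)

    def _merge_near(self, gid):
        # Merge every other group whose kernel is within the threshold of this one.
        for other in list(self.kernel):
            if other == gid:
                continue
            if jaccard_distance(self.kernel[gid], self.kernel[other]) <= self.threshold:
                for host in self.members.pop(other):
                    self.group_of[host] = gid
                    self.members[gid].add(host)
                del self.kernel[other]
                self._recalculate_kernel(gid)

    def update(self, host, timestamp, ip):
        """Record an observation, recalculate the group's kernel, then try to merge."""
        self.history.setdefault(host, []).append((timestamp, ip))
        if host not in self.group_of:
            gid = self._next_id
            self._next_id += 1
            self.group_of[host] = gid
            self.members[gid] = {host}
        gid = self.group_of[host]
        self._recalculate_kernel(gid)
        self._merge_near(gid)
        return self.group_of[host]


if __name__ == "__main__":
    groups = HostGroups(threshold=0.6)
    groups.update("a.example", 1, "192.0.2.1")
    groups.update("b.example", 1, "198.51.100.7")   # disjoint history: separate group
    groups.update("b.example", 2, "192.0.2.1")      # now shares an address with a.example
    print(groups.group_of["a.example"] == groups.group_of["b.example"])
```

In this sketch a merge is attempted after every update, whereas the claim describes the merge as following updates to hostnames in two different groups; the simplification keeps the example short.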
