Limiting requests by web crawlers to a web host

  • US 7,774,782 B1
  • Filed: 12/18/2003
  • Issued: 08/10/2010
  • Est. Priority Date: 12/18/2003
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method of limiting requests to a web host by multiple competing web crawlers of a search engine, comprising on a server system:

  • receiving from a plurality of web crawlers a stream of capacity requests for a plurality of web hosts, each web host having a specified maximum allowed load level comprising a predefined number of web crawler download requests per unit time;

    for each pair of requesting web crawler and requested web host, creating a lease between the web host and the web crawler, the lease including an identity of the web crawler, an identity of the web host, a load capacity allocated to the web crawler and a lease update time prior to a lease expire time at which the lease expires unless the lease is extended;

    wherein the load capacity allocated to the web crawler comprises a specified number of download requests per unit time; and

    upon arrival of a respective lease's lease update time and satisfaction of a predefined condition, automatically updating the respective lease between a respective web crawler of the plurality of web crawlers and a respective web host of the plurality of web hosts by granting the web crawler an updated share of the web host's maximum allowed load level, the updated lease having an updated lease expire time later than the lease update time;

    wherein creating the lease includes limiting the load capacity allocated to the requesting web crawler such that a sum of the load capacity allocated to each of the web crawlers having a lease with the requested web host is no greater than the requested web host's maximum allowed load level.
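
The lease mechanism recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. All names (`Lease`, `LeaseServer`, `request`, `update_due`) are hypothetical, and several details the claim leaves open are assumptions: the "predefined condition" is taken to be that the lease has not yet expired, the "updated share" is taken to be an equal split among current lease holders, and the update time is placed halfway through the lease.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    crawler_id: str     # identity of the web crawler
    host: str           # identity of the web host
    capacity: float     # download requests per unit time granted to the crawler
    update_time: float  # time at which the lease should be refreshed
    expire_time: float  # time at which the lease lapses unless extended


class LeaseServer:
    """Hypothetical server that hands out per-host crawl capacity as leases."""

    def __init__(self, max_load, lease_duration=60.0):
        self.max_load = dict(max_load)   # host -> maximum allowed load level
        self.lease_duration = lease_duration
        self.leases = {}                 # (crawler_id, host) -> Lease

    def _allocated(self, host, now, exclude=None):
        # Sum of capacity held by unexpired leases on this host,
        # optionally ignoring one (crawler, host) pair.
        return sum(
            lease.capacity
            for key, lease in self.leases.items()
            if lease.host == host and lease.expire_time > now and key != exclude
        )

    def _make_lease(self, crawler_id, host, capacity, now):
        # Assumption: the update time falls halfway through the lease.
        return Lease(crawler_id, host, capacity,
                     update_time=now + self.lease_duration / 2,
                     expire_time=now + self.lease_duration)

    def request(self, crawler_id, host, wanted, now):
        """Create a lease, capped so the host's total never exceeds its maximum."""
        key = (crawler_id, host)
        free = self.max_load[host] - self._allocated(host, now, exclude=key)
        granted = max(0.0, min(wanted, free))
        self.leases[key] = self._make_lease(crawler_id, host, granted, now)
        return self.leases[key]

    def update_due(self, now):
        """Refresh every lease whose update time has arrived and which is unexpired."""
        # Drop leases that have already expired.
        for key in [k for k, l in self.leases.items() if l.expire_time <= now]:
            del self.leases[key]
        for key, lease in list(self.leases.items()):
            if lease.update_time > now:
                continue
            holders = sum(1 for l in self.leases.values() if l.host == lease.host)
            fair_share = self.max_load[lease.host] / holders  # assumed policy
            free = self.max_load[lease.host] - self._allocated(
                lease.host, now, exclude=key)
            share = max(0.0, min(fair_share, free))
            self.leases[key] = self._make_lease(
                lease.crawler_id, lease.host, share, now)
```

In this sketch, two crawlers that compete for a host limited to 10 requests per unit time would first receive whatever capacity is free at request time, and would be rebalanced toward equal shares once their update times arrive. The new expire time is always later than the update time that triggered the refresh.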
