Adaptive crawl rates based on publication frequency

  • US 8,255,385 B1
  • Filed: 03/22/2011
  • Issued: 08/28/2012
  • Est. Priority Date: 03/22/2011
  • Status: Active Grant
First Claim

1. A system for adaptively deploying a Web-crawler to a Web source at a crawl rate based on historical publication data for the Web source, the system comprising:

  • a computing device associated with a search engine having one or more processors and one or more computer-readable storage media; and

    a data store coupled with the search engine, wherein the search engine:

    determines an update frequency estimation for a particular Web page, the update frequency estimation being determined according to:


    $$F_i = \sum_{k=1}^{i} w_k F_{i-k}$$

    A) wherein $F_i$ is the update frequency estimation for the particular Web page for a time period, $i$,
    B) wherein $w_k$ is a weight factor given to a particular time segment, $k$, included in the time period, and
    C) wherein $F_{i-k}$ is a publication rate for a given time segment included in the current time period; and

    calculates the adaptive crawl rate by multiplying the update frequency estimation for the particular Web page by a constant.
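
The two computations recited in the claim (a weighted sum of past publication rates, then multiplication by a constant) can be illustrated with a minimal Python sketch. The function names, the ordering of the input lists, and the sample weights, rates, and constant are assumptions chosen for illustration; the claim does not specify any of these values or any units.

```python
from typing import Sequence


def update_frequency_estimate(
    past_rates: Sequence[float], weights: Sequence[float]
) -> float:
    """Compute F_i = sum over k = 1..i of w_k * F_(i-k).

    Assumed layout (not specified by the claim):
    past_rates[k - 1] holds F_(i-k), the publication rate k segments back,
    and weights[k - 1] holds the matching weight factor w_k.
    """
    if len(past_rates) != len(weights):
        raise ValueError("need exactly one weight per time segment")
    return sum(w * f for w, f in zip(weights, past_rates))


def adaptive_crawl_rate(estimate: float, constant: float) -> float:
    """Multiply the update frequency estimation by a constant."""
    return constant * estimate


# Hypothetical example: three past segments, most recent first, with
# illustrative weights that favor the most recent segment.
rates = [4.0, 2.0, 6.0]        # F_(i-1), F_(i-2), F_(i-3)
weights = [0.5, 0.25, 0.25]    # w_1, w_2, w_3
estimate = update_frequency_estimate(rates, weights)      # 4.0
crawl_rate = adaptive_crawl_rate(estimate, constant=2.0)  # 8.0
```

With these sample inputs the estimate is 0.5·4 + 0.25·2 + 0.25·6 = 4, and a constant of 2 gives a crawl rate of 8. Weighting recent segments more heavily is one plausible choice, not something the claim requires.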
