Scheduler for Search Engine Crawler

  • US 20100241621A1
  • Filed: 05/25/2010
  • Published: 09/23/2010
  • Est. Priority Date: 07/03/2003
  • Status: Active Grant
First Claim
1. A method of scheduling document indexing, comprising:

  • at a search engine crawler system having one or more processors and memory storing programs for execution by the one or more processors;

    retrieving a number of document identifiers, each document identifier identifying a corresponding document on a network; and

    for each retrieved document identifier and its corresponding document, determining a query-independent score indicative of a rank of the corresponding document relative to other documents in a set of documents;

    determining a content change frequency of the corresponding document by comparing information stored for successive downloads of the corresponding document;

    determining a first score for the document identifier that is a function of both the determined query-independent score and the determined content change frequency of the corresponding document;

    comparing the first score against a threshold value; and

    conditionally scheduling the document for indexing based on the result of the comparison.
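
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim says only that the first score is "a function of both" inputs, so the product used here is an assumption, as are the use of content checksums as the "information stored for successive downloads", the direction of the threshold test, and every name (`DocumentRecord`, `content_change_frequency`, `first_score`, `schedule_for_indexing`).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentRecord:
    """Hypothetical per-document state kept by the crawler."""
    url: str                        # the document identifier
    query_independent_score: float  # e.g. a link-based rank; assumed to lie in [0, 1]
    checksums: List[str]            # one content checksum per download, oldest first


def content_change_frequency(checksums: List[str]) -> float:
    """Fraction of successive download pairs whose checksums differ.

    Returns 0.0 when fewer than two downloads are recorded, since no
    comparison is possible yet (an assumption of this sketch).
    """
    if len(checksums) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(checksums, checksums[1:]) if prev != cur)
    return changes / (len(checksums) - 1)


def first_score(record: DocumentRecord) -> float:
    """Combine rank and change frequency; the product is an assumed choice."""
    return record.query_independent_score * content_change_frequency(record.checksums)


def schedule_for_indexing(records: List[DocumentRecord], threshold: float) -> List[str]:
    """Return identifiers whose first score meets the threshold (assumed: >=)."""
    return [r.url for r in records if first_score(r) >= threshold]


# Example usage with made-up data.
records = [
    DocumentRecord("http://example.com/news", 0.9, ["a", "b", "c"]),
    DocumentRecord("http://example.com/static", 0.9, ["a", "a", "a"]),
    DocumentRecord("http://example.com/minor", 0.2, ["a", "b", "b"]),
]
scheduled = schedule_for_indexing(records, threshold=0.5)
```

With these made-up records, only the highly ranked, frequently changing page clears the threshold. A highly ranked page that never changes and a low-ranked page that changes occasionally are both skipped.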
