Document reuse in a search engine crawler

  • US 10,216,847 B2
  • Filed: 06/08/2017
  • Issued: 02/26/2019
  • Est. Priority Date: 07/03/2003
  • Status: Active Grant
First Claim

1. A method, comprising:

  • at a computing system having one or more processors and memory storing one or more programs executed by the one or more processors:

    retrieving a plurality of records corresponding to prior scheduled crawls of respective documents in a plurality of documents; and

    performing a document crawling operation on the plurality of documents, wherein the document crawling operation includes downloading a current version of a respective document from a host computer based on a determination that a document importance score for the respective document exceeds a first threshold, wherein the document importance score is based on a query-independent metric of an importance of the document for a search engine, or reusing a previously downloaded version of a respective document in the plurality of documents instead of downloading a current version of the respective document from a host computer based on a determination that the document importance score does not exceed a first threshold.
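
The download-or-reuse decision recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation. The names `CrawlRecord`, `crawl_documents`, and `download`, and the way the score and the cached copy are stored on each record, are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class CrawlRecord:
    """Hypothetical record of a prior scheduled crawl of one document."""
    url: str
    importance_score: float  # query-independent importance metric (assumed precomputed)
    cached_content: str      # previously downloaded version of the document


def crawl_documents(
    records: Iterable[CrawlRecord],
    threshold: float,
    download: Callable[[str], str],
) -> Dict[str, str]:
    """Return content per URL, downloading only documents whose score exceeds the threshold."""
    results: Dict[str, str] = {}
    for record in records:
        if record.importance_score > threshold:
            # Score exceeds the threshold: fetch the current version from the host.
            results[record.url] = download(record.url)
        else:
            # Score does not exceed the threshold: reuse the previously downloaded version.
            results[record.url] = record.cached_content
    return results
```

In this sketch `download` is passed in as a callable so the decision logic can be exercised without network access. A document whose score equals the threshold is reused, since the claim conditions downloading on the score exceeding the threshold.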
