Method and system for incremental web crawling

  • US 6,631,369 B1
  • Filed: 06/30/1999
  • Issued: 10/07/2003
  • Est. Priority Date: 06/30/1999
  • Status: Expired due to Term
First Claim

1. A computer-based method for performing an incremental crawl of a computer-readable document store in a manner that facilitates an efficient determination of whether and how the document store has been incremented from a prior state, comprising the following acts:

    (a) determining from the document store whether a deleted documents count (DDC) for a first folder has changed from a value of the DDC as determined during a previous crawl of the document store;

    (b) if the DDC has changed, identifying the documents that have been deleted from the first folder subsequent to the previous crawl; and

    (c) if the DDC has not changed, determining whether a maximum local commit time (MLCT) associated with the first folder is later than a value of the MLCT as determined during the previous crawl, and, if it is later, identifying the documents that have been added to the folder or modified subsequent to the previous crawl.
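
The decision logic of acts (a) through (c) can be illustrated with a minimal sketch. This is not the patented implementation: the `Folder` and `CrawlState` classes, the integer commit times, and the choice to leave the saved MLCT untouched after a deletion pass (so that a follow-up pass picks up additions and modifications) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Folder:
    """Hypothetical in-memory stand-in for one folder of the document store."""
    ddc: int = 0                      # deleted documents count (DDC)
    mlct: int = 0                     # maximum local commit time (MLCT)
    docs: Dict[str, int] = field(default_factory=dict)  # doc id -> commit time

    def commit(self, doc_id: str, time: int) -> None:
        # Adding or modifying a document advances the folder's MLCT.
        self.docs[doc_id] = time
        self.mlct = max(self.mlct, time)

    def delete(self, doc_id: str) -> None:
        # Deleting a document increments the folder's DDC.
        del self.docs[doc_id]
        self.ddc += 1


@dataclass
class CrawlState:
    """Hypothetical record of what the previous crawl observed."""
    ddc: int = 0
    mlct: int = 0
    doc_ids: Set[str] = field(default_factory=set)


def incremental_crawl(
    folder: Folder, prev: CrawlState
) -> Tuple[List[str], List[str], CrawlState]:
    """Return (deleted ids, added-or-modified ids, new crawl state)."""
    # Act (a): compare the current DDC with the value from the previous crawl.
    if folder.ddc != prev.ddc:
        # Act (b): DDC changed, so identify the deleted documents.
        deleted = sorted(prev.doc_ids - set(folder.docs))
        # Assumption: keep the old MLCT so a later pass still sees new commits.
        state = CrawlState(folder.ddc, prev.mlct, prev.doc_ids - set(deleted))
        return deleted, [], state
    # Act (c): DDC unchanged, so check whether the MLCT has moved forward.
    if folder.mlct > prev.mlct:
        changed = sorted(d for d, t in folder.docs.items() if t > prev.mlct)
        state = CrawlState(folder.ddc, folder.mlct, prev.doc_ids | set(changed))
        return [], changed, state
    # Neither value changed: nothing in the folder needs to be re-examined.
    return [], [], prev
```

In this sketch an unchanged folder costs only two integer comparisons, which is the efficiency the claim is aiming at: per-document work happens only when the DDC or the MLCT indicates that something changed.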
