Method of web crawling utilizing crawl numbers

  • US 6,638,314 B1
  • Filed: 06/26/1998
  • Issued: 10/28/2003
  • Est. Priority Date: 06/26/1998
  • Status: Expired due to Term
First Claim

1. A computer based method of retrieving information from a computer network (Web) having a plurality of electronic documents stored thereon, wherein each electronic document has a corresponding document address specification that provides information for locating the electronic document, the method including performing a current Web crawl comprising:

    assigning a current crawl number to the current Web crawl, said current crawl number being the next number in a numerical sequence of numbers;

    determining whether an electronic document has been retrieved during a previous Web crawl and associated with a crawl number modified;

    if the electronic document has not been retrieved during a previous Web crawl and associated with a crawl number modified, associating the current crawl number with the electronic document as its crawl number modified;

    if the electronic document has been retrieved during a previous Web crawl and associated with a crawl number modified, determining whether the actual content of the electronic document has been modified subsequent to the previous retrieval; and

    if the actual content of the electronic document has been modified subsequent to the previous retrieval, associating the current crawl number with the electronic document as its crawl number modified.
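
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The class and method names (`CrawlHistory`, `DocRecord`, `begin_crawl`, `record_document`) are hypothetical, the crawl numbers are assumed to be integers that increase by one, and comparing SHA-256 digests is an assumed way of deciding whether the actual content has been modified; the claim does not specify how that determination is made.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DocRecord:
    """What is remembered about one document address (hypothetical layout)."""
    content_hash: str           # digest of the content seen at the last change
    crawl_number_modified: int  # crawl number in which that change was seen


@dataclass
class CrawlHistory:
    """Tracks crawl numbers and a per-document 'crawl number modified'."""
    last_crawl_number: int = 0
    records: Dict[str, DocRecord] = field(default_factory=dict)

    def begin_crawl(self) -> int:
        # Assign the next number in the sequence to the current crawl.
        self.last_crawl_number += 1
        return self.last_crawl_number

    def record_document(self, address: str, content: bytes,
                        crawl_number: int) -> int:
        # Assumption: content change is detected by comparing digests.
        digest = hashlib.sha256(content).hexdigest()
        record = self.records.get(address)
        if record is None:
            # Not retrieved in any previous crawl: stamp with current number.
            self.records[address] = DocRecord(digest, crawl_number)
        elif record.content_hash != digest:
            # Retrieved before and content changed: re-stamp.
            record.content_hash = digest
            record.crawl_number_modified = crawl_number
        # Retrieved before and unchanged: the earlier stamp is kept.
        return self.records[address].crawl_number_modified


history = CrawlHistory()
first = history.begin_crawl()
history.record_document("http://example.com/a", b"version 1", first)
second = history.begin_crawl()
# Unchanged content keeps the crawl number modified from the first crawl.
stamp = history.record_document("http://example.com/a", b"version 1", second)
print(first, second, stamp)
```

Under these assumptions, a document whose stored crawl number modified equals the current crawl number is either new or changed in this crawl, while a smaller value means it was retrieved earlier and has not changed since.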
