Unsupervised, automated web host dynamicity detection, dead link detection and prerequisite page discovery for search indexed web pages

  • US 20060294052A1
  • Filed: 08/13/2005
  • Published: 12/28/2006
  • Est. Priority Date: 06/28/2005
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

  • retrieving, from a site host, pages associated with the site, wherein the pages contain content;

  • determining how dynamic the content of the site is, based on the degree to which the content of the retrieved pages changed since a previous crawl of the site;

  • if the content of the site is determined dynamic, in relation to a corresponding threshold, then continuing retrieving, from the site host, pages associated with the site; and

  • if the content of the site is determined not dynamic, in relation to the corresponding threshold, then not retrieving, from the site host, a subset of pages associated with the domain.
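
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say how the "degree" of change is measured, so the sketch assumes it is the fraction of sampled pages whose SHA-256 content hash differs from the previous crawl. The names `content_hash`, `dynamicity`, `crawl_site`, the `fetch` callable, the sample/remaining split of URLs, and the default threshold of 0.2 are all hypothetical.

```python
import hashlib
from typing import Callable, Dict, Iterable, Tuple


def content_hash(content: str) -> str:
    """Fingerprint page content so two crawls can be compared cheaply."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def dynamicity(previous: Dict[str, str], current: Dict[str, str]) -> float:
    """Fraction of pages seen in both crawls whose hash changed (assumed metric)."""
    shared = [url for url in current if url in previous]
    if not shared:
        return 1.0  # assumption: with no crawl history, treat the site as dynamic
    changed = sum(1 for url in shared if previous[url] != current[url])
    return changed / len(shared)


def crawl_site(
    sample_urls: Iterable[str],
    remaining_urls: Iterable[str],
    fetch: Callable[[str], str],
    previous_hashes: Dict[str, str],
    threshold: float = 0.2,  # hypothetical value; the claim names no number
) -> Tuple[float, Dict[str, str]]:
    """Retrieve a sample, score the site, then fetch or skip the remaining pages."""
    # Step 1: retrieve pages associated with the site.
    current = {url: content_hash(fetch(url)) for url in sample_urls}
    # Step 2: determine how dynamic the content is relative to the previous crawl.
    score = dynamicity(previous_hashes, current)
    # Step 3: if dynamic, continue retrieving pages from the site host.
    if score >= threshold:
        for url in remaining_urls:
            current[url] = content_hash(fetch(url))
    # Step 4: otherwise the remaining subset of pages is not retrieved.
    return score, current
```

As a usage example, calling `crawl_site(["a", "b"], ["c", "d"], fetch, previous_hashes)` on a site whose sampled pages are unchanged returns a score of 0.0 and hashes for only `a` and `b`. If one of the two sampled pages has changed, the score is 0.5 and all four pages are retrieved.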
