URL AND ANCHOR TEXT ANALYSIS FOR FOCUSED CRAWLING

  • US 20100293116A1
  • Filed: 11/08/2007
  • Published: 11/18/2010
  • Est. Priority Date: 11/08/2007
  • Status: Abandoned Application
First Claim

1. A method of Uniform Resource Locator (URL) and anchor text analysis for focused crawling, comprising:

  • training a focused crawler by:

    obtaining a training set for a website;

    computing a score for the training set of at least URLs or anchor text;

    extracting a plurality of features of the training set, the features identifying key information contained in the website; and

    computing a score for each of the plurality of features; and

    executing a trained focused crawler on other websites.
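
The training and execution steps recited above can be illustrated with a minimal sketch. The claim does not say how features are extracted or how scores are computed, so everything here is an assumption made for illustration: features are taken to be lowercase tokens from the URL and the anchor text, each feature's score is a smoothed log-odds of appearing in relevant versus irrelevant training links, and the hypothetical helpers `tokenize`, `train`, and `score_link` are not names from the application.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a URL or anchor text into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def extract_features(url, anchor_text):
    """Assumed feature set: the distinct tokens of the URL and anchor text."""
    return set(tokenize(url)) | set(tokenize(anchor_text))


def train(training_set):
    """Compute a score for each feature seen in the training set.

    training_set: iterable of (url, anchor_text, is_relevant) tuples.
    Returns a dict mapping feature -> smoothed log-odds score (an assumed
    scoring rule, not one taken from the claim).
    """
    relevant, irrelevant = Counter(), Counter()
    n_relevant = n_irrelevant = 0
    for url, anchor_text, is_relevant in training_set:
        features = extract_features(url, anchor_text)
        if is_relevant:
            relevant.update(features)
            n_relevant += 1
        else:
            irrelevant.update(features)
            n_irrelevant += 1

    scores = {}
    for feature in set(relevant) | set(irrelevant):
        # Add-one smoothing keeps the ratio finite for one-sided features.
        p = (relevant[feature] + 1) / (n_relevant + 2)
        q = (irrelevant[feature] + 1) / (n_irrelevant + 2)
        scores[feature] = math.log(p / q)
    return scores


def score_link(scores, url, anchor_text):
    """Score a link on another website; unseen features contribute 0."""
    return sum(scores.get(f, 0.0) for f in extract_features(url, anchor_text))
```

Under these assumptions, a crawler run on other websites would follow links in descending `score_link` order, or skip links whose score falls below a chosen threshold.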
