Extracting structured data from weblogs

  • US 20060287989A1
  • Filed: 06/16/2006
  • Published: 12/21/2006
  • Est. Priority Date: 06/16/2005
  • Status: Active Grant
First Claim

1. A method of extracting individual posts from a weblog, comprising the steps of:

    (a) accessing the home page of the weblog;

    (b) identifying at least one feed associated with the weblog;

    (c) determining whether the feed contains sufficient content for performing feed-guided segmentation;

    (d) if the feed contains sufficient content for feed-guided segmentation, determining whether the feed contains full content or partial content of the weblog;

    (e) if the feed contains full content of the weblog, mapping the data found in the feed into a representation for weblog posts; and

    (f) if the feed contains partial content of the weblog, screen scraping the weblog into a representation for weblog posts using the feed data.
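
The control flow of steps (a) through (f) can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not specify any of the details below; they are assumptions made for illustration. In particular: feeds are discovered through `<link rel="alternate">` tags, only RSS 2.0 `<item>` elements are read, the "sufficient content" and "full versus partial" tests use the made-up thresholds `min_items` and `min_full_chars`, and screen scraping is reduced to cutting the page text at the post titles found in the feed. All names (`Post`, `HomePageParser`, `classify_feed`, `scrape_with_feed`, `extract_posts`, `fetch`) are hypothetical.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


@dataclass
class Post:
    """Hypothetical representation of a weblog post."""
    title: str
    link: str
    body: str


class HomePageParser(HTMLParser):
    """Collects advertised feed URLs (step b) and the page's text nodes."""

    def __init__(self):
        super().__init__()
        self.feeds = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        kind = (a.get("type") or "").lower()
        if tag == "link" and "alternate" in rel and kind in FEED_TYPES and a.get("href"):
            self.feeds.append(a["href"])

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def parse_rss(feed_xml):
    """Read RSS 2.0 <item> elements into Post records (Atom is not handled)."""
    root = ET.fromstring(feed_xml)
    return [
        Post(
            (item.findtext("title") or "").strip(),
            (item.findtext("link") or "").strip(),
            (item.findtext("description") or "").strip(),
        )
        for item in root.iter("item")
    ]


def classify_feed(posts, min_items=1, min_full_chars=200):
    """Steps (c) and (d). Both thresholds are assumptions, not from the claim."""
    titled = [p for p in posts if p.title]
    if len(titled) < min_items:
        return "insufficient"
    truncated = any(
        len(p.body) < min_full_chars or p.body.endswith(("...", "…", "[...]"))
        for p in titled
    )
    return "partial" if truncated else "full"


def scrape_with_feed(page_text, posts):
    """Step (f), simplified: use feed titles as anchors to cut the page text."""
    hits = sorted(
        ((page_text.find(p.title), p) for p in posts if p.title and p.title in page_text),
        key=lambda hit: hit[0],
    )
    result = []
    for i, (start, post) in enumerate(hits):
        end = hits[i + 1][0] if i + 1 < len(hits) else len(page_text)
        body = page_text[start + len(post.title):end].strip()
        result.append(Post(post.title, post.link, body))
    return result


def extract_posts(home_url, fetch):
    """Steps (a)-(f). `fetch` is a caller-supplied function: URL -> text."""
    page = HomePageParser()
    page.feed(fetch(home_url))                           # step (a)
    for href in page.feeds:                              # step (b)
        posts = parse_rss(fetch(urljoin(home_url, href)))
        kind = classify_feed(posts)                      # steps (c), (d)
        if kind == "full":
            return posts                                 # step (e)
        if kind == "partial":
            return scrape_with_feed(" ".join(page.text_parts), posts)  # step (f)
    return []  # no usable feed; claim 1 does not describe this case
```

Passing `fetch` in as a parameter keeps the sketch free of network code: a caller could supply a function built on `urllib.request`, or a dictionary lookup when experimenting with saved pages.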

  • 4 Assignments