EXTRACTING STRUCTURED DATA FROM WEB FORUMS

  • US 20100211533A1
  • Filed: 02/18/2009
  • Published: 08/19/2010
  • Est. Priority Date: 02/18/2009
  • Status: Abandoned Application
First Claim

1. A computer-implemented process for extracting structured data from web forums, comprising:

    training a model for predicting the probability of given data structures existing in a web forum by using training web forum sites, and an associated set of features and a web forum sitemap for each of the training web forum sites;

    inputting a set of one or more target web forum sites and associated target web forum sitemaps;

    extracting features from the one or more input target web forum sites using the associated target web forum sitemaps; and

    using the trained model and the extracted features from the one or more input web forum sites to extract data from the one or more input target web forum sites.
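
The four steps of the claim (train, input targets, extract features, apply the model) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: a site is modeled as a dict of URL to page text, a sitemap as a dict of vertex name to the URLs under it, the two boolean features (`has_time`, `many_links`) are hypothetical, and a small naive Bayes classifier stands in for whatever probabilistic model the application actually uses.

```python
from collections import Counter, defaultdict
import math


def extract_features(site, sitemap):
    """Return {vertex: feature tuple} for each sitemap vertex.

    Assumes each vertex lists at least one URL. Both features are
    hypothetical stand-ins for the claim's "set of features".
    """
    feats = {}
    for vertex, urls in sitemap.items():
        pages = [site[u] for u in urls]
        has_time = any("posted:" in p for p in pages)
        many_links = sum(p.count("href") for p in pages) / len(pages) >= 3
        feats[vertex] = (("has_time", has_time), ("many_links", many_links))
    return feats


def train(training_sites):
    """Fit naive Bayes counts from (site, sitemap, labels) triples.

    labels maps each sitemap vertex to the data structure it holds.
    """
    label_counts = Counter()
    feat_counts = defaultdict(Counter)
    for site, sitemap, labels in training_sites:
        for vertex, feats in extract_features(site, sitemap).items():
            label = labels[vertex]
            label_counts[label] += 1
            for f in feats:
                feat_counts[label][f] += 1
    return label_counts, feat_counts


def predict_proba(model, feats):
    """Return {label: probability} for one feature tuple."""
    label_counts, feat_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        logp = math.log(n / total)
        for f in feats:
            # Laplace smoothing over the two values of a boolean feature.
            logp += math.log((feat_counts[label][f] + 1) / (n + 2))
        scores[label] = logp
    top = max(scores.values())
    exp = {label: math.exp(s - top) for label, s in scores.items()}
    norm = sum(exp.values())
    return {label: v / norm for label, v in exp.items()}


def extract(model, site, sitemap):
    """Label every vertex of a target site with its most probable structure."""
    result = {}
    for vertex, feats in extract_features(site, sitemap).items():
        probs = predict_proba(model, feats)
        result[vertex] = max(probs, key=probs.get)
    return result


# Toy training site: one board page (link-heavy) and one thread page.
train_site = {"/f1": "<a href=1><a href=2><a href=3>", "/t1": "posted: 2009 hello"}
train_map = {"board": ["/f1"], "thread": ["/t1"]}
train_labels = {"board": "thread_list", "thread": "post_list"}
model = train([(train_site, train_map, train_labels)])

# Toy target site with its own sitemap.
target_site = {"/x": "<a href=a><a href=b><a href=c><a href=d>", "/y": "posted: 2010 reply"}
target_map = {"v1": ["/x"], "v2": ["/y"]}
print(extract(model, target_site, target_map))
```

In this sketch the sitemap only groups pages so that features are computed per vertex rather than per page; a fuller treatment would presumably also use the links between vertices.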
