Intellegent data search engine

  • US 8,190,556 B2
  • Filed: 09/08/2009
  • Issued: 05/29/2012
  • Est. Priority Date: 08/24/2006
  • Status: Active Grant
First Claim

1. A computer-implemented method for extracting product information displayed on a plurality of web pages on a first remote web site representing a first database and storing said product information in a second database, comprising:

  • requesting and, in response to said request, receiving said plurality of web pages related to one or more products from said first remote website;

  • automatically identifying a first cluster of web pages associated with said first remote web site that comprises product information from said first database, said product information displayed as data field values (DFVs) related to the one or more products and exhibiting common identification characteristics in each web page of said first cluster of web pages;

  • structurally comparing a first sample of web pages within said first cluster of web pages and creating an intersection data structure comprising the structural location of said DFVs of database records (DBRs), and inferring data field names (DFNs) associated with keywords, symbols, or patterns, indicating the location of said DFVs in said intersection data structure;

  • automatically deriving a first extraction template from said intersection data structure associated with said first sample of web pages within said first cluster of web pages;

  • utilizing said first extraction template for extracting said DFVs from said first cluster of web pages to create, in association with said inferred DFNs, extracted DBRs; and

  • storing said extracted DBRs in said second database.
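
The comparison, template, extraction, and storage steps of claim 1 can be illustrated with a small Python sketch. It is a reading aid under simplifying assumptions, not the patented implementation: the pages are inline strings rather than responses from a remote web site, the cluster is assumed to have been identified already, the markup is assumed to be well formed with no void elements, and a DFN is inferred simply as the nearest preceding text that stays constant across the sample. All names (`PathTextParser`, `build_intersection`, `derive_template`, `extract_record`, `store_records`, the `dbr` table, and the sample markup) are hypothetical.

```python
import sqlite3
from html.parser import HTMLParser


class PathTextParser(HTMLParser):
    """Collects (structural path, text) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []     # positional names of the currently open tags
        self.counts = [{}]  # one sibling counter per open nesting level
        self.items = []     # (path, text) pairs

    def handle_starttag(self, tag, attrs):
        siblings = self.counts[-1]
        siblings[tag] = siblings.get(tag, 0) + 1
        self.stack.append(f"{tag}[{siblings[tag]}]")
        self.counts.append({})

    def handle_endtag(self, tag):
        # Assumes well-formed markup: every end tag closes the innermost open tag.
        if self.stack:
            self.stack.pop()
            self.counts.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.items.append(("/".join(self.stack), text))


def page_items(html):
    parser = PathTextParser()
    parser.feed(html)
    parser.close()
    return parser.items


def build_intersection(sample_pages):
    """Keep paths present in every sample page; flag each as constant or varying."""
    parsed = [dict(page_items(page)) for page in sample_pages]
    order = [path for path, _ in page_items(sample_pages[0])]
    shared = [path for path in order if all(path in d for d in parsed)]
    # Each entry: (path, text varies across samples?, text in the first sample)
    return [(p, len({d[p] for d in parsed}) > 1, parsed[0][p]) for p in shared]


def derive_template(intersection):
    """Map each varying path (a DFV slot) to the preceding constant text (inferred DFN)."""
    template, last_label = {}, None
    for path, varies, text in intersection:
        if varies:
            if last_label:
                template[path] = last_label
        else:
            last_label = text.rstrip(":").strip()
    return template


def extract_record(template, html):
    """Apply the template to one page, producing one DBR as a DFN -> DFV dict."""
    found = dict(page_items(html))
    return {name: found.get(path) for path, name in template.items()}


def store_records(conn, records):
    """Store extracted DBRs in a generic key-value table (the 'second database')."""
    conn.execute("CREATE TABLE IF NOT EXISTS dbr (page INTEGER, dfn TEXT, dfv TEXT)")
    for page_no, record in enumerate(records):
        conn.executemany(
            "INSERT INTO dbr VALUES (?, ?, ?)",
            [(page_no, dfn, dfv) for dfn, dfv in record.items()],
        )
    conn.commit()


# Hypothetical product pages that share one layout (the "cluster").
PAGE = ("<html><body><dl><dt>Name:</dt><dd>{name}</dd>"
        "<dt>Price:</dt><dd>{price}</dd></dl></body></html>")
pages = [PAGE.format(name=n, price=p)
         for n, p in [("Lamp", "19.99"), ("Desk", "149.00"), ("Chair", "59.50")]]

template = derive_template(build_intersection(pages[:2]))  # sample = first two pages
records = [extract_record(template, page) for page in pages]
conn = sqlite3.connect(":memory:")
store_records(conn, records)
```

In this sketch the template is derived from the first two pages only and then applied to all three, mirroring the claim's distinction between the sample and the full cluster. A key-value table is used so that inferred DFNs never have to serve as SQL column names.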
