Method and system for crawling, mapping and extracting information associated with a business using heuristic and semantic analysis

  • US 20090119268A1
  • Filed: 11/04/2008
  • Published: 05/07/2009
  • Est. Priority Date: 11/05/2007
  • Status: Active Grant
First Claim

1. A system for crawling, mapping, extracting and associating information on a web page with a known business, the system comprising:

    means for identifying the website address of a web page to be crawled;

    means for downloading and storing the contents of said web page;

    means for evaluating the stored content and identifying a unique identification symbol within said stored content and associated with a business referenced on said web page;

    means for transforming the stored content to a normalized form;

    means for identifying one or more potential businesses that may be associated with said normalized content;

    means for confidently selecting one business, from among said one or more potential businesses, to be associated with said normalized content;

    means for mapping said unique identification symbol with said confidently selected business;

    means for extracting one or more elements from said stored contents of said web page in accordance with an extraction template formed in accordance with the structure of said web page, said elements comprising data about the business referenced on said web page;

    means for associating said extracted elements about the business referenced on said web page with the confidently selected one business; and

    means for publishing the results of said association of said extracted elements with said confidently selected one business.
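
The claim recites its elements only functionally, so the following is a minimal sketch of how the normalization, candidate identification, confident selection, mapping, and template-based extraction steps might fit together. It is not the patented implementation. Every name here (`KNOWN_BUSINESSES`, `TEMPLATE`, `normalize`, `select_confidently`, `process_page`), the sample data, the regex-based template, the string-similarity scoring, and the threshold values are assumptions made for illustration. Downloading is omitted, and the page content is passed in as a string.

```python
import re
from difflib import SequenceMatcher

# Hypothetical directory of known businesses (illustrative data only).
KNOWN_BUSINESSES = {
    "B1": "Joe's Pizza, 12 Main St, Springfield",
    "B2": "Joe's Plumbing, 48 Oak Ave, Springfield",
}

# Hypothetical extraction template for one site layout:
# field name -> regex with a single capture group.
TEMPLATE = {
    "listing_id": r'data-listing-id="(\d+)"',   # the page's unique identifier
    "name": r"<h1>([^<]+)</h1>",
    "address": r'<p class="address">([^<]+)</p>',
    "phone": r'<span class="phone">([^<]+)</span>',
}

def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def extract(html, template):
    """Apply each template pattern and keep the fields that match."""
    fields = {}
    for field, pattern in template.items():
        match = re.search(pattern, html)
        if match:
            fields[field] = match.group(1).strip()
    return fields

def score_candidates(normalized, known):
    """Score every known business against the normalized content, best first."""
    scored = [
        (SequenceMatcher(None, normalized, normalize(desc)).ratio(), biz_id)
        for biz_id, desc in known.items()
    ]
    return sorted(scored, reverse=True)

def select_confidently(scored, min_score=0.8, min_margin=0.1):
    """Return the top candidate only if it scores high enough and clearly
    beats the runner-up; otherwise return None (no confident match)."""
    if not scored:
        return None
    best_score, best_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= min_margin:
        return best_id
    return None

def process_page(html, known=KNOWN_BUSINESSES, template=TEMPLATE):
    """Extract, normalize, match, and return a publishable record (or None)."""
    fields = extract(html, template)
    text = normalize(fields.get("name", "") + " " + fields.get("address", ""))
    business_id = select_confidently(score_candidates(text, known))
    if business_id is None:
        return None
    return {
        "business_id": business_id,
        "id_mapping": {fields.get("listing_id"): business_id},
        "elements": {"phone": fields.get("phone")},
    }

PAGE = (
    '<div data-listing-id="9041"><h1>Joe\'s Pizza</h1>'
    '<p class="address">12 Main St., Springfield</p>'
    '<span class="phone">555-0100</span></div>'
)
record = process_page(PAGE)
```

In this sketch, a candidate is accepted only when it both clears a minimum score and leads the runner-up by a margin. That is one plausible reading of "confidently selecting"; the claim itself does not specify a scoring method.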
