USING STRUCTURED DATABASE FOR WEBPAGE INFORMATION EXTRACTION

  • US 20080281827A1
  • Filed: 05/10/2007
  • Published: 11/13/2008
  • Est. Priority Date: 05/10/2007
  • Status: Abandoned Application
First Claim

1. A computer-implemented method of obtaining webpage training samples, the method comprising:

  • accessing a structured database having a plurality of entries, wherein each entry comprises a plurality of fields, one of the fields comprising a URL (uniform resource locator) and another one of the fields comprising first information at least similar to second information to be located in a webpage associated with the URL; and

    for each of the plurality of entries in the structured database, retrieving a webpage associated with the URL; and

    analyzing the webpage to find the second information therein corresponding to the first information in the structured database, and if the second information is found in the webpage, storing information indicative of the webpage as a training sample.
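
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `TrainingSample`, `collect_training_samples`, `fetch`, `strip_tags`, and `normalize` are hypothetical, the structured database is assumed to be an iterable of dictionaries with a `url` field, and the claim's "at least similar" test is approximated here by a case-insensitive, whitespace-normalized substring match.

```python
import re
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class TrainingSample:
    """Hypothetical record of "information indicative of the webpage"."""
    url: str
    field: str   # database field whose value was found in the page
    value: str   # the first information, as stored in the database


def normalize(text: str) -> str:
    # Collapse runs of whitespace and lowercase, so minor formatting
    # differences do not prevent a match (an assumed similarity rule).
    return " ".join(text.split()).lower()


def strip_tags(html: str) -> str:
    # Crude tag removal; a real system would likely use an HTML parser.
    return re.sub(r"<[^>]+>", " ", html)


def fetch(url: str) -> str:
    # Retrieve the webpage associated with the URL.
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def collect_training_samples(
    entries: Iterable[Dict[str, str]],
    fetch_page: Callable[[str], str] = fetch,
    url_field: str = "url",
) -> List[TrainingSample]:
    samples: List[TrainingSample] = []
    for entry in entries:                      # for each database entry
        try:
            html = fetch_page(entry[url_field])
        except OSError:
            continue                           # skip unreachable pages
        page_text = normalize(strip_tags(html))
        for name, value in entry.items():
            if name == url_field or not value.strip():
                continue
            # Look for second information corresponding to the first.
            if normalize(value) in page_text:
                samples.append(TrainingSample(entry[url_field], name, value))
    return samples
```

Passing `fetch_page` as a parameter lets the matching logic be exercised against canned pages without network access. A page yields a sample only when at least one non-URL field value is found in its text.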

  • 2 Assignments