
System and methodology for extraction and aggregation of data from dynamic content

  • US 8,060,518 B2
  • Filed: 06/26/2007
  • Issued: 11/15/2011
  • Est. Priority Date: 02/08/2000
  • Status: Expired due to Fees
First Claim

1. A method for extracting and structuring items of data from content available via the Internet, the method comprising:

  • receiving at a server information from a user's computer specifying at least (a) a Web page available via the Internet, (b) types of data to be extracted from the Web page, and (c) fields for structuring extracted items of data;

    retrieving the Web page from the Internet yielding retrieved content;

    processing the retrieved content including:

    (a) parsing one or more container objects in the retrieved content, wherein the container objects correspond to the types of data, and

    (b) creating feature tags based on content of respective ones of the container objects, and using at least one of the feature tags to extract data of the types specified according to the information received at the server yielding extracted data;

    mapping the extracted data to the fields so as to transform the extracted data into a structured format yielding structured data; and

    returning aspects of the structured data to the user's computer, wherein the feature tags each comprise a series of alphanumeric characters that represent (i) a container object type, (ii) a version of the container object, and (iii) an occurrence of the container object within the retrieved content.
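
The parsing, tagging, and mapping steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. The tag layout (`TYPE-V<version>-<occurrence>`), the set of HTML elements treated as container objects, the fixed version number, and the names `ContainerTagger` and `extract_structured` are all assumptions made for illustration; the claim only requires that a feature tag encode a container object type, a version, and an occurrence. Retrieval of the Web page is omitted, so the function takes already-retrieved HTML.

```python
from collections import defaultdict
from html.parser import HTMLParser


class ContainerTagger(HTMLParser):
    """Assigns a feature tag to each container element and collects its text."""

    # Assumption: which HTML elements count as "container objects".
    CONTAINERS = {"table", "ul", "ol", "div"}
    # Assumption: a fixed placeholder; the claim does not define "version".
    VERSION = 1

    def __init__(self):
        super().__init__()
        self.counts = defaultdict(int)  # occurrences seen per container type
        self.stack = []                 # open containers: (tag, feature_tag, text parts)
        self.tagged = {}                # feature tag -> list of text items

    def handle_starttag(self, tag, attrs):
        if tag in self.CONTAINERS:
            self.counts[tag] += 1
            # Hypothetical layout: type, version, occurrence.
            feature_tag = f"{tag.upper()}-V{self.VERSION}-{self.counts[tag]:03d}"
            self.stack.append((tag, feature_tag, []))

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Text belongs to every container that is currently open.
            for _, _, parts in self.stack:
                parts.append(text)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            _, feature_tag, parts = self.stack.pop()
            self.tagged[feature_tag] = parts


def extract_structured(html, feature_tag, fields):
    """Extract the items under one feature tag and map them onto named fields."""
    tagger = ContainerTagger()
    tagger.feed(html)
    tagger.close()
    values = tagger.tagged.get(feature_tag, [])
    return dict(zip(fields, values))
```

For example, given the content `<div>x</div><table><tr><td>Widget</td><td>9.99</td></tr></table>`, calling `extract_structured(content, "TABLE-V1-001", ["name", "price"])` selects the first table and maps its two cells onto the `name` and `price` fields.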
