System and methodology for extraction and aggregation of data from dynamic content
Abstract
A system and methodology for extraction and aggregation of data from dynamic content is described. In one embodiment, for example, a method is described for extracting and structuring items of data from content available via the Internet. The method comprises the steps of: receiving input of a user specifying at least one source of content available via the Internet, types of data to be extracted from the at least one source, and fields for structuring extracted items of data; retrieving content from the at least one source; parsing the retrieved content to extract items of data of the types specified by the user; and mapping the extracted items of data to the fields specified by the user so as to transform the extracted items of data into a structured format.
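The four steps in the abstract (receive the user's specification, retrieve, parse, map) can be illustrated with a minimal sketch. It is not the patented implementation. The names `ExtractionSpec`, `ClassTextParser`, and `extract_and_structure` are hypothetical, "types of data" are assumed to be HTML class names, and retrieval is stubbed with an inline page so the example runs without network access.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class ExtractionSpec:
    """Hypothetical container for the three user inputs named in the abstract."""
    source: str                 # where the content comes from (a URL in practice)
    data_types: list[str]       # kinds of items to extract (assumed: HTML class names)
    field_map: dict[str, str]   # data type -> field name in the structured output


class ClassTextParser(HTMLParser):
    """Collects the text of elements whose class attribute is a wanted data type."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # data type of the element currently being read, if any
        self.items = []       # (data_type, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.wanted:
            self.current = cls

    def handle_data(self, data):
        if self.current and data.strip():
            self.items.append((self.current, data.strip()))
            self.current = None


def extract_and_structure(spec, fetch):
    content = fetch(spec.source)               # retrieve content from the source
    parser = ClassTextParser(spec.data_types)
    parser.feed(content)                       # parse to extract the requested types
    parser.close()
    # map each extracted item to the field the user chose for it
    return {spec.field_map[kind]: text for kind, text in parser.items}


# Stand-in page; the "ad" span is not a requested type and is skipped.
PAGE = ('<div><span class="sym">ACME</span> <span class="px">12.50</span>'
        '<span class="ad">Buy now</span></div>')

spec = ExtractionSpec(
    source="http://example.com/quote",
    data_types=["sym", "px"],
    field_map={"sym": "ticker", "px": "last_price"},
)
record = extract_and_structure(spec, fetch=lambda url: PAGE)
print(record)
```

Passing `fetch` as a parameter keeps retrieval separate from parsing and mapping, so a real HTTP client could be substituted without touching the other steps.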
184 Citations
63 Claims
1. A method for extracting and structuring items of data from content available via the Internet, the method comprising:
receiving input of a user specifying at least one source of content available via the Internet, types of data to be extracted from said at least one source, and fields for structuring extracted items of data;
retrieving content from said at least one source;
parsing the retrieved content to extract items of data of the types specified by the user;
mapping the extracted items of data to the fields specified by the user so as to transform the extracted items of data into a structured format;
generating a feature tag for each extracted item of data, the feature tag identifying attributes of the item of data and the structured format of the item; and
in response to a subsequent request for an item of data, using the feature tag to obtain the item of data and transform it into the structured format.
Dependent claims: 2-19.
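The last two steps of claim 1 (generate a feature tag, then use it to serve a later request) might look like the following sketch. The claim does not fix a tag layout, so the `FeatureTag` fields, the use of a regular expression to locate the item, and the `obtain` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class FeatureTag:
    """Hypothetical tag: enough attributes to find an item again and re-shape it."""
    source: str    # where the item lives
    pattern: str   # regular expression with one capture group locating the item
    field: str     # field name in the structured output
    kind: str      # target type in the structured format: "str" or "float"


CONVERTERS = {"str": str, "float": float}


def obtain(tag, fetch):
    """Serve a later request: re-fetch the source, locate the item, re-structure it."""
    match = re.search(tag.pattern, fetch(tag.source))
    if match is None:
        raise LookupError(f"item for field {tag.field!r} not found at {tag.source}")
    return {tag.field: CONVERTERS[tag.kind](match.group(1))}


# Stand-in for dynamic content: the page text changes between requests.
pages = iter(["Last trade: 12.50 USD", "Last trade: 12.75 USD"])


def fetch(url):
    return next(pages)


tag = FeatureTag("http://example.com/quote", r"Last trade: ([\d.]+)", "last_price", "float")
first = obtain(tag, fetch)
later = obtain(tag, fetch)
print(first, later)
```

Because the tag records how to locate the item rather than the item's value, the second call returns the updated price in the same structured form.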
20. A method facilitating retrieval of an item of dynamic content available via the Internet, the method comprising:
receiving input of a user specifying an item of dynamic content available from a source of dynamic content available via the Internet and a format for the item of dynamic content;
generating a feature tag for an item of dynamic content, the feature tag including a plurality of characters with each character in the plurality of characters indicating an attribute of the item;
in response to a subsequent request for retrieval of the item, parsing the feature tag to identify a plurality of attributes of the item;
retrieving the item of dynamic content from the source of dynamic content based upon the plurality of attributes of the item; and
transforming the item of dynamic content into the format specified by the user.
Dependent claims: 21-43.
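Claim 20 describes a feature tag in which each character indicates one attribute of the item. One possible encoding is sketched below. The three-character layout (element kind, ordinal position, output format) and the code tables `ELEMENTS` and `FORMATS` are invented for this example and are not taken from the patent.

```python
import re

# Hypothetical one-character codes, one attribute per character position.
ELEMENTS = {"h": "h1", "p": "p", "s": "span"}                  # char 0: element kind
FORMATS = {"P": str.strip, "U": lambda s: s.strip().upper()}   # char 2: output format


def make_tag(element, index, fmt):
    """Build a three-character tag: element code, ordinal digit, format code."""
    codes = {name: code for code, name in ELEMENTS.items()}
    if not 0 <= index <= 9 or fmt not in FORMATS:
        raise ValueError("index must be 0-9 and fmt a known format code")
    return f"{codes[element]}{index}{fmt}"


def parse_tag(tag):
    """Read each character of the tag back into a named attribute."""
    element_code, index_char, format_code = tag
    return {
        "element": ELEMENTS[element_code],
        "index": int(index_char),
        "format": format_code,
    }


def retrieve(tag, content):
    """Locate the item described by the tag and apply the requested format."""
    attrs = parse_tag(tag)
    el = attrs["element"]
    matches = re.findall(rf"<{el}\b[^>]*>(.*?)</{el}>", content, flags=re.S)
    return FORMATS[attrs["format"]](matches[attrs["index"]])


content = "<h1>News</h1><p>first</p><p> second item </p>"
tag = make_tag("p", 1, "U")     # second <p> element, upper-cased
item = retrieve(tag, content)
print(tag, item)
```

A fixed-width, one-character-per-attribute tag is compact and trivial to parse, at the cost of limiting each attribute to the values a single character can encode.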
44. A system for retrieving a plurality of web objects from a plurality of source pages for presentation to a web client, the system comprising:
a plurality of web objects at least partially encoded in one or more markup languages available from a plurality of source pages available on a network;
a web client having access to the plurality of source pages via the network;
a module for receiving input of a user specifying web objects to be included in a web page presented to the web client and a format for the web objects;
a module for generating feature tags for the specified web objects, each feature tag identifying a particular web object; and
at least one content server having access to the network for retrieving web objects available on the plurality of source pages based on the feature tags and presenting the retrieved web objects to the web client in a web page in the format specified by the user.
Dependent claims: 45-63.
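The division of labor in claim 44 (a tag-generating module and a content server that retrieves tagged web objects from several source pages and presents them in one page) can be sketched as follows. Everything here is a simplification under stated assumptions: source pages are held in an in-memory dictionary instead of being fetched over a network, web objects are assumed to be identified by an element `id`, and the function names are hypothetical.

```python
import re
from html import escape

# Stand-in for source pages reachable over the network.
SOURCE_PAGES = {
    "http://example.com/news": '<div id="top">Markets rally</div><div id="ad">Buy!</div>',
    "http://example.com/weather": '<p id="today">Sunny, 21 C</p>',
}


def generate_feature_tag(url, element_id):
    """Tag-generating module: the tag identifies one web object on one source page."""
    return f"{url}#{element_id}"


def retrieve_web_object(tag, pages):
    """Content-server step: use the tag to pull one object out of its source page."""
    url, element_id = tag.rsplit("#", 1)
    pattern = rf'<(\w+)[^>]*\bid="{re.escape(element_id)}"[^>]*>(.*?)</\1>'
    match = re.search(pattern, pages[url], flags=re.S)
    if match is None:
        raise LookupError(f"no web object for tag {tag}")
    return match.group(2)


def render_page(tags, pages, item_template):
    """Present the retrieved objects in one page, using the user's item format."""
    body = "\n".join(
        item_template.format(content=escape(retrieve_web_object(t, pages)))
        for t in tags
    )
    return f"<html><body>\n{body}\n</body></html>"


tags = [
    generate_feature_tag("http://example.com/news", "top"),
    generate_feature_tag("http://example.com/weather", "today"),
]
page = render_page(tags, SOURCE_PAGES, item_template="<li>{content}</li>")
print(page)
```

The advertisement on the news page is never selected by a tag, so it does not appear in the assembled page.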
Specification