ADAPTIVE GATHERING OF STRUCTURED AND UNSTRUCTURED DATA SYSTEM AND METHOD
First Claim
1. A computer-implemented method of obtaining information from a webserver, the method comprising:
obtaining a first URI from a prioritized URI queue;
utilizing the first URI at a first URI access time to request first content from the webserver;
parsing the first content a first time for first price and product information and saving the result as a first parse result;
utilizing the first URI at a second URI access time to request second content from the webserver;
parsing the second content for second price and product information, and saving the result as a second parse result; and
determining that the first parse result is different than the second parse result and setting a time for accessing the first URI in the prioritized URI queue based on the difference.
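The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class name `UriQueue`, the method names, and the halve-on-change / double-on-no-change interval policy are all assumptions chosen for illustration. The claim only requires that the next access time for the URI be set based on the difference between the two parse results. Fetching and parsing are left to the caller so that the example runs without network access.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

ParseResult = Dict[str, str]  # e.g. {"product": "Blue Widget", "price": "12.50"}


@dataclass
class UriQueue:
    """Hypothetical prioritized URI queue ordered by next access time."""

    min_interval: float = 60.0      # assumed lower bound, in seconds
    max_interval: float = 86400.0   # assumed upper bound, in seconds
    _heap: List[Tuple[float, str]] = field(default_factory=list)
    _interval: Dict[str, float] = field(default_factory=dict)
    _last: Dict[str, ParseResult] = field(default_factory=dict)

    def add(self, uri: str, due: float = 0.0, interval: float = 3600.0) -> None:
        """Schedule a URI for its first access."""
        self._interval[uri] = interval
        heapq.heappush(self._heap, (due, uri))

    def pop(self) -> Tuple[float, str]:
        """Return the (access time, URI) pair that is due soonest."""
        return heapq.heappop(self._heap)

    def record(self, uri: str, now: float, result: ParseResult) -> float:
        """Store a parse result and reschedule the URI.

        Assumed policy: if the result differs from the previous one,
        revisit sooner (halve the interval); otherwise back off (double it).
        """
        previous = self._last.get(uri)
        interval = self._interval[uri]
        if previous is not None:
            if result != previous:
                interval = max(self.min_interval, interval / 2)
            else:
                interval = min(self.max_interval, interval * 2)
        self._interval[uri] = interval
        self._last[uri] = result
        next_due = now + interval
        heapq.heappush(self._heap, (next_due, uri))
        return next_due
```

In use, a crawler loop would call `pop()`, request and parse the content for the returned URI, then pass the parse result to `record()`. A price change between two visits shortens the wait before the third visit, while an unchanged page is visited less often.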
Abstract
Content is obtained from a webpage accessed via a URI, which URI is obtained from a URI queue. The content is parsed for price and product information according to a parse map, with the resulting parse result being stored. The priority of URIs in the URI queue is adjusted based on analysis of the parse result for changes in price and product attributes and according to other criteria. The parse map may be one associated with the URI or a general-purpose parse map. The parse result may be validated by human- and machine-based systems, including by graphically labeling price and product information in the content for human confirmation or correction.
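The choice between a URI-specific parse map and a general-purpose one can be sketched as follows. The abstract does not say what a parse map contains, so this example assumes it is a mapping from field name to a regular expression with one capture group. The host `shop.example.com`, the patterns, and the function names `select_parse_map` and `parse_content` are hypothetical.

```python
import re
from typing import Dict, Optional
from urllib.parse import urlparse

ParseMap = Dict[str, str]  # assumed shape: field name -> regex with one group

# Assumed general-purpose map, used when no URI-specific map exists.
GENERAL_MAP: ParseMap = {
    "price": r"\$\s*(\d+(?:\.\d{2})?)",
    "product": r"<title>(.*?)</title>",
}

# Assumed URI-specific maps, keyed by host name.
SITE_MAPS: Dict[str, ParseMap] = {
    "shop.example.com": {
        "price": r'<span class="price">\$?([\d.]+)</span>',
        "product": r"<h1>(.*?)</h1>",
    },
}


def select_parse_map(uri: str) -> ParseMap:
    """Prefer a map associated with the URI's host; else use the general map."""
    host = urlparse(uri).hostname or ""
    return SITE_MAPS.get(host, GENERAL_MAP)


def parse_content(content: str, parse_map: ParseMap) -> Dict[str, Optional[str]]:
    """Apply each pattern in the map; a field with no match is set to None."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in parse_map.items():
        match = re.search(pattern, content, re.DOTALL)
        result[name] = match.group(1).strip() if match else None
    return result
```

A field left as `None` is a natural candidate for the human or machine validation step the abstract mentions, since it signals that the selected parse map did not fit the content.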
Specification