Parallel random proxy usage for large scale web access
Abstract
A system and method efficiently and anonymously retrieves large scale Web data through a restricted query interface. A number of proxy servers are utilized to permit parallel access to a target Web server for processing multiple queries simultaneously. Latency in the individual queries is absorbed by the proxy servers. Queries that would otherwise appear structured to the target server are assigned to the proxy server in a random fashion, obscuring the structured nature of the queries. The anonymous nature of the queries made by the proxy servers furthermore conceals the identity of the originating server.
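The random, parallel assignment described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the proxy names, the `zip=` query strings, and the `fetch_via_proxy` stand-in (which simulates a reply instead of touching the network) are all assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical proxy identifiers; a real deployment would hold proxy addresses.
PROXIES = ["proxy-a", "proxy-b", "proxy-c", "proxy-d"]


def fetch_via_proxy(proxy, query):
    """Stand-in for relaying `query` to the target server through `proxy`.

    No network traffic is generated; the reply is simulated.
    """
    return {"query": query, "proxy": proxy, "data": f"result-for-{query}"}


def assign_randomly(queries, proxies, rng):
    """Pair each query with a proxy chosen uniformly at random."""
    return [(rng.choice(proxies), q) for q in queries]


def run_parallel(queries, proxies, seed=None, max_workers=4):
    """Dispatch all queries concurrently and collect replies keyed by query."""
    rng = random.Random(seed)
    plan = assign_randomly(queries, proxies, rng)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        replies = list(pool.map(lambda pq: fetch_via_proxy(*pq), plan))
    return {r["query"]: r for r in replies}


# A structured sweep (hypothetical query format) that no single proxy sees in full order.
structured_queries = [f"zip={z}" for z in range(10001, 10011)]
replies = run_parallel(structured_queries, PROXIES, seed=0)
```

Because each proxy relays only a random subset of the sweep, the target server would not observe the full structured sequence from any single source, and the thread pool lets slow replies overlap rather than serialize.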
12 Citations
20 Claims
1. A machine readable medium containing configuration instructions for performing a method for retrieving data accessible by posing a plurality of queries over a network to at least one target server, the method comprising the steps of:
- transmitting a first one of the plurality of queries to a first one of a plurality of proxy server services for transmission to one of the at least one target servers;
- for each one of the plurality of queries, receiving from its corresponding proxy server service a reply from its corresponding target server, each of said replies comprising data which is at least part of said information; and
- constructing a database view of the information using said data received from the proxy server services in reply to said plurality of queries.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A machine readable medium containing configuration instruction for performing a method for retrieving information accessible by posing a plurality of queries over a network to at least one target server, wherein the plurality of queries are posed through a restricted interface that returns the k data points closest to a query point, the method comprising the steps of:
- calculating a maximum radius for a previous query point to a data point returned by the corresponding query;
- determining a region within R covered by the previous query based on the corresponding maximum radius;
- computing quadtrees of progressively greater levels until a computed quadtree has an uncovered node entirely outside the covered region; and
- constructing a subsequent query to contain a query point that is a center of the uncovered quadtree node.
Dependent claims: 12
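The steps of claim 11 can be sketched as follows. This is an illustrative reading, not the patent's own code: it assumes R is an axis-aligned rectangle, models the covered region as a union of discs (one per previous query, with the claim's maximum radius), and treats quadtree level L as a 2^L by 2^L grid over R. The function names and the `max_level` cutoff are hypothetical.

```python
import math


def max_radius(query_point, returned_points):
    """Largest distance from the query point to any of its k returned points."""
    return max(math.dist(query_point, p) for p in returned_points)


def cell_outside_disc(cell, center, radius):
    """True if the axis-aligned cell does not intersect the disc at all."""
    x0, y0, x1, y1 = cell
    # Closest point of the cell to the disc center.
    nx = min(max(center[0], x0), x1)
    ny = min(max(center[1], y0), y1)
    return math.dist((nx, ny), center) > radius


def next_query_point(region, covered_discs, max_level=12):
    """Return the center of the first quadtree node lying outside every disc.

    Levels are tried in increasing order; level L splits `region` into a
    2**L by 2**L grid. Returns None if no such node exists up to max_level.
    """
    rx0, ry0, rx1, ry1 = region
    for level in range(max_level + 1):
        n = 2 ** level
        w, h = (rx1 - rx0) / n, (ry1 - ry0) / n
        for i in range(n):
            for j in range(n):
                cell = (rx0 + i * w, ry0 + j * h,
                        rx0 + (i + 1) * w, ry0 + (j + 1) * h)
                if all(cell_outside_disc(cell, c, r) for c, r in covered_discs):
                    return ((cell[0] + cell[2]) / 2, (cell[1] + cell[3]) / 2)
    return None


# Assumed example: R is the square [0, 8] x [0, 8]; one previous query at (2, 2)
# returned k = 3 points, the farthest of which is 2 units away.
region = (0.0, 0.0, 8.0, 8.0)
previous_query = (2.0, 2.0)
returned = [(2.0, 3.0), (3.0, 2.0), (2.0, 0.0)]
radius = max_radius(previous_query, returned)
next_point = next_query_point(region, [(previous_query, radius)])
```

In this example the level-0 node (all of R) and three of the four level-1 nodes touch the covered disc, so the sketch picks the center of the remaining level-1 node, (6.0, 6.0), as the next query point. Starting from coarse levels favors large uncovered nodes, which keeps the number of follow-up queries small.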
13. A method for retrieving data accessible by posing a plurality of queries over a network to at least one target server, the method comprising the steps of:
- transmitting a first one of the plurality of queries to a first one of a plurality of proxy server services for transmission to one of the at least one target servers;
- for each one of the plurality of queries, receiving from its corresponding proxy server service a reply from its corresponding target server, each of said replies comprising data which is at least part of said information; and
- constructing a database view of the information using said data received from the proxy server services in reply to said plurality of queries.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
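The final step, constructing a database view from the collected replies, might look like the sketch below. The claim does not specify a storage model, so the single-table schema, the `(record_id, payload)` reply format, and the use of an in-memory SQLite database are assumptions chosen only to make the idea concrete.

```python
import sqlite3


def build_database_view(replies):
    """Merge per-query replies into one de-duplicated in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (record_id TEXT PRIMARY KEY, payload TEXT)"
    )
    for reply in replies:
        # INSERT OR IGNORE keeps the first copy of a record seen in several replies.
        conn.executemany(
            "INSERT OR IGNORE INTO records (record_id, payload) VALUES (?, ?)",
            reply,
        )
    conn.commit()
    return conn


# Simulated replies: each is a list of (record_id, payload) pairs, and
# overlapping queries may return the same record more than once.
replies = [
    [("a1", "store A"), ("b2", "store B")],
    [("b2", "store B"), ("c3", "store C")],
]
conn = build_database_view(replies)
rows = conn.execute(
    "SELECT record_id, payload FROM records ORDER BY record_id"
).fetchall()
```

Keying the table on a record identifier means the order in which proxies deliver their replies does not affect the resulting view, which fits a setting where replies arrive in parallel and in no particular order.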
Specification