Web crawling
Abstract
The present invention is directed to mechanisms for improving the “crawling” of resources on a network by taking into account the notion of browser state. An improved indexing scheme for the crawled results and improved search mechanisms are also disclosed.
34 Claims
1. A method for crawling for resources in a network, the method comprising:
receiving a list of resources on the network and for at least one of the resources on the list of resources, sending a first request to a server in the network for the resource using a first browser state, and sending a second request for the same resource using a second browser state.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
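The crawling step of claim 1 can be illustrated with a minimal sketch. The claims shown here do not define what a browser state consists of, so the sketch assumes it is a set of HTTP request headers (for example User-Agent or Cookie). The names `BrowserState`, `build_request`, `fetch`, and `crawl` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple
import urllib.request


@dataclass(frozen=True)
class BrowserState:
    """Hypothetical model: a label plus the HTTP headers that state implies."""
    name: str
    headers: Tuple[Tuple[str, str], ...] = ()


def build_request(url: str, state: BrowserState) -> urllib.request.Request:
    # Attach the state's headers (e.g. User-Agent, Cookie) to the request.
    return urllib.request.Request(url, headers=dict(state.headers))


def fetch(request: urllib.request.Request) -> bytes:
    # Real network fetch; pass a stub to crawl() for offline experiments.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def crawl(
    urls: Iterable[str],
    states: Iterable[BrowserState],
    fetcher: Callable[[urllib.request.Request], bytes] = fetch,
) -> Dict[Tuple[str, str], bytes]:
    """Request every URL once per browser state and keep each response."""
    states = list(states)
    results: Dict[Tuple[str, str], bytes] = {}
    for url in urls:
        for state in states:
            # Same resource, different browser state: one request each.
            results[(url, state.name)] = fetcher(build_request(url, state))
    return results
```

Keying the results by (URL, state name) keeps the two responses for the same resource apart, which is what lets later processing tell them from each other.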
10. A method for processing crawled resources in a network, the method comprising:
receiving a resource in response to a request for the resource using one of a plurality of browser states;
storing the resource; and
indexing the resource, the indexing step further comprising the step of associating the resource with a first browser state where the first browser state is the one of the plurality of browser states used to request the resource.
Dependent claims: 11, 12, 13, 14
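The storing and indexing steps of claim 10 might look like the sketch below. It assumes a simple in-memory inverted index over whitespace-separated terms and a plain string label for the browser state. The class name `StateAwareIndex` and its fields are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# A stored resource is identified by its URL and the state used to request it.
Key = Tuple[str, str]


@dataclass
class StateAwareIndex:
    """Hypothetical in-memory store plus inverted index, keyed by browser state."""
    documents: Dict[Key, str] = field(default_factory=dict)
    postings: Dict[str, Set[Key]] = field(default_factory=lambda: defaultdict(set))

    def add(self, url: str, state: str, text: str) -> None:
        key = (url, state)
        # Storing step: keep the resource body as received.
        self.documents[key] = text
        # Indexing step: every posting carries the browser state that was
        # used to request the resource, so the association is never lost.
        for term in set(text.lower().split()):
            self.postings[term].add(key)
```

For example, adding the same URL twice with the labels "desktop" and "mobile" yields two separate entries, each findable only through the terms of its own response.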
15. A method for searching a database of crawled resources, the method comprising the steps of:
receiving a search query from a browser client;
detecting a browser state for the browser client; and
searching for results from the database of resources using both the search query and the browser state of the browser client.
Dependent claims: 16, 17, 18, 19, 20
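A search that uses both the query and the client's browser state, as in claim 15, could be sketched as follows. How the state is detected is not specified in the claims shown here; the sketch assumes it is inferred from the User-Agent header alone and reduced to the labels "mobile" or "desktop". The functions `detect_state` and `search` and the postings layout are hypothetical.

```python
from typing import Dict, List, Mapping, Set, Tuple

# A posting is (url, browser state label used when the resource was crawled).
Posting = Tuple[str, str]


def detect_state(headers: Mapping[str, str]) -> str:
    # Assumption: the state is inferred from the User-Agent header only.
    return "mobile" if "Mobile" in headers.get("User-Agent", "") else "desktop"


def search(postings: Dict[str, Set[Posting]], query: str, state: str) -> List[str]:
    """Return URLs whose indexed copy matches every query term AND the state."""
    terms = query.lower().split()
    if not terms:
        return []
    # Intersect the posting sets so every term must match the same copy.
    matches = set.intersection(*(postings.get(term, set()) for term in terms))
    # Keep only copies that were crawled with the client's browser state.
    return sorted(url for url, crawled_state in matches if crawled_state == state)
```

Filtering on the state after intersecting the term postings is one simple design; an index partitioned by state up front would give the same results with less work per query.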
21. A computer-readable medium comprising one or more instructions which when executed perform the following:
receiving a list of resources on the network and for at least one of the resources on the list of resources, sending a first request to a server in the network for a resource using a first browser state, and sending a second request for the same resource using a second browser state.
Dependent claims: 22, 23
24. A computer-readable medium comprising one or more instructions which when executed perform the following:
receiving a resource in response to a request for the resource using one of a plurality of browser states;
storing the resource; and
indexing the resource, the indexing step further comprising the step of associating the resource with a first browser state where the first browser state is the one of the plurality of browser states used to request the resource.
Dependent claims: 25, 26, 27, 28
29. A computer-readable medium comprising one or more instructions which when executed perform the following:
receiving a search query from a browser client;
detecting a browser state for the browser client; and
searching for results from the database of resources using both the search query and the browser state of the browser client.
Dependent claims: 30, 31, 32, 33, 34
Specification