INTERACTIVE WEB CRAWLER
First Claim
1. A method of web crawling hidden files, comprising:
- loading a web page with a browser agent;
- executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values;
- retrieving a list of form controls from the web page using the browser agent;
- analyzing the controls using a driver component;
- sending form control values from the driver component to the browser agent;
- submitting an event to the web page by the browser agent or running any scripted content to trigger operations on the web page corresponding to the form control values; and
- generating a URL for various form control values using a generalizer.
Abstract
The claimed subject matter provides a system or method for web crawling hidden files. An exemplary method comprises loading a web page with a browser agent, and executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values. A list of form controls may be retrieved from the web page using the browser agent, and the controls may be analyzed using a driver component. Form control values may be sent from the driver component to the browser agent, and an event may be submitted to the web page by the browser agent or scripted content may be run to trigger operations on the web page corresponding to the form control values. A URL may be generated for various form control values using a generalizer.
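The "retrieve a list of form controls" and "analyze the controls" steps can be illustrated with a minimal sketch. It makes several assumptions not found in the text above: the page is static HTML parsed with Python's standard `html.parser` (a real browser agent would also execute dynamic elements), and the names `FormControlCollector`, `choose_values`, and the sample `PAGE` are hypothetical.

```python
from html.parser import HTMLParser


class FormControlCollector(HTMLParser):
    """Collects the form controls (input, select, textarea) found in a page."""

    CONTROL_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.controls = []          # one dict per control, in document order
        self._open_select = None    # the <select> currently being parsed

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in self.CONTROL_TAGS:
            control = {
                "tag": tag,
                "name": attr.get("name"),
                "type": attr.get("type", "text") if tag == "input" else tag,
                "options": [],
            }
            self.controls.append(control)
            if tag == "select":
                self._open_select = control
        elif tag == "option" and self._open_select is not None:
            # Record each option value of the enclosing <select>.
            self._open_select["options"].append(attr.get("value", ""))

    def handle_endtag(self, tag):
        if tag == "select":
            self._open_select = None


def choose_values(controls, defaults):
    """Hypothetical driver step: pick candidate values for each named control."""
    chosen = {}
    for control in controls:
        name = control["name"]
        if not name:
            continue  # unnamed controls (e.g. a bare submit button) send no value
        if control["options"]:
            chosen[name] = list(control["options"])
        elif name in defaults:
            chosen[name] = [defaults[name]]
    return chosen


# Hypothetical sample page, used only for illustration.
PAGE = """
<form action="/search" method="get">
  <input type="text" name="q">
  <select name="category">
    <option value="books">Books</option>
    <option value="music">Music</option>
  </select>
  <input type="submit" value="Go">
</form>
"""

collector = FormControlCollector()
collector.feed(PAGE)
values = choose_values(collector.controls, {"q": "crawler"})
print(values)  # {'q': ['crawler'], 'category': ['books', 'music']}
```

In this sketch the driver simply enumerates every option of a select control and falls back to a pre-determined default for text inputs. How the claimed driver component actually analyzes controls is not specified in this section.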
20 Claims
1. A method of web crawling hidden files, comprising:
- loading a web page with a browser agent;
- executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values;
- retrieving a list of form controls from the web page using the browser agent;
- analyzing the controls using a driver component;
- sending form control values from the driver component to the browser agent;
- submitting an event to the web page by the browser agent or running any scripted content to trigger operations on the web page corresponding to the form control values; and
- generating a URL for various form control values using a generalizer.
Dependent claims: 2, 3, 4, 5, 6, 7
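The final step of claim 1, generating a URL for various form control values, might look like the following sketch. It assumes the form submits via GET so that each combination of values maps to a query string. The function name `generalize`, the example URLs, and the sample values are hypothetical, and the claimed generalizer may work differently.

```python
from itertools import product
from urllib.parse import urlencode, urljoin


def generalize(page_url, action, control_values):
    """Build one GET-style URL per combination of candidate control values."""
    names = sorted(control_values)           # fixed order gives stable URLs
    target = urljoin(page_url, action)       # resolve the form action against the page
    urls = []
    for combo in product(*(control_values[n] for n in names)):
        query = urlencode(list(zip(names, combo)))
        urls.append(f"{target}?{query}")
    return urls


urls = generalize(
    "http://example.com/catalog/index.html",
    "/search",
    {"q": ["crawler"], "category": ["books", "music"]},
)
for url in urls:
    print(url)
# http://example.com/search?category=books&q=crawler
# http://example.com/search?category=music&q=crawler
```

Sorting the control names keeps the generated URLs deterministic, which makes it easier to recognize a URL that has already been crawled.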
8. A system for web crawling hidden files, the system comprising:
- a processing unit; and
- a system memory, wherein the system memory comprises code configured to direct the processing unit to:
  - load a web page via a browser agent module;
  - execute any dynamic elements hosted on the web page using the browser agent module to insert pre-determined values;
  - retrieve a list of form controls from the web page using the browser agent module;
  - analyze the controls using a driver component module;
  - send form control values from the driver component module to the browser agent module; and
  - generate a URL for various form control values using a generalizer module.
Dependent claims: 9, 10, 11, 12, 13, 14
15. One or more computer-readable storage media, comprising code configured to direct a processing unit to:
- load a web page with a browser agent;
- execute any forms hosted on the web page using the browser agent to insert pre-determined values;
- retrieve a list of form controls from the web page using the browser agent;
- analyze the controls using a driver component;
- send form control values from the driver component to the browser agent; and
- generate a URL for various form control values using a generalizer.
Dependent claims: 16, 17, 18, 19, 20
Specification