Interactive web crawler

  • US 8,538,949 B2
  • Filed: 06/17/2011
  • Issued: 09/17/2013
  • Est. Priority Date: 06/17/2011
  • Status: Active Grant
First Claim

1. A method of web crawling hidden files, comprising:

  • loading a web page with a browser agent;
  • executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values;
  • retrieving a list of form controls from the web page using the browser agent;
  • analyzing the form controls using a driver component of a crawler;
  • sending form control values from the driver component to the browser agent;
  • submitting an event to the web page by the browser agent or running any scripted content to trigger operations on the web page corresponding to the form control values;
  • generating a URL for various form control values using a generalizer; and
  • re-fetching, using the browser agent, web page content, a new list of form controls, and corresponding values for a new control that is dependent upon one of the form controls, wherein the browser agent re-fetches until all form controls are executed.
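
The steps of claim 1 can be illustrated with a toy sketch. This is not the patented implementation: `FakeBrowserAgent`, `generalize`, `crawl`, the `country`/`city` controls, and the `example.test` URL are all hypothetical names invented for illustration, and the "browser" is an in-memory stand-in rather than a real rendering engine. It only shows the general shape of the loop: list the form controls, set a value, re-fetch the control list to discover a dependent control, and emit one URL per complete combination of values.

```python
from urllib.parse import urlencode


class FakeBrowserAgent:
    """Hypothetical stand-in for a browser agent.

    Simulates a page where a 'city' control appears only after
    'country' has been given a value (a dependent form control).
    """

    CITIES = {"US": ["NYC", "LA"], "FR": ["Paris"]}  # assumed sample data

    def __init__(self, url):
        self.url = url
        self.values = {}

    def load(self):
        # Loading the page resets every control to its empty state.
        self.values = {}

    def list_form_controls(self):
        # Return the controls currently present and their allowed values.
        controls = {"country": list(self.CITIES)}
        country = self.values.get("country")
        if country in self.CITIES:
            controls["city"] = self.CITIES[country]
        return controls

    def set_value(self, name, value):
        # Setting a value stands in for submitting an event to the page.
        self.values[name] = value
        if name == "country":
            self.values.pop("city", None)  # dependent control is reset


def generalize(base_url, assignment):
    # Hypothetical "generalizer": encode one value combination as a URL.
    return base_url + "?" + urlencode(sorted(assignment.items()))


def crawl(agent):
    """Driver loop: explore control values until none are left unset."""
    urls = []

    def explore(assignment):
        # Replay the values chosen so far, then re-fetch the control list.
        agent.load()
        for name, value in assignment.items():
            agent.set_value(name, value)
        controls = agent.list_form_controls()
        pending = [name for name in controls if name not in assignment]
        if not pending:
            urls.append(generalize(agent.url, assignment))
            return
        name = pending[0]
        for value in controls[name]:
            explore({**assignment, name: value})

    explore({})
    return urls


if __name__ == "__main__":
    for url in crawl(FakeBrowserAgent("http://example.test/search")):
        print(url)
```

Under these assumptions the sketch emits one URL per reachable combination (three here). A real crawler would replace the fake agent with an actual browser engine that executes the page's scripts, and would need limits on how many value combinations it explores.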
