System and method for enhanced browser-based web crawling

  • US 7,519,902 B1
  • Filed: 06/30/2000
  • Issued: 04/14/2009
  • Est. Priority Date: 06/30/2000
  • Status: Expired due to Fees
1. A method for indexing dynamic data documents, the method comprising:

  • retrieving, to a server, with a web crawler from a network address, a dynamic data document with client-side scripting code therein;

    executing, at the server, a web-browser, as part of the web crawler, wherein the web-browser renders an in-memory copy of the dynamic data document which has been retrieved, wherein the in-memory copy of the dynamic data document maintains a rendered web-browser display format and a rendered web-browser display layout of the dynamic data document when the web-browser renders the in-memory copy of the dynamic data document;

    executing, at the server instead of a client system, a browser scripting engine as part of the web-browser, wherein the browser scripting engine executes the client-side scripting code and loads content as directed by the client-side scripting code into the in-memory copy creating a final web-browser display representation of the dynamic data document so that the final web-browser display representation is substantially similar to when the dynamic data document is rendered at a user's web-browser and viewed by a user in the user's web-browser running on the client system when all the dynamic data is viewed; and

    indexing, at the server, the content in the memory, wherein the content being indexed is the content which has been loaded by the browser scripting engine in order to index the dynamic data document as if being viewed by the user in the user's web-browser on the client system.
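
The four steps of the claim (retrieve, render in memory, execute client-side script, index the loaded content) can be illustrated with a toy sketch. This is not the patented implementation: the in-memory `FAKE_WEB` table, the `load("...")` script convention, and every function name below are assumptions made for illustration, and a regular-expression substitution stands in for a real browser scripting engine.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical stand-in for the network: address -> document body.
FAKE_WEB = {
    "http://example.test/page": (
        "<html><body><p>Static intro</p>"
        '<script>load("http://example.test/fragment")</script>'
        "</body></html>"
    ),
    "http://example.test/fragment": "<p>Dynamic prices loaded by script</p>",
}

# Hypothetical script convention: a script whose only statement is load("<address>").
LOAD_CALL = re.compile(r'<script>\s*load\("([^"]+)"\)\s*</script>')


def retrieve(address):
    """Step 1: the crawler retrieves a document from a network address."""
    return FAKE_WEB[address]


def execute_scripts(document):
    """Steps 2-3 (toy): keep an in-memory copy and replace each load() script
    with the content it requests, as a scripting engine would load it."""
    return LOAD_CALL.sub(lambda match: retrieve(match.group(1)), document)


class TextExtractor(HTMLParser):
    """Collects the text nodes of a document, ignoring the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(document):
    parser = TextExtractor()
    parser.feed(document)
    parser.close()
    return " ".join(parser.chunks)


def index_document(address, index):
    """Step 4: index the content present after the scripts have run."""
    rendered = execute_scripts(retrieve(address))
    for word in re.findall(r"[a-z0-9]+", visible_text(rendered).lower()):
        index[word].add(address)


index = defaultdict(set)
index_document("http://example.test/page", index)
```

In this sketch the word "dynamic" becomes searchable for the page even though it appears only in the fragment that the script loads, which is the effect the claim describes. A crawler that indexed the raw retrieved document would see only the static text and the script source.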
