Methods for analyzing dynamic web pages
Abstract
A computer-implemented method is provided for searching for files on the Internet. In one embodiment, the method may provide an application crawler that assembles and dynamically instantiates all components of a web page. The instantiated web application may then be analyzed to locate desired components on the web page. This may involve finding and analyzing all clickable items in the application, driving the web application by injecting events, and extracting information from the application and writing it to a file or database.
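The crawl loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it runs against a toy in-memory page model instead of a real browser, and `Element`, `crawl`, and `save_records` are hypothetical names introduced here.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, TextIO


@dataclass
class Element:
    """Hypothetical stand-in for one component of an instantiated web page."""
    name: str
    clickable: bool = False
    text: str = ""
    # Called when a click event is injected; returns the elements it loads.
    on_click: Optional[Callable[[], List["Element"]]] = None


def crawl(root_elements: List[Element]) -> List[Dict[str, str]]:
    """Visit every element, click the clickable ones, and record visible text."""
    records: List[Dict[str, str]] = []
    seen = set()
    queue = list(root_elements)
    while queue:
        element = queue.pop(0)
        if element.name in seen:
            continue
        seen.add(element.name)
        if element.text:
            records.append({"element": element.name, "text": element.text})
        if element.clickable and element.on_click is not None:
            # Inject the event and enqueue whatever the page loads in response.
            queue.extend(element.on_click())
    return records


def save_records(records: List[Dict[str, str]], fp: TextIO) -> None:
    """Write the extracted records to an open text file as JSON."""
    json.dump(records, fp, indent=2)


def demo_page() -> List[Element]:
    """Toy page: clicking the play button loads a video element."""
    video = Element("video", text="Episode 1 title")
    play = Element("play_button", clickable=True, on_click=lambda: [video])
    banner = Element("banner", text="Welcome")
    return [play, banner]
```

Calling `crawl(demo_page())` records the banner text and then the video text, since the video only becomes reachable after the simulated click. A real crawler would replace the toy model with a browser automation layer.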
23 Claims
1. A method comprising:
loading a web page;
identifying an extractor template for extracting information from the web page, wherein the extractor template comprises timing instructions for extracting information from the web page;
identifying a first object loaded in a web page;
simulating user input that manipulates the identified first object;
identifying, by at least one processor, a video loaded in response to the simulated user input;
extracting first information from the video;
in accordance with the timing instructions of the extractor template, skipping a commercial associated with the video;
continuing, in accordance with the timing instructions of the extractor template, to extract information from the video by extracting second information from the video after skipping the commercial associated with the video; and
aggregating the first information from the video and the second information from the video in an index.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
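As an illustration of the timing-driven steps in claim 1, the sketch below extracts from a simulated video, skips a commercial, extracts again, and aggregates both results in an index. Everything here is an assumption made for the example: the claim does not say how timing instructions are encoded, so `TimingInstructions`, `SimulatedVideo`, and `extract_video` are hypothetical, and the video is a toy list of timed text segments.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TimingInstructions:
    """Hypothetical timing fields of an extractor template (seconds)."""
    first_extract_at: int   # when to take the first extraction
    commercial_end: int     # where playback resumes after the commercial
    second_extract_at: int  # when to take the second extraction


class SimulatedVideo:
    """Toy video made of (start, end, text) segments with a seekable position."""

    def __init__(self, segments: List[Tuple[int, int, str]]):
        self.segments = segments
        self.position = 0

    def seek(self, second: int) -> None:
        self.position = second

    def read(self) -> str:
        # Return the text of the segment that covers the current position.
        for start, end, text in self.segments:
            if start <= self.position < end:
                return text
        return ""


def extract_video(
    video: SimulatedVideo,
    timing: TimingInstructions,
    index: Dict[str, List[str]],
    key: str,
) -> None:
    """Extract before and after the commercial and aggregate both in the index."""
    video.seek(timing.first_extract_at)
    first = video.read()
    # Skip the commercial by jumping to its end; never extract earlier than that.
    video.seek(max(timing.second_extract_at, timing.commercial_end))
    second = video.read()
    index.setdefault(key, []).extend([first, second])
```

With segments for an intro (0 to 10 s), a commercial (10 to 40 s), and the main content (40 to 90 s), the index entry ends up holding the intro text and the main-content text, and nothing from the commercial.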
13. A non-transitory computer-readable medium including a set of instructions that, when executed, cause at least one processor to perform the steps comprising:
loading a web page;
identifying an extractor template for extracting information from the web page, wherein the extractor template comprises timing instructions for extracting information from the web page;
identifying a first object loaded in a web page;
simulating user input that manipulates the identified first object;
identifying a video loaded in response to the simulated user input;
extracting first information from the video;
in accordance with the timing instructions of the extractor template, skipping a commercial associated with the video;
continuing, in accordance with the timing instructions of the extractor template, to extract information from the video by extracting second information from the video after skipping the commercial associated with the video; and
aggregating the first information about the video and the second information about the video in an index.
Dependent claims: 14, 15, 16, 17, 18
19. A method comprising:
monitoring execution of a web page;
identifying an extractor template for extracting information from the web page, wherein the extractor template comprises timing instructions for extracting information from the web page;
applying, by at least one processor, the extractor template to the web page to identify a first object loaded in the web page;
simulating user input that manipulates the identified first object;
identifying, by the at least one processor, a video loaded in response to the simulated user input;
extracting first information from the video;
in accordance with the timing instructions of the extractor template, skipping a commercial associated with the video;
continuing, in accordance with the timing instructions of the extractor template, to extract information from the video after skipping the commercial by extracting second information from the video; and
aggregating the first information from the video and the second information from the video in an index.
Dependent claims: 20, 21, 22, 23
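Claim 19 differs from claim 1 mainly in that the page's execution is monitored and the extractor template itself is applied to identify the first object. One way to picture that step, purely as an assumption, is a template rule matched against objects as they load. `LoadedObject`, `ObjectRule`, and `identify_first_object` are hypothetical names; the claim does not specify a matching mechanism.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class LoadedObject:
    """Hypothetical record of one object observed while the page executes."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)


@dataclass
class ObjectRule:
    """Hypothetical template rule: a tag plus attributes that must all match."""
    tag: str
    required_attrs: Dict[str, str] = field(default_factory=dict)


def matches(rule: ObjectRule, obj: LoadedObject) -> bool:
    """True when the object has the rule's tag and every required attribute."""
    return obj.tag == rule.tag and all(
        obj.attrs.get(name) == value
        for name, value in rule.required_attrs.items()
    )


def identify_first_object(
    load_events: Iterable[LoadedObject], rule: ObjectRule
) -> Optional[LoadedObject]:
    """Scan objects in load order and return the first one the rule matches."""
    for obj in load_events:
        if matches(rule, obj):
            return obj
    return None
```

The returned object would then be the target of the simulated user input. Returning `None` when nothing matches lets the caller decide whether to keep monitoring or give up.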
Specification