Method and apparatus for an application crawler
Abstract
A computer-implemented method is provided for searching for files on the Internet. In one embodiment, the method may provide an application crawler that assembles and dynamically instantiates all components of a web page. The instantiated web application may then be analyzed to locate desired components on the web page. This may involve finding and analyzing all clickable items in the application, driving the web application by injecting events, and extracting information from the application and writing it to a file or database.
189 Citations
132 Claims
1. A computer-implemented method comprising:
crawling and indexing an object model of multiple running, instantiated documents or applications.
Dependent claims: 2-30.
31. A computer-implemented method for searching for video files on a computer network, the method comprising:
crawling and indexing an object model of multiple running, instantiated documents or applications to locate video files.
Dependent claims: 32-60.
61. A computer-implemented method for creating a searchable database, the method comprising:
crawling an object model of multiple running, instantiated documents or applications to locate video files;
indexing video files found in the object model by saving pointers to the video files in the database; and
extracting metadata about the video files from the object model and saving the metadata in the database.
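The three steps of claim 61 (crawl the object model, save pointers to the video files, save their metadata) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the nested-dict object model, the `VIDEO_TAGS` set, the two-table schema, and the function names.

```python
import sqlite3

# Hypothetical stand-in for the object model of one instantiated application:
# nested dicts with "tag", "attrs", and "children" keys.
SAMPLE_MODEL = {
    "tag": "body", "attrs": {}, "children": [
        {"tag": "div", "attrs": {"class": "player"}, "children": [
            {"tag": "video",
             "attrs": {"src": "http://example.com/a.flv",
                       "title": "Clip A", "duration": "42"},
             "children": []},
        ]},
        {"tag": "embed",
         "attrs": {"src": "http://example.com/b.wmv", "title": "Clip B"},
         "children": []},
    ],
}

VIDEO_TAGS = {"video", "embed"}  # assumption: which nodes count as video files


def find_videos(node):
    """Depth-first walk of the object model, yielding every video node."""
    if node["tag"] in VIDEO_TAGS and "src" in node["attrs"]:
        yield node
    for child in node["children"]:
        yield from find_videos(child)


def build_index(models, conn):
    """Save a pointer (URL) per video, plus its metadata as key/value rows."""
    conn.execute("CREATE TABLE IF NOT EXISTS video "
                 "(id INTEGER PRIMARY KEY, url TEXT UNIQUE)")
    conn.execute("CREATE TABLE IF NOT EXISTS metadata "
                 "(video_id INTEGER, key TEXT, value TEXT)")
    for model in models:
        for node in find_videos(model):
            attrs = node["attrs"]
            url = attrs["src"]
            # Skip videos already indexed from another document.
            if conn.execute("SELECT 1 FROM video WHERE url = ?", (url,)).fetchone():
                continue
            video_id = conn.execute(
                "INSERT INTO video (url) VALUES (?)", (url,)).lastrowid
            conn.executemany(
                "INSERT INTO metadata (video_id, key, value) VALUES (?, ?, ?)",
                [(video_id, k, v) for k, v in attrs.items() if k != "src"],
            )
    conn.commit()
```

Calling `build_index([SAMPLE_MODEL], sqlite3.connect(":memory:"))` fills both tables. The database stores only the URL as the pointer, not the video itself, which matches the claim's wording "saving pointers to the video files".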
62. A computer-implemented method for searching for files on the Internet, the method comprising:
providing a protocol crawler for identifying video-rich websites; and
providing an application crawler comprising:
an inspector for dynamically instantiating and assembling all components of a web page at one of said video-rich websites to create at least one instantiated web application;
an extractor for identifying specific parts of the instantiated web application that contain useful information and providing the logic required to extract that information into a metadata record; and
a crawler for analyzing the instantiated web application, finding and analyzing all clickable items in the application, driving the web application by injecting events, and extracting information from the application and writing it to a file or database.
Dependent claims: 63-65.
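The division of labor among the inspector, extractor, and crawler in claim 62 might be sketched as follows. This is a toy in-memory model, not the patented implementation: the `WebApp` dataclass, the template format (record field mapped to component name), and all method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WebApp:
    """Hypothetical instantiated application: components plus event handlers."""
    components: dict
    handlers: dict = field(default_factory=dict)  # clickable name -> callable(app)


class Inspector:
    """Assembles page components into one instantiated application."""

    def instantiate(self, html_parts, supplementary, handlers):
        components = {**html_parts, **supplementary}
        return WebApp(components=components, handlers=dict(handlers))


class Extractor:
    """Pulls the parts named in a template into a metadata record."""

    def __init__(self, template):
        self.template = template  # assumption: record field -> component name

    def extract(self, app):
        return {rec_field: app.components[name]
                for rec_field, name in self.template.items()
                if name in app.components}


class Crawler:
    """Drives the application by injecting events and records each state."""

    def __init__(self, extractor, sink):
        self.extractor = extractor
        self.sink = sink  # stand-in for the file or database

    def crawl(self, app):
        self.sink.append(self.extractor.extract(app))      # initial state
        for name, handler in list(app.handlers.items()):
            handler(app)                                    # injected "click"
            self.sink.append(self.extractor.extract(app))  # state after event


def next_clip(app):
    """Hypothetical handler: clicking 'next' swaps in another clip."""
    app.components["title"] = "Clip B"
    app.components["src"] = "http://example.com/b.flv"
```

A run wires the three parts together: the inspector builds the `WebApp`, the crawler injects each event in turn, and the extractor turns each resulting state into one record appended to the sink.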
66. A computer-implemented method for searching for files on the Internet, the method comprising:
finding a target URL;
downloading the HTML file for the target URL;
downloading supplementary data files used to build the complete web application, based on the information in the HTML file;
assembling application components from said supplementary data files and the HTML file;
instantiating application components to create a web application;
applying data-query interfaces to all objects in the web application that may contain useful data;
loading a pre-defined Application template, or generating and automatically defining an Application template;
applying the Application template to extract all of the desired information from the web application;
saving the desired information to a file or database as a structured-data information record;
examining all components in the web application to identify all possible components that could respond to a mouse event or form a clickable item;
determining which clickable items have appeared since the last simulated mouse event;
storing new clickable items in an appropriate data structure, such as a new branch of a tree containing all clickable items in the application at all possible application states; and
simulating a mouse click on the first clickable item in the current branch of the clickable item tree.
Dependent claims: 67, 68.
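The last four steps of claim 66 (find clickable items, detect which ones are new since the last simulated click, store them as a new branch of a tree, click the first item in the current branch) amount to a depth-first exploration of application states. The sketch below assumes a toy application, `FakeApp`, whose `REVEALS` table says which items each click uncovers. It also assumes that revealed items stay visible, so no state reset is needed between branches. None of these names come from the patent.

```python
class FakeApp:
    """Hypothetical application: clicking an item reveals further items."""
    REVEALS = {"menu": ["videos", "about"], "videos": ["play"],
               "about": [], "play": []}

    def __init__(self):
        self.visible = ["menu"]
        self.clicks = []          # log of simulated mouse clicks

    def clickable_items(self):
        return list(self.visible)

    def click(self, item):
        self.clicks.append(item)
        for revealed in self.REVEALS.get(item, []):
            if revealed not in self.visible:
                self.visible.append(revealed)


class Node:
    """One clickable item; children are the items its click revealed."""

    def __init__(self, item):
        self.item = item
        self.children = []


def crawl_clickables(app):
    """Build the tree of clickable items by clicking depth-first."""
    root = Node(None)
    seen = set()

    def expand(node):
        # Items that appeared since the last simulated mouse event.
        new_items = [i for i in app.clickable_items() if i not in seen]
        for item in new_items:
            seen.add(item)
            node.children.append(Node(item))   # new branch of the tree
        for child in node.children:
            app.click(child.item)              # simulate the mouse click
            expand(child)

    expand(root)
    return root


def as_dict(node):
    """Render the tree as nested dicts for inspection."""
    return {child.item: as_dict(child) for child in node.children}
```

For `FakeApp`, `as_dict(crawl_clickables(FakeApp()))` gives `{"menu": {"videos": {"play": {}}, "about": {}}}`. A real application would generally need its state restored before exploring a sibling branch, which this sketch leaves out.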
69. A computer system comprising:
an application crawler having programming code for crawling and indexing an object model of running, instantiated documents or applications from the websites.
Dependent claims: 70-78.
79. A system comprising:
a protocol crawler for identifying video-rich websites; and
an application crawler comprising:
an inspector for dynamically instantiating and assembling all components of a web page at one of said video-rich websites to create at least one instantiated web application;
an extractor for identifying specific parts of the instantiated web application that contain useful information and providing the logic required to extract that information into a metadata record; and
a crawler for analyzing the instantiated web application, finding and analyzing all clickable items in the application, driving the web application by injecting events, and extracting information from the application and writing it to a file or database.
Dependent claims: 80-82.
83. A computer-implemented method comprising:
receiving a target URL; and
indexing an object model of multiple running, instantiated documents or applications.
Dependent claims: 84-102.
103. A computer program product comprising:
a computer usable medium and computer readable code embodied on said computer usable medium, the computer readable code comprising:
computer readable program code configured to cause a computer to effect crawling and indexing of an object model of a running, instantiated document or application.
Dependent claims: 104-132.
Specification