Method and system for gathering information resident on global computer networks
Abstract
A method and system for confidentially accessing and reporting information present on global computer networks. The present invention deterministically analyzes a set of network resources over a configurable monitoring period, thereby guaranteeing that recently published information is retrieved. The present invention includes a scalable software system that can be readily executed on a stand-alone computing system or distributed across a network of computing devices. At the end of each monitoring period, the present invention balances the traversal and searching of network resources across the computing devices in the distributed system according to the number of pages previously retrieved for each network resource, thereby balancing the load on the system more accurately.
17 Claims
1. A computer system for gathering information from network resources on a global computer network, comprising:
- a database to store resource identifiers that correspond to particular network resources, and search items that define a search for information and specify at least one of the network resources;
- multiple computers connected over a network in communication with the database;
- a system executive to query a database manager for a list of all computers in the distributed computer system;
- the system executive to determine a number of pages for each resource defined by a resource identifier;
- the system executive to employ the determined number of pages for each resource to assign an average number of pages to retrieve per day to each computer in the list; and
- the system executive to dynamically update said average number of pages to retrieve per day for each of the multiple computers based upon an update of said network resources.
Dependent claims: 2, 3
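The page-based load assignment recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the example URLs, the node names, and the greedy "largest resource to least-loaded computer" heuristic are all assumptions introduced here for illustration. The claim itself only requires that per-resource page counts drive an average pages-per-day figure for each computer.

```python
import heapq

def average_pages_per_day(page_counts, computers, period_days):
    """Daily page target per computer if the total load is spread evenly."""
    total_pages = sum(page_counts.values())
    return total_pages / (period_days * len(computers))

def assign_resources(page_counts, computers):
    """Assumed heuristic: give the largest remaining resource to the
    computer with the smallest page load so far."""
    heap = [(0, name) for name in computers]
    heapq.heapify(heap)
    assignment = {name: [] for name in computers}
    # Sort by descending page count, then by identifier for determinism.
    for resource, pages in sorted(page_counts.items(),
                                  key=lambda kv: (-kv[1], kv[0])):
        load, name = heapq.heappop(heap)
        assignment[name].append(resource)
        heapq.heappush(heap, (load + pages, name))
    return assignment

# Hypothetical page counts per resource identifier.
page_counts = {
    "http://a.example": 600,
    "http://b.example": 300,
    "http://c.example": 200,
    "http://d.example": 100,
}
computers = ["node1", "node2"]

target = average_pages_per_day(page_counts, computers, period_days=6)
assignment = assign_resources(page_counts, computers)
```

Re-running both functions after the page counts change (for example, at the end of a monitoring period) would correspond to the dynamic update step of the claim.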
4. A computer implemented method for gathering information from network resources on a global computer network, comprising:
- multiple computers connected over a network in communication with a database;
- storing resource identifiers in the database, the identifiers corresponding to particular network resources, and search items that define a search for information and specify at least one of the network resources;
- querying a database manager for a list of all computers in the distributed computer system;
- determining a number of pages for each resource defined by a resource identifier;
- employing the determined number of pages for each resource to assign an average number of pages to retrieve per day to each computer in the list; and
- dynamically updating said average number of pages to retrieve per day for each of the multiple computers based upon an update of said network resources.
Dependent claims: 5, 6
7. A computer system for gathering information from network resources on a global computer network, the system comprising:
- a database server to store resource identifiers that correspond to particular network resources, and search items that define a search for information and specify at least one of the network resources;
- multiple computers connected over a network and in communication with the database server, including computers designated as collection nodes and computers designated as search nodes;
- a system executive to invoke at least one collection controller local to a collection node, said collection controller to traverse network resources assigned to said node and to pass retrieved informative items to a token queue of one of the search nodes; and
- the system executive to invoke at least one search controller local to a search node, said search controller to search tokens in the token queue and to remove a token from the token queue that duplicates a prior token in the token queue.
Dependent claims: 8, 9, 10
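The token queue behavior recited in claim 7 (tokens passed in by a collection controller, then searched and de-duplicated by a search controller) can be sketched as follows. This is an illustrative sketch only: the `TokenQueue` class, its method names, and the representation of a token as a `(url, text)` tuple are assumptions made here, since the claim does not define a token format.

```python
from collections import deque

class TokenQueue:
    """Hypothetical in-memory token queue held by a search node."""

    def __init__(self):
        self._tokens = deque()

    def __len__(self):
        return len(self._tokens)

    def put(self, token):
        """Called by a collection controller to pass a retrieved item."""
        self._tokens.append(token)

    def remove_duplicates(self):
        """Drop every token that duplicates an earlier token in the queue.
        Returns the number of tokens removed."""
        seen = set()
        unique = deque()
        removed = 0
        for token in self._tokens:
            if token in seen:
                removed += 1
            else:
                seen.add(token)
                unique.append(token)
        self._tokens = unique
        return removed

    def search(self, term):
        """Return the tokens whose text contains the search term."""
        return [token for token in self._tokens if term in token[1]]

# Assumed token format: (url, text).
queue = TokenQueue()
queue.put(("http://a.example/1", "merger announced"))
queue.put(("http://b.example/2", "weather report"))
queue.put(("http://a.example/1", "merger announced"))  # duplicate of the first

removed = queue.remove_duplicates()
hits = queue.search("merger")
```

Keeping the first occurrence and discarding later copies preserves queue order, which matches the claim's wording that the removed token is the one that duplicates a prior token.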
11. A computer implemented method for gathering information from network resources on a global computer network, the method comprising:
- a database server for storing resource identifiers corresponding to particular network resources, and search items that define a search for information and specify at least one of the network resources;
- multiple computers connected over a network and in communication with the database server, including computers designated as collection nodes and computers designated as search nodes;
- invoking at least one collection controller local to a collection node, said collection controller for traversing network resources assigned to said node and passing retrieved informative items to a token queue of one of the search nodes; and
- invoking at least one search controller local to a search node, said search controller to search tokens in the token queue and to remove a token from the token queue that duplicates a prior token in the token queue.
Dependent claims: 12, 13, 14
15. A computer-implemented method for gathering information from network resources on a global computer network, comprising:
- assigning search times to the network resources, the search times designating times at which the network resources are to be searched within a monitoring period;
- categorizing the network resources;
- generating search items via a system executive, each of the search items defining a search for particular information and designating at least one of the categorized network resources;
- identifying, at search time, the network resources that have been assigned the given search time and categorized;
- retrieving and storing information from the identified network resources; and
- performing at least one search defined by at least one search item on stored information.
Dependent claims: 16
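The scheduling steps of claim 15 (assigning search times within a monitoring period, then identifying the categorized resources due at a given search time) can be sketched as below. The hourly slots, the round-robin spreading, the function names, and the example host names are assumptions chosen for illustration; the claim does not specify how search times are distributed.

```python
def assign_search_times(resources, period_hours):
    """Assumed policy: spread resources round-robin over the hourly
    slots of one monitoring period. Returns {resource: slot}."""
    return {res: i % period_hours for i, res in enumerate(resources)}

def due_resources(schedule, categories, hour):
    """Resources assigned to this slot that have also been categorized."""
    return sorted(
        res for res, slot in schedule.items()
        if slot == hour and res in categories
    )

# Hypothetical resources and a two-hour monitoring period.
resources = ["news.example", "forum.example", "blog.example"]
schedule = assign_search_times(resources, period_hours=2)

# Only two of the three resources have been categorized so far.
categories = {"news.example": "news", "blog.example": "blog"}

due_at_0 = due_resources(schedule, categories, hour=0)
due_at_1 = due_resources(schedule, categories, hour=1)
```

In this example `forum.example` is scheduled for slot 1 but is skipped because it has no category, which mirrors the claim's requirement that identified resources be both assigned the given search time and categorized.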
17. A method for gathering information from network resources on a global computer network, the method comprising:
- categorizing the network resources;
- generating a set of search items via a system executive, each of the search items defining a search for particular information and designating at least one of the categorized network resources;
- retrieving and storing information from the categorized network resources designated by at least one of the search items;
- performing a search defined by the search items on stored information; and
- presenting results of the search.
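The relationship in claim 17 between search items, resource categories, and stored information can be sketched as follows. The `SearchItem` structure, the case-insensitive substring match, and all example data are assumptions introduced for illustration; the claim does not prescribe a matching method or a data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchItem:
    """Hypothetical search item: a phrase plus the categories it designates."""
    phrase: str
    categories: frozenset

def designated_resources(search_items, resource_categories):
    """Resources whose category is designated by at least one search item."""
    wanted = set().union(*(item.categories for item in search_items))
    return {url for url, cat in resource_categories.items() if cat in wanted}

def run_searches(search_items, resource_categories, stored_pages):
    """Apply each search item to the stored text of the resources it designates.
    Matching is an assumed case-insensitive substring test."""
    results = []
    for item in search_items:
        for url, text in stored_pages.items():
            in_scope = resource_categories.get(url) in item.categories
            if in_scope and item.phrase.lower() in text.lower():
                results.append((item.phrase, url))
    return results

# Hypothetical categorized resources and previously stored page text.
resource_categories = {
    "http://news.example": "news",
    "http://forum.example": "forum",
}
stored_pages = {
    "http://news.example": "Acme announces a merger",
    "http://forum.example": "Anyone heard about the Acme merger?",
}
items = [
    SearchItem("acme merger", frozenset({"forum"})),
    SearchItem("merger", frozenset({"news"})),
]

to_retrieve = designated_resources(items, resource_categories)
results = run_searches(items, resource_categories, stored_pages)
```

Searching the stored copies rather than the live resources keeps retrieval and searching as separate steps, as the claim recites them.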
Specification