Method and system for gathering information resident on global computer networks
Abstract
A method and system for confidentially accessing and reporting information present on global computer networks. The present invention deterministically analyzes a set of network resources over a configurable monitoring period, thereby guaranteeing that recently published information is retrieved. The present invention includes a scalable software system that can be readily executed on a stand-alone computing system or distributed across a network of computing devices. At the end of each monitoring period, the present invention balances the traversal and searching of network resources across the computing devices in the distributed system according to the number of pages previously retrieved for each network resource, thereby more accurately balancing the system. Furthermore, in order to reduce system resource requirements, the present invention searches only those network resources that are targeted either individually or as an industry. In addition, the present invention conserves computing resources by not searching documents or files that have already matched search criteria and have remained unchanged.
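The balancing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which balancing algorithm is used, so the greedy heuristic below (largest page count first, placed on the currently least-loaded device) is an assumption, and the names `balance_resources`, `page_counts`, and `num_devices` are hypothetical.

```python
import heapq

def balance_resources(page_counts, num_devices):
    """Assign resources to devices using the previous period's page counts.

    Greedy heuristic (an assumption, not taken from the patent): place the
    largest resources first, each on the currently least-loaded device.
    """
    # Heap entries are (total_pages_assigned, device_index).
    heap = [(0, device) for device in range(num_devices)]
    heapq.heapify(heap)
    assignment = {device: [] for device in range(num_devices)}
    # Sort by descending page count; ties are broken by resource name.
    ordered = sorted(page_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for resource, pages in ordered:
        load, device = heapq.heappop(heap)
        assignment[device].append(resource)
        heapq.heappush(heap, (load + pages, device))
    return assignment

# Hypothetical page counts recorded during the previous monitoring period.
previous_pages = {"site-a": 900, "site-b": 500, "site-c": 400, "site-d": 100}
plan = balance_resources(previous_pages, num_devices=2)
```

With these illustrative counts, one device is assigned 1000 pages and the other 900, which is closer to even than splitting the four resources in alphabetical pairs (1400 and 500).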
14 Claims
1. A computer-implemented method for gathering information from network resources on a global computer network, the method comprising:
assigning search times to the network resources, the search times designating times at which the network resources are to be searched within a monitoring period;
categorizing the network resources into industry groups;
generating search items, each of the search items defining a search for particular information and designating one or more of the industry groups;
identifying, at a given search time, the network resources that have been assigned the given search time and categorized into industry groups designated by one or more of the search items;
retrieving and storing information from the identified network resources; and
performing the searches defined by one or more of the search items on the stored information.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
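The steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: search times are modeled as integer slots within the monitoring period, a search is modeled as a plain substring match, and `Resource`, `SearchItem`, `identify_due`, `run_slot`, and the `fetch` callback are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resource:
    url: str
    industry: str      # industry group the resource is categorized into
    search_time: int   # assumed: an integer slot within the monitoring period

@dataclass(frozen=True)
class SearchItem:
    term: str              # the particular information to look for
    industries: frozenset  # industry groups designated by this search item

def identify_due(resources, items, now):
    """Resources assigned to slot `now` whose industry group is designated
    by at least one search item."""
    wanted = set().union(*(item.industries for item in items))
    return [r for r in resources if r.search_time == now and r.industry in wanted]

def run_slot(resources, items, now, fetch):
    """Retrieve and store the due resources, then run each applicable search."""
    due = identify_due(resources, items, now)
    stored = {r.url: fetch(r.url) for r in due}   # retrieve and store
    industry_of = {r.url: r.industry for r in due}
    hits = []
    for item in items:
        for url, text in stored.items():
            # Assumed search semantics: a simple substring match.
            if industry_of[url] in item.industries and item.term in text:
                hits.append((item.term, url))
    return hits

# Hypothetical data; the URLs and page texts are placeholders.
resources = [
    Resource("http://example.com/steel", "metals", 3),
    Resource("http://example.com/bank", "finance", 3),
    Resource("http://example.com/ore", "metals", 7),
]
items = [SearchItem("merger", frozenset({"metals"}))]
pages = {
    "http://example.com/steel": "merger announced",
    "http://example.com/bank": "merger rumors",
    "http://example.com/ore": "merger talks",
}
hits = run_slot(resources, items, now=3, fetch=lambda url: pages[url])
```

Only the steel page is fetched in slot 3: the bank page belongs to an industry group that no search item designates, and the ore page is assigned a different search time.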
performing the identification and retrieval and storage operations periodically;
analyzing portions of the newly stored information for changes relative to corresponding portions of the information retrieved in a previous retrieval and storage operation; and
performing the searches defined by one or more of the search items on only the portion of the newly stored information that has changed.
7. The method of claim 6, wherein:
the analysis operation includes calculating a checksum for portions of the newly stored information and determining whether the checksum has changed relative to a checksum previously calculated for a corresponding portion of the information retrieved in the previous retrieval and storage operation, and the search performance operation includes performing the searches defined by one or more of the search items on only the portion of the newly stored information for which the checksum has changed.
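The checksum comparison of claim 7 might look like the sketch below. The claim does not name a checksum algorithm, so the use of SHA-256 from Python's `hashlib` is an assumption, as are the function name `changed_portions` and the choice to key each portion by its path.

```python
import hashlib

def changed_portions(portions, previous_checksums):
    """Return the portions whose checksum differs from the previous run,
    together with the new checksums to keep for the next run."""
    checksums = {
        key: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for key, text in portions.items()
    }
    changed = {
        key: portions[key]
        for key, digest in checksums.items()
        if previous_checksums.get(key) != digest  # new or modified portion
    }
    return changed, checksums

# First retrieval: nothing is stored yet, so every portion counts as changed.
first = {"/index.html": "old news", "/about.html": "about us"}
changed_1, sums_1 = changed_portions(first, {})

# Second retrieval: only /index.html differs, so only it is searched again.
second = {"/index.html": "fresh news", "/about.html": "about us"}
changed_2, sums_2 = changed_portions(second, sums_1)
```

Searches would then be run over `changed_2` only, which is how unchanged portions are skipped.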
8. The method of claim 1, further comprising:
storing the search items and resource identifiers that identify network resources in a database;
retrieving from the database, at a given search time, the search items and resource identifiers corresponding to the network resources that have been assigned the given search time;
identifying, at the given search time, the retrieved resource identifiers corresponding to the network resources that have been designated by one or more of the search items;
retrieving and storing information from the network resources corresponding to the retrieved resource identifiers that have been identified; and
performing the searches defined by one or more of the search items on the stored information.
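The database lookup of claim 8 can be sketched with an in-memory SQLite database. The schema is hypothetical: the table names `resources`, `search_items`, and `item_targets`, the integer search-time slots, and the helper `due_work` are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resources (id TEXT PRIMARY KEY, search_time INTEGER);
    CREATE TABLE search_items (id INTEGER PRIMARY KEY, term TEXT);
    -- Which resources each search item designates (assumed link table).
    CREATE TABLE item_targets (item_id INTEGER, resource_id TEXT);
""")
conn.executemany("INSERT INTO resources VALUES (?, ?)", [
    ("http://example.com/a", 2),
    ("http://example.com/b", 2),
    ("http://example.com/c", 5),
])
conn.executemany("INSERT INTO search_items VALUES (?, ?)", [
    (1, "recall"),
    (2, "lawsuit"),
])
conn.executemany("INSERT INTO item_targets VALUES (?, ?)", [
    (1, "http://example.com/a"),
    (2, "http://example.com/a"),
    (2, "http://example.com/c"),
])

def due_work(conn, search_time):
    """Map each resource that is due at `search_time` and designated by at
    least one search item to the search terms that apply to it."""
    rows = conn.execute("""
        SELECT r.id, s.term
        FROM resources AS r
        JOIN item_targets AS t ON t.resource_id = r.id
        JOIN search_items AS s ON s.id = t.item_id
        WHERE r.search_time = ?
        ORDER BY r.id, s.term
    """, (search_time,)).fetchall()
    work = {}
    for resource_id, term in rows:
        work.setdefault(resource_id, []).append(term)
    return work

work_at_2 = due_work(conn, 2)
```

Resource `b` shares search time 2 but is left out of `work_at_2` because no search item designates it, so it would not be retrieved.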
9. A method for gathering information from network resources on a global computer network, the method comprising:
assigning search times to the network resources, the search times designating times at which the network resources are to be searched within a monitoring period;
generating search items, each of the search items defining a search for particular information and designating one or more of the network resources;
identifying, at a given one of the search times, the network resources that have been assigned the given search time and which are designated by one or more of the search items;
retrieving and storing information from the identified network resources, whereby information from the network resources that have not been assigned the given search time or are not designated by one or more of the search items is not retrieved and stored; and
performing the searches defined by one or more of the search items on the stored information.
10. A method for gathering information from network resources on a global computer network, the method comprising:
generating a set of search items, each of the search items defining a search for particular information and designating one or more of the network resources;
retrieving and storing information from the network resources designated by one or more of the search items;
performing the searches defined by one or more of the search items on the stored information; and
presenting results of the searches.
11. A method for gathering information from network resources on a global computer network, the method comprising:
categorizing the network resources into industry groups;
generating a set of search items, each of the search items defining a search for particular information and designating one or more of the industry groups;
retrieving and storing information from the network resources associated with the industry groups designated by one or more of the search items;
performing the searches defined by the search items on the stored information; and
presenting results of the searches.
12. A method for gathering information from network resources on a global computer network, the method comprising:
selecting a set of network resources residing on the global computer network;
assigning a search time to each of the network resources, the search time indicating a time within a monitoring period in which the network resource is to be searched;
generating a set of search items, each of the search items defining parameters for a search and designating one or more of the network resources to be searched;
determining, at approximately the search time for each of the network resources, whether the respective network resource is designated for searching by at least one of the search items;
retrieving and storing information from the network resources designated by at least one of the search items;
performing the searches defined by the search items on the stored information; and
presenting results of the searches to users.
13. A software system for monitoring network resources residing on a global computer network over a time interval, the system comprising:
a database storing resource identifiers that correspond to particular network resources, and search items that define a search for information and specify one or more of the network resources;
a system executive that constructs a set of the resource identifiers scheduled to be searched, and a set of the search items specifying at least one of the network resources corresponding to one of the resource identifiers of the constructed resource identifier set;
a collection controller, for each of the resource identifiers of the constructed set of resource identifiers, the collection controller retrieving information presented by the networked resource corresponding to the resource identifier;
a search controller for receiving the information retrieved by each of the collection controllers; and
a search instance, for each search item of the search item list, wherein the search controller instantiates each search instance to perform the search defined by the respective search item on the information received from the collection controllers for the network resource specified by the respective search item.
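One way to picture how the components of claim 13 relate is the sketch below. It is only an illustration: the class names mirror the claim terms, but every method, the dictionary standing in for the database, the `fetch` callback, and the substring-match search are assumptions.

```python
class SearchInstance:
    """Performs the search defined by one search item."""
    def __init__(self, term, resource_ids):
        self.term = term
        self.resource_ids = set(resource_ids)

    def matches(self, text):
        return self.term in text  # assumed search semantics: substring match

class SearchController:
    """Receives retrieved information and runs the applicable instances."""
    def __init__(self, search_items):
        # One search instance is instantiated per search item.
        self.instances = [SearchInstance(term, ids) for term, ids in search_items]
        self.hits = []

    def receive(self, resource_id, text):
        for instance in self.instances:
            if resource_id in instance.resource_ids and instance.matches(text):
                self.hits.append((instance.term, resource_id))

class CollectionController:
    """Retrieves the information presented by one network resource."""
    def __init__(self, resource_id, fetch, search_controller):
        self.resource_id = resource_id
        self.fetch = fetch
        self.search_controller = search_controller

    def collect(self):
        self.search_controller.receive(self.resource_id, self.fetch(self.resource_id))

class SystemExecutive:
    """Builds the set of scheduled resources and the matching search items."""
    def __init__(self, database, fetch):
        self.database = database  # hypothetical: a dict standing in for the database
        self.fetch = fetch

    def run(self, search_time):
        due = [rid for rid, slot in self.database["schedule"].items() if slot == search_time]
        items = [(term, ids) for term, ids in self.database["search_items"]
                 if set(ids) & set(due)]
        controller = SearchController(items)
        for rid in due:  # one collection controller per scheduled resource
            CollectionController(rid, self.fetch, controller).collect()
        return controller.hits

# Hypothetical data standing in for the database and the network.
database = {
    "schedule": {"a": 1, "b": 1, "c": 2},
    "search_items": [("recall", ["a"]), ("patent", ["c"])],
}
pages = {"a": "product recall issued", "b": "nothing to report", "c": "patent granted"}
hits = SystemExecutive(database, pages.get).run(1)
```

The sketch runs sequentially in one process; the abstract's mention of distributing work across computing devices is not modeled here.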
14. A method for monitoring information presented by at least one of a plurality of networked computers comprising:
storing a plurality of identifiers, wherein each identifier corresponds to one of the plurality of networked computers;
storing a plurality of search items, wherein each search item includes search criteria and at least one networked computer to be monitored;
generating a set of identifiers to be searched;
generating a set of search items monitoring at least one of the networked computers corresponding to one of the identifiers of the identifier set;
retrieving information presented by each of the networked computers corresponding to an identifier of the identifier set; and
searching the retrieved information according to the search criteria of each search item of the search item set monitoring the networked computer corresponding to the retrieved information.
Specification