Information presentation apparatus with meta-information management function
Abstract
High-speed information collection is achieved using a meta-information management unit that manages update information pertaining to documents retrieved by web robots. An information presentation apparatus, an information collection apparatus including a web robot, and a client are connected via a network. The information presentation apparatus presents document information stored in a document storage unit to the information collection apparatus. The information presentation apparatus has a meta-information management unit, which generates update information pertaining to individual documents, and a meta-information table, which records the update information. When the web robot makes a collection request to the information presentation apparatus, the meta-information management unit references the meta-information table, generates a list of updated documents from all of the stored documents and/or collection targets, and presents the list to the web robot, which subsequently retrieves only the updated documents.
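The list-generation step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `MetaInformationTable`, its methods, the logical clock, and the choice to track the previous collection request per robot are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetaInformationTable:
    """Hypothetical meta-information table (names are illustrative only)."""

    # document path -> logical time of its most recent modification
    modified_at: dict[str, int] = field(default_factory=dict)
    # robot name -> logical time of that robot's previous collection request
    last_request: dict[str, int] = field(default_factory=dict)
    # logical clock, advanced on every recorded modification (an assumption;
    # a real server might use file timestamps instead)
    clock: int = 0

    def record_modification(self, path: str) -> None:
        """Called by the monitoring function when a document changes."""
        self.clock += 1
        self.modified_at[path] = self.clock

    def updated_since_last_request(self, robot: str) -> list[str]:
        """Return the documents modified since this robot's previous request."""
        since = self.last_request.get(robot, 0)
        updated = sorted(p for p, t in self.modified_at.items() if t > since)
        self.last_request[robot] = self.clock
        return updated


table = MetaInformationTable()
table.record_modification("/a.html")
table.record_modification("/b.html")
first = table.updated_since_last_request("robot-1")   # both documents
second = table.updated_since_last_request("robot-1")  # nothing new
```

Because the robot receives only the paths in the returned list, a repeat request with no intervening modification yields an empty list and no document transfers.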
13 Citations
16 Claims
1. A web page server comprising:
a document storage unit that stores web pages;
a meta-information table that stores information including a status of each web page stored in the document storage unit;
an information collection table that stores names of web robots that monitor the content of the web page server; and
management software that provides the following functions:
monitoring modification of the web pages in the document storage unit;
updating the meta-information table based on the monitoring function;
creating, in response to a collection request by the web robots, a list of web pages in the document storage that have been modified since a previous collection request so that the collection request only retrieves previously unretrieved documents, thereby reducing the load on the web page server; and
retrieving the names of web robots from the information collection table when the monitoring function detects modification of web pages and transmitting a message to each web robot indicating that the web robot should issue a collection request.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An information system comprising:
a client that requests information including web pages;
an information collection apparatus that uses a web robot to retrieve web pages on a web page server, indexes the retrieved web pages, and provides a search facility for the client; and
a web page server comprising:
a document storage unit that stores web pages for access by the client and the web robot;
a meta-information table that stores information including a status of each web page stored in the document storage unit; and
managing software that provides the following functions:
monitoring modification of the web pages in the document storage unit;
updating the meta-information table based on the monitoring function; and
creating, in response to a collection request from the web robot, a list of web pages in the document storage that have been modified since a previous collection request so that the collection request only retrieves previously unretrieved documents, thereby reducing the load on the web page server.
9. A method of operating a web page server, comprising:
storing web pages;
storing status information of each stored web page;
storing names of web robots that monitor the content of the web page server;
monitoring modification of the stored web pages;
updating the status information based on the monitoring of the modification of the stored web pages;
creating, in response to a collection request by one of the web robots, a list of web pages in the document storage that have been modified since a previous collection request so that the collection request only retrieves previously unretrieved documents, thereby reducing the load on the web page server; and
retrieving the names of web robots when the monitoring function detects modification of web pages and transmitting a message to each web robot indicating that the web robot should issue a collection request.
Dependent claims: 10, 11, 12, 13, 14, 15
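The notification step recited in claim 9 might look like the sketch below. Everything named here is hypothetical: the `WebPageServer` class, the `send` callback standing in for whatever transport carries the message, and the `"COLLECT"` message text are assumptions for illustration; the claim specifies only that stored robot names are retrieved and that each robot is told to issue a collection request.

```python
from __future__ import annotations

from typing import Callable


class WebPageServer:
    """Hypothetical server holding an information collection table."""

    def __init__(self, send: Callable[[str, str], None]) -> None:
        # Information collection table: names of registered web robots.
        self.robots: list[str] = []
        # Assumed transport callback taking (robot_name, message).
        self.send = send

    def register_robot(self, name: str) -> None:
        """Store a robot name once, preserving registration order."""
        if name not in self.robots:
            self.robots.append(name)

    def on_modification_detected(self, paths: list[str]) -> None:
        """Tell every registered robot to issue a collection request."""
        if not paths:
            return  # nothing changed, so nobody is notified
        for name in self.robots:
            self.send(name, "COLLECT")


sent: list[tuple[str, str]] = []
server = WebPageServer(lambda name, message: sent.append((name, message)))
server.register_robot("robot-a")
server.register_robot("robot-b")
server.on_modification_detected(["/index.html"])
```

The message carries no document list in this sketch; the robot is expected to respond with a collection request, at which point the server builds the list of modified pages.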
16. A method of collecting information for a client from a web page server using a web robot, the method comprising:
storing web pages in the web page server for access by the client and the web robot;
storing information including a status of each web page stored in the web page server;
collecting information using a web robot to retrieve web pages on the web page server;
indexing the retrieved web pages to provide a search facility for the client;
providing an information request from the client to the web robot, the information request including a request for web pages;
monitoring modification of the web pages stored in the web page server;
maintaining modification information based on the monitoring of the modification of the stored web pages; and
creating, in response to a collection request from the web robot, a list of web pages stored in the web page server that have been modified since a previous collection request so that the collection request only retrieves previously unretrieved documents, thereby reducing the load on the web page server.
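The collecting and indexing steps of claim 16, seen from the robot's side, can be sketched as follows. The `CollectionApparatus` class, the `list_updated` and `fetch` callbacks, and the whitespace-tokenized inverted index are assumptions chosen to keep the example small; the claim does not prescribe any particular index structure.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable


class CollectionApparatus:
    """Hypothetical robot that fetches only updated pages and indexes them."""

    def __init__(
        self,
        list_updated: Callable[[], list[str]],  # assumed: asks server for the update list
        fetch: Callable[[str], str],            # assumed: retrieves one page's text
    ) -> None:
        self.list_updated = list_updated
        self.fetch = fetch
        self.index: defaultdict[str, set[str]] = defaultdict(set)  # word -> paths
        self.words_by_page: dict[str, set[str]] = {}               # path -> words

    def collect(self) -> list[str]:
        """Issue a collection request and re-index only the listed pages."""
        updated = self.list_updated()
        for path in updated:
            # Drop stale postings left over from the previous version of the page.
            for word in self.words_by_page.get(path, set()):
                self.index[word].discard(path)
            words = set(self.fetch(path).lower().split())
            self.words_by_page[path] = words
            for word in words:
                self.index[word].add(path)
        return updated

    def search(self, word: str) -> list[str]:
        """Search facility for the client: pages containing the given word."""
        return sorted(self.index.get(word.lower(), set()))
```

In this sketch, pages absent from the update list are never fetched again, which is where the reduction in server load would come from.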
Specification