Page information collection program, page information collection method, and page information collection apparatus
First Claim
1. A computer-readable storage medium having recorded thereon a page information collection program for collecting a set of pages associated by link information from a server on a network, the page information collection program causing a computer to execute the processing of:
acquiring contents of a page through the network in response to a page acquisition request and creating page information including the contents of the page and a response status code used for page acquisition, the creating the page information being automatically performed without user interaction;
taking the page information created as target page information, comparing an assignment determination condition defining the requirements of page information to be included in each group and the target page information, to find a group having the assignment determination condition satisfied by the target page information, and storing the target page information put into the group in a storage block, the comparing the assignment determination condition being automatically performed without user interaction;
creating an assignment determination condition satisfied by the target page information if the target page information does not satisfy the assignment determination condition of any group, creating a group corresponding to the created assignment determination condition, and storing the target page information put into the created group in the storage block, the creating the assignment determination condition and the creating the group being automatically performed without user interaction; and
extracting the link information only from the target page information put first into the group created and outputting the page acquisition request for acquiring the page based on the extracted link information, the extracting the link information and the outputting the page acquisition request being automatically performed without user interaction, wherein the link information is not extracted from the target page information put into already existing groups;
wherein the assignment determination condition includes a URL and an option to select a query conformity field;
wherein the assignment determination condition defines a requirement concerning the conformity of the page acquisition request and the response status code given when the contents of the page are acquired, the assignment determination condition is specific to each group, and contents of the assignment determination condition are individually changeable; and
wherein in the comparing the assignment determination condition and the target page information, page information acquired by different page acquisition requests and having different response status codes is determined to belong to different groups.
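As an illustration only, the assignment determination condition recited above might be modeled as a small record holding a URL, a response status code, and a switch for query conformity. The class name `AssignmentCondition`, its fields, and the reading of the "query conformity" option as "the query string must also match" are assumptions made for this sketch; the claim does not prescribe any data layout.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, parse_qsl


@dataclass
class AssignmentCondition:
    """Hypothetical per-group condition: URL, status code, optional query match."""
    url: str                    # request URL without its query string
    status_code: int            # response status code expected for the group
    match_query: bool = False   # assumed meaning of the "query conformity" option
    query: tuple = ()           # sorted (name, value) pairs, used if match_query

    def matches(self, request_url: str, status_code: int) -> bool:
        parts = urlsplit(request_url)
        base = f"{parts.scheme}://{parts.netloc}{parts.path}"
        # Pages with a different URL or a different status code
        # are never assigned to this group.
        if base != self.url or status_code != self.status_code:
            return False
        if self.match_query:
            return tuple(sorted(parse_qsl(parts.query))) == self.query
        return True


# Loose condition: any query string on /item with status 200 is accepted.
loose = AssignmentCondition("http://example.test/item", 200)
# Strict condition: the query string must be exactly id=1.
strict = AssignmentCondition("http://example.test/item", 200,
                             match_query=True, query=(("id", "1"),))
```

Because each group carries its own condition object, the query switch can be changed for one group without affecting the others, which is one plausible way to read "individually changeable".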
Abstract
A page information collection program for efficiently collecting pages required to verify a web site. When a page acquisition request is input, a page acquisition section acquires the contents of a page and creates page information including the contents of the page and communication information used to acquire the page. Next, a classification section stores the page information put into a group in accordance with an assignment determination condition. If the target page information does not satisfy the assignment determination condition of any group, a grouping section creates an assignment determination condition satisfied by the target page information and a corresponding group, and stores the page information put into the created group. A page acquisition request section outputs a page acquisition request based on the link information in the page information put into the group created by the grouping section, to the page acquisition section.
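The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the in-memory `SITE` table stands in for a real server, the grouping key (URL without query string plus status code) is one possible assignment determination condition, and the `requested` set that avoids re-queuing a URL is an addition of this sketch. All function names are hypothetical.

```python
import re
from collections import deque
from urllib.parse import urlsplit

# Hypothetical in-memory "server": URL -> (status code, HTML body).
SITE = {
    "http://example.test/": (
        200,
        '<a href="http://example.test/item?id=1">1</a>'
        '<a href="http://example.test/item?id=2">2</a>',
    ),
    "http://example.test/item?id=1": (200, '<a href="http://example.test/about">a</a>'),
    "http://example.test/item?id=2": (200, '<a href="http://example.test/hidden">h</a>'),
    "http://example.test/about": (200, ""),
}


def fetch(url):
    """Page acquisition: return page information (URL, status code, contents)."""
    status, body = SITE.get(url, (404, ""))
    return {"url": url, "status": status, "body": body}


def group_key(page):
    """Assumed condition: same URL ignoring the query string, same status code."""
    parts = urlsplit(page["url"])
    return (f"{parts.scheme}://{parts.netloc}{parts.path}", page["status"])


def collect(start_url):
    groups = {}                    # storage block: condition key -> page list
    queue = deque([start_url])     # pending page acquisition requests
    requested = {start_url}        # sketch-only guard against duplicate requests
    while queue:
        page = fetch(queue.popleft())
        key = group_key(page)
        if key in groups:
            # Classification: an existing group matches, so the page is
            # stored and its links are not extracted.
            groups[key].append(page)
            continue
        # Grouping: no group matches, so a new one is created and only
        # this first page has its links followed.
        groups[key] = [page]
        for link in re.findall(r'href="([^"]+)"', page["body"]):
            if link not in requested:
                requested.add(link)
                queue.append(link)
    return groups


groups = collect("http://example.test/")
```

In this toy site the two `item` pages fall into one group, so the link found only on the second `item` page is never requested. That is the intended saving: pages that look structurally alike are sampled once rather than crawled exhaustively.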
11 Claims
10. A page information collection method for collecting a set of pages associated by link information, from a server on a network, the page information collection method comprising:
acquiring contents of a page through the network in response to a page acquisition request and creating page information including the contents of the page and a response status code used for page acquisition, the creating the page information being automatically performed without user interaction;
taking the page information created as target page information, comparing an assignment determination condition defining the requirements of page information to be included in each group and the target page information, to find a group having the assignment determination condition satisfied by the target page information, and storing the target page information put into the group in a storage block, the comparing the assignment determination condition being automatically performed without user interaction;
creating an assignment determination condition satisfied by the target page information if the target page information does not satisfy the assignment determination condition of any group, creating a group corresponding to the created assignment determination condition, and storing the target page information put into the created group in the storage block, the creating the assignment determination condition and the creating the group being automatically performed without user interaction; and
extracting the link information only from the target page information put first into the group created and outputting the page acquisition request for acquiring the page based on the extracted link information, the extracting the link information and the outputting the page acquisition request being automatically performed without user interaction, wherein the link information is not extracted from the target page information put into already existing groups;
wherein the assignment determination condition includes a URL and an option to select a query conformity field;
wherein the assignment determination condition defines a requirement concerning the conformity of the page acquisition request and the response status code given when the contents of the page are acquired, the assignment determination condition is specific to each group, and contents of the assignment determination condition are individually changeable; and
wherein in the comparing the assignment determination condition and the target page information, page information acquired by different page acquisition requests and having different response status codes is determined to belong to different groups.
11. A page information collection apparatus for collecting a set of pages associated by link information, from a server on a network, the page information collection apparatus comprising:
a page acquisition means for acquiring the contents of a page through the network in response to a page acquisition request and creating page information including contents of the page and a response status code used for page acquisition, the creating the page information being automatically performed without user interaction;
a classification means for taking the page information created by the page acquisition means as target page information, comparing an assignment determination condition defining the requirements of page information to be included in each group and the target page information, to find a group having the assignment determination condition satisfied by the target page information, and storing the target page information put into the group in a storage means, the comparing the assignment determination condition being automatically performed without user interaction;
a grouping means for creating an assignment determination condition satisfied by the target page information if the target page information does not satisfy the assignment determination condition of any group, creating a group corresponding to the created assignment determination condition, and storing the target page information put into the created group in the storage means, the creating the assignment determination condition and the creating the group being automatically performed without user interaction; and
a page acquisition request means for extracting the link information only from the target page information put first into the group created by the grouping means and outputting the page acquisition request for acquiring the page based on the extracted link information to the page acquisition means, the extracting the link information and the outputting the page acquisition request being automatically performed without user interaction, wherein the link information is not extracted from the target page information put into already existing groups;
wherein the assignment determination condition includes a URL and an option to select a query conformity field;
wherein the assignment determination condition defines a requirement concerning the conformity of the page acquisition request and the response status code given when the contents of the page are acquired, the assignment determination condition is specific to each group, and contents of the assignment determination condition are individually changeable; and
wherein in the comparing the assignment determination condition and the target page information, page information acquired by different page acquisition requests and having different response status codes is determined to belong to different groups.
Specification