DOCUMENT PROCESSING DEVICE AND DOCUMENT PROCESSING METHOD
Abstract
An object of the present invention is to provide a document processing device and document processing method that can provide a search result satisfactory to a user with respect to WWW documents for which the number of links among WWW documents is low and the number of accesses by users is low. An access pattern collection unit 101 generates an access user vector uj of one WWW document Dj and an access user vector uje of another document Dje. A user similarity computing unit 105 computes a document similarity sim(uj, uje) which indicates a user similarity between the WWW document Dj and the WWW document Dje. A keyword vector smoothing unit 106 acquires a smoothed keyword weight vector w′j by correcting a keyword weight vector wj of the one document, using the computed document similarity sim(uj, uje). A rearranging unit 110 calculates an evaluation value B_SCORE for input information for searching, based on the smoothed keyword weight vector w′j.
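The flow described in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulas, so the following are assumptions made only for illustration: sim(uj, uje) is taken to be cosine similarity of binary access user vectors, the smoothed vector w′j is taken to be a similarity-weighted average of the keyword weight vectors, and the evaluation value is taken to be a dot product with the query. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def smooth_keyword_vector(j, access, weights):
    """Return an assumed w'_j: keyword vectors averaged with weights sim(u_j, u_je)."""
    sims = [cosine(access[j], u) for u in access]
    total = sum(sims)
    if total == 0:
        # No access information for document j: leave its vector unchanged.
        return list(weights[j])
    n_keywords = len(weights[j])
    return [
        sum(s * w[k] for s, w in zip(sims, weights)) / total
        for k in range(n_keywords)
    ]


def evaluation_value(query, w):
    """Assumed evaluation value: dot product of the query and a keyword vector."""
    return sum(q * x for q, x in zip(query, w))


# Rows are documents, columns are users (1 = the user accessed the document).
access = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
# Rows are documents, columns are keywords.
weights = [[1.0, 0.0], [0.0, 1.0], [0.2, 0.8]]

smoothed = smooth_keyword_vector(0, access, weights)
print(smoothed, evaluation_value([0.0, 1.0], smoothed))
```

In this toy data, documents 0 and 1 are accessed by the same users, so document 0 picks up weight for the second keyword even though its own keyword vector has none. This is the effect the abstract attributes to smoothing for documents with few links and few accesses.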
23 Claims
1. A document processing method, comprising:
a collection step of collecting access history of a user;
a document similarity computing step of computing a document similarity, which indicates similarity between documents, by one user pattern which indicates a plurality of users who have accessed one document and another user pattern which indicates a plurality of users who have accessed another document, according to the access history collected in the collection step;
a keyword weight vector correction step of correcting a keyword weight vector of the one document using the document similarity computed in the document similarity computing step; and
an evaluation value calculation step of calculating an evaluation value for input information for searching, based on the keyword weight vector corrected in the keyword weight vector correction step.
(Dependent claims: 2, 3, 4, 5, 6, 7, 9)
8. A document processing method, comprising:
a collection step of collecting access history of a user;
a document similarity computing step of computing a document similarity, which indicates similarity between documents, by one user pattern which indicates a plurality of users who have accessed one document and another user pattern which indicates a plurality of users who have accessed another document, according to the access history collected in the collection step;
a keyword weight vector correction step of correcting a keyword weight vector of the one document using the document similarity computed in the document similarity computing step;
an acquisition step of acquiring significance information which indicates a significance attached to each document;
a significance correction step of distinguishing a first user pattern which indicates users who have accessed one document during a first time period, and a second user pattern which indicates users who have accessed one document during a second time period, according to the access history of users collected in the collection step, and correcting the significance of the one document based on the similarity of the first user pattern and the second user pattern and a number of accesses to the one document; and
an evaluation value calculation step of calculating an evaluation value for input information for searching, based on the keyword weight vector corrected in the keyword weight vector correction step, and the significance information corrected in the significance correction step.
10. A document processing method, comprising:
a first generation step of generating a user profile based on a keyword weight vector that is to be a reference value;
a second generation step of generating a new keyword weight vector based on the user profile generated in the first generation step and the keyword weight vector that is to be a reference value;
a third generation step of generating the new user profile based on the new keyword weight vector generated in the second generation step;
a user profile similarity generation step of computing similarity between the new user profile generated in the third generation step and the user profile generated immediately before the new user profile; and
an evaluation value calculation step of calculating an evaluation value based on the similarity computed in the user profile similarity generation step, the keyword weight vector and user profile.
(Dependent claims: 11)
12. A document processing device, comprising:
access history collection means for collecting access history of a user;
document similarity computing means for computing a document similarity, which indicates similarity between documents, by a user pattern which indicates a plurality of users who have accessed one document and a user pattern which indicates a plurality of users who have accessed another document, according to the access history collected by the collection means;
keyword weight vector correction means for correcting a keyword weight vector of the one document, using the document similarity computed by the document similarity computing means; and
evaluation value calculation means for calculating an evaluation value for input information for searching, based on the keyword weight vector corrected by the keyword weight vector correction means.
(Dependent claims: 13)
14. A document processing program, comprising:
a collection module for collecting access history of a user;
a document similarity computing module for computing a document similarity which indicates similarity between documents, by a user pattern which indicates a plurality of users who have accessed one document and a user pattern which indicates a plurality of users who have accessed another document, according to the access history collected by the collection module;
a keyword weight vector correction module for correcting a keyword weight vector of the one document, using the document similarity computed by the document similarity computing module; and
an evaluation value calculation module for calculating an evaluation value for input information for searching, based on the keyword weight vector corrected by the keyword weight vector correction module.
15. A document processing device, comprising:
primary WWW document extraction means for extracting WWW documents according to a searching word;
user extraction means for extracting a user set of users who have accessed the WWW documents extracted by the primary WWW document extraction means;
secondary WWW document extraction means for extracting a WWW document set of WWW documents accessed by the users extracted by the user extraction means; and
significance calculation means for calculating significance of the WWW documents extracted by the primary WWW document extraction means based on a degree of accesses by users to the WWW document set extracted by the secondary WWW document extraction means.
(Dependent claims: 16)
17. A document processing device, comprising:
primary WWW document extraction means for extracting WWW documents according to a searching method;
user extraction means for extracting a user set of users who accessed the WWW documents extracted by the primary WWW document extraction means;
data structure holding means for holding data for which reference relationships among the WWW documents can be managed as a directed graph;
secondary WWW document extraction means for extracting other WWW documents which each WWW document extracted by the primary WWW document extraction means refers to, and other WWW documents which refer to each WWW document, based on the data stored in the data structure holding means; and
significance calculation means for calculating significance of the WWW documents extracted by the primary WWW document extraction means based on a degree of accesses by the users extracted by the user extraction means to the WWW document set extracted by the secondary WWW document extraction means.
18. A document processing device, comprising:
access history holding means for holding an access history to a WWW document by a plurality of users;
data structure holding means for holding data for which reference relationships among WWW documents can be managed as a directed graph;
primary WWW document extraction means for extracting WWW documents according to a searching word;
user extraction means for extracting a user set of users who have accessed the WWW documents extracted by the primary WWW document extraction means from the access history holding means;
secondary WWW document extraction means for extracting other WWW documents which each WWW document extracted by the primary WWW document extraction means refers to, and other WWW documents which refer to each of the WWW documents, based on the data stored in the data structure holding means, and extracting one node set by adding the user set extracted by the user extraction means and the WWW document set of the extracted WWW documents; and
significance calculation means for calculating significance of the WWW documents by weighting a degree of being referred to among the WWW documents in the node set extracted by the secondary WWW document extraction means and a degree of accesses by each of the users to each of the WWW documents respectively.
19. A document processing device, comprising:
data structure holding means for holding data for which reference relationships among WWW documents can be managed as a directed graph;
primary WWW document extraction means for extracting WWW documents according to a searching word;
user extraction means for extracting a user set of users who have accessed the WWW documents extracted by the extraction means from the access history holding means;
secondary WWW document extraction means for extracting other WWW documents which each WWW document extracted by the primary WWW document extraction means refers to, and other WWW documents which refer to each of the WWW documents, based on the data stored in the data structure holding means;
hub score calculation means for calculating a hub score indicating a degree of accesses by each user of the user set extracted by the user extraction means to each WWW document extracted by the secondary WWW document extraction means; and
significance calculation means for calculating significance based on a degree of matching of a visit vector of users who have visited a WWW document included in any of the WWW documents and the hub score calculated by the hub score calculation means.
20. A document processing method, comprising:
a primary WWW document extraction step of extracting WWW documents according to a searching word;
a user extraction step of extracting a user set of users who have accessed the WWW documents extracted in the primary WWW document extraction step;
a secondary WWW document extraction step of extracting a WWW document set of WWW documents accessed by the users extracted in the user extraction step; and
a significance calculation step of calculating significance of the WWW documents extracted in the primary WWW document extraction step based on a degree of accesses by the users to the WWW document set extracted in the secondary WWW document extraction step.
21. A document processing method for a document processing device having data structure holding means for holding data for which reference relationships among WWW documents can be managed as a directed graph, the method comprising:
a primary WWW document extraction step of extracting WWW documents according to a searching word;
a user extraction step of extracting a user set of users who have accessed the WWW documents extracted in the primary WWW document extraction step;
a secondary WWW document extraction step of extracting other WWW documents which each WWW document extracted in the primary WWW document extraction step refers to, and other WWW documents which refer to each WWW document, based on the data stored in the data structure holding means; and
a significance calculation step of calculating significance of the WWW documents extracted in the primary WWW document extraction step based on a degree of accesses by the users extracted in the user extraction step to the WWW document set extracted in the secondary WWW document extraction step.
22. A document processing method for a document processing device having access history holding means for holding history of access to a WWW document by a plurality of users, and data structure holding means for holding data for which reference relationships among WWW documents can be managed as a directed graph, the method comprising:
a primary WWW document extraction step of extracting WWW documents according to a searching word;
a user extraction step of extracting a user set of users who have accessed the WWW documents extracted in the primary WWW document extraction step from the access history holding means;
a secondary WWW document extraction step of extracting other WWW documents which each WWW document extracted in the primary WWW document extraction step refers to, and other WWW documents which refer to each of the WWW documents, based on the data stored in the data structure holding means, and extracting one node set by adding the user set extracted in the user extraction step and the WWW document set of the extracted WWW documents; and
a significance calculation step of calculating significance of the WWW documents by weighting a degree of being referred to among the WWW documents in the node set extracted in the secondary WWW document extraction step and a degree of accesses by each of the users to each of the WWW documents respectively.
23. A document processing method for a document processing device having access history holding means for holding history of access to a WWW document by a plurality of users, and data structure holding means for holding data for which reference relationships among WWW documents can be managed as a directed graph, the method comprising:
a primary WWW document extraction step of extracting WWW documents according to a searching word;
a user extraction step of extracting a user set of users who have accessed the WWW documents extracted in the primary WWW document extraction step from the access history holding means;
a secondary WWW document extraction step of extracting other WWW documents which each WWW document extracted in the primary WWW document extraction step refers to, and other WWW documents which refer to each of the WWW documents, based on the data stored in the data structure holding means;
a hub score calculation step of calculating a hub score which indicates a degree of accesses by each user of the user set extracted in the user extraction step to each WWW document extracted in the secondary WWW document extraction step; and
a significance calculation step of calculating significance based on a degree of matching of a visit vector of users who have visited a WWW document included in any of the WWW documents and the hub score calculated in the hub score calculation step.
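The extraction steps recited in claims 15 and 20 (primary documents, then users, then secondary documents, then significance) can be sketched as follows. The claims do not fix how the "degree of accesses" is measured, so this sketch assumes one possible reading: each extracted user is scored by how many documents of the secondary set that user accessed, and a primary document's significance is the sum of the scores of the users who accessed it. The function names, the whitespace-token keyword match, and the sample data are hypothetical.

```python
def primary_documents(word, doc_text):
    """Primary extraction: documents whose text contains the searching word."""
    return {d for d, text in doc_text.items() if word in text.split()}


def significance(word, doc_text, access_log):
    """Return {primary document: significance} under the assumed scoring rule."""
    accesses = set(access_log)  # ignore repeated (user, document) pairs
    primary = primary_documents(word, doc_text)

    # User extraction: users who accessed at least one primary document.
    users = {u for u, d in accesses if d in primary}

    # Secondary extraction: every document accessed by the extracted users.
    secondary = {d for u, d in accesses if u in users}

    # Assumed "degree of accesses": number of secondary documents per user.
    activity = {u: sum(1 for v, d in accesses if v == u and d in secondary)
                for u in users}

    # Significance of a primary document: total activity of its visitors.
    return {p: sum(activity[u] for u, d in accesses if d == p) for p in primary}


doc_text = {
    "a": "patent search engine",
    "b": "search ranking",
    "c": "cooking recipes",
    "d": "ranking models",
}
access_log = [("u1", "a"), ("u1", "c"), ("u1", "d"), ("u2", "b"), ("u3", "c")]

print(significance("search", doc_text, access_log))
```

Under this reading, document "a" scores higher than "b" because its visitor u1 is active across more of the secondary set. Other weightings would satisfy the claim language equally well.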
Specification