Method and apparatus for predicting document access in a collection of linked documents featuring link probabilities and spreading activation
First Claim
1. A method for predicting document access within a collection of linked documents comprising the steps of:
a) gathering usage data for said collection of linked documents;
b) generating initial activation information, said initial activation information indicating a set of focus documents in said collection of linked documents;
c) generating page to page transition information from said usage data, said page to page transition information indicating a strength of association between documents in said collection of linked documents;
d) generating link probability information from said usage data, said link probability information indicating a distribution of the number of documents a user will access in said collection of linked documents;
e) performing a spreading activation operation based on said initial activation information, page to page transition information and said probability information based on a network representation of said collection of linked documents; and
f) extracting said document access information resulting from said spreading activation step when a stable pattern of activation across all nodes of said network representation of said collection of linked documents is reached.
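Steps a), c) and d) can be illustrated with a minimal sketch, assuming the usage data has already been reduced to per-visit click paths (lists of page names). That input format and the function names `transition_strengths` and `path_length_distribution` are hypothetical and are not taken from the claim. The sketch estimates page to page transition strengths as conditional frequencies, and the distribution of the number of documents accessed per visit as an empirical histogram.

```python
from collections import Counter, defaultdict


def transition_strengths(sessions):
    """Estimate P(next page | current page) from observed click paths.

    `sessions` is an assumed input format: one list of page names per visit.
    """
    counts = defaultdict(Counter)
    for path in sessions:
        # Count each consecutive (source, destination) pair in the path.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(outgoing.values()) for dst, n in outgoing.items()}
        for src, outgoing in counts.items()
    }


def path_length_distribution(sessions):
    """Empirical probability that a visit touches exactly L pages."""
    lengths = Counter(len(path) for path in sessions)
    total = len(sessions)
    return {length: n / total for length, n in sorted(lengths.items())}
```

Both functions normalize raw counts, so sparse logs give noisy estimates. A real system would likely need smoothing or a fitted distribution, which this sketch leaves out.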
Abstract
A method and apparatus for predicting document access within a collection of linked documents. The present invention utilizes a predictive technique known as "spreading activation" where document collections are graphically represented as a network. Empirical data is analyzed according to a law of surfing to generate a decay function which is used to dampen the activation as it spreads through the network. Activation is applied to a set of focus documents and propagates through the network until a stable pattern of activation is achieved across all documents. From this stable pattern, the desired usage information is extracted. Such a system will provide several practical benefits to users of the World Wide Web. For example, the present invention can be used to identify pages relevant to a set of one or more focus pages or to predict the number of times a document will be accessed in a collection of linked documents. Further, alone or in combination, this information can be used in connection with web site design or re-design.
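The propagation described in the abstract can be sketched as follows. This is one possible reading and not the procedure specified by the patent. It assumes pages are numbered from 0, that `transitions[src][dst]` holds a transition strength, and that `continue_prob[k]` is the probability that a user follows another link at depth k, which plays the role of the decay function. All of these names are hypothetical. Activation starts on the focus pages, each wave is damped before it is passed along the links, and the loop stops once the wave has effectively died out, leaving a stable pattern.

```python
def spread_activation(transitions, initial, continue_prob, tol=1e-12, max_steps=1000):
    """Propagate damped activation from focus pages until it settles.

    transitions:   {src_index: {dst_index: strength}} (assumed format)
    initial:       list of starting activation per page (focus pages > 0)
    continue_prob: damping factor per depth; depths past its end use 0.0
    """
    n = len(initial)
    wave = list(initial)    # activation arriving at the current depth
    total = list(initial)   # accumulated activation per page
    for step in range(max_steps):
        damp = continue_prob[step] if step < len(continue_prob) else 0.0
        nxt = [0.0] * n
        for src, level in enumerate(wave):
            if level == 0.0:
                continue
            for dst, strength in transitions.get(src, {}).items():
                nxt[dst] += damp * level * strength
        if sum(nxt) < tol:
            break           # nothing left to spread: the pattern is stable
        total = [t + x for t, x in zip(total, nxt)]
        wave = nxt
    return total


# A three-page chain 0 -> 1 -> 2 with page 0 as the only focus page.
pattern = spread_activation({0: {1: 1.0}, 1: {2: 1.0}}, [1.0, 0.0, 0.0], [0.5, 0.5])
```

In the chain example each hop halves the activation, so pages further from the focus page end up with less of it.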
8 Claims
1. A method for predicting document access within a collection of linked documents comprising the steps of:
a) gathering usage data for said collection of linked documents;
b) generating initial activation information, said initial activation information indicating a set of focus documents in said collection of linked documents;
c) generating page to page transition information from said usage data, said page to page transition information indicating a strength of association between documents in said collection of linked documents;
d) generating link probability information from said usage data, said link probability information indicating a distribution of the number of documents a user will access in said collection of linked documents;
e) performing a spreading activation operation based on said initial activation information, page to page transition information and said probability information based on a network representation of said collection of linked documents; and
f) extracting said document access information resulting from said spreading activation step when a stable pattern of activation across all nodes of said network representation of said collection of linked documents is reached.
Dependent claims: 2, 3
4. A method for modeling document access in a collection of linked documents based on prior usage information of said collection, said method comprising the steps of:
a) generating an initial activation matrix, said initial activation matrix indicating a set of focus documents in said collection of linked documents;
b) generating a transition matrix using said prior usage information, said transition matrix indicating user traversal information between documents in said collection of linked documents;
c) generating a link probability vector from said prior usage information, said link probability vector indicating probabilities that a user will link to another document in said document collection;
d) performing a spreading activation operation using said initial activation matrix, transition matrix and link probability vector to obtain a spreading activation result; and
e) extracting document access information from said spreading activation result.
Dependent claims: 5, 6
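Step e) does not say how the access information is read off the spreading activation result. One use named in the abstract is identifying pages relevant to a set of focus pages. The sketch below illustrates that use under assumptions: the result is taken to be a mapping from page name to final activation level, and the function name `rank_related_pages` and its `top_k` parameter are hypothetical.

```python
def rank_related_pages(activation, focus, top_k=3):
    """Return the most activated non-focus pages, strongest first.

    activation: {page_name: final activation level} (assumed format)
    focus:      set of page names that were used as the activation sources
    """
    candidates = [
        (page, level)
        for page, level in activation.items()
        if page not in focus and level > 0.0
    ]
    # Sort by descending activation; break ties by name for a stable order.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [page for page, _ in candidates[:top_k]]
```

Focus pages are excluded because they received their activation directly, and pages with zero activation are dropped because the spreading step never reached them.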
7. A system for predicting document access within a collection of linked documents comprising:
means for gathering usage data for said collection of linked documents;
means for generating initial activation information, said initial activation information indicating a set of focus documents in said collection of linked documents;
means for generating page to page transition information from said usage data, said page to page transition information indicating a strength of association between documents in said collection of linked documents;
means for generating link probability information from said usage data, said link probability information indicating a distribution of the number of documents a user will access in said collection of linked documents;
spreading activation means for performing a spreading activation operation based on said initial activation information, page to page transition information and said probability information based on a network representation of said collection of linked documents; and
information extraction means for extracting said document access information resulting from said spreading activation step when a stable pattern of activation across all nodes of said network representation of said collection of linked documents is reached.
Dependent claims: 8
Specification