Method and apparatus for finding a set of documents relevant to a focus set using citation analysis and spreading activation techniques
Abstract
A method and apparatus combining spreading activation and citation analysis techniques to find related documents in collections of linked documents. Spreading activation is an analysis technique that may be used to find documents relevant to a set of focus documents. Citation analysis is an analysis technique used to indicate a reference or link relationship amongst the documents in a collection of linked documents. The results of citation analysis are used in spreading activation as an indicator of the strength of association amongst the documents in the document collection. When spreading activation is performed, an indication of documents relevant to the set of focus documents, based on how documents are referenced or linked, is obtained.
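The spreading activation step described in the abstract can be illustrated with a minimal sketch. The update rule, the parameter names (`alpha`, `decay`, `tol`), and the sample strength matrix below are assumptions made for illustration and are not taken from the patent text. Here `alpha` is a hypothetical spread weight standing in for the link probability information, and `strength` stands in for the citation-derived strength of association.

```python
def spread_activation(strength, source, alpha=0.5, decay=0.5,
                      tol=1e-9, max_iter=1000):
    """Iterate activation over a document network until it settles.

    strength[i][j] : assumed association strength between documents i and j
    source[i]      : initial activation (non-zero for focus documents)
    alpha, decay   : hypothetical spread and decay weights (assumptions)
    """
    n = len(source)
    activation = list(source)
    for _ in range(max_iter):
        # Each node keeps part of its old activation, receives activation
        # from its neighbours, and is re-fed by the source pattern.
        updated = [
            source[i]
            + (1.0 - decay) * activation[i]
            + alpha * sum(strength[i][j] * activation[j] for j in range(n))
            for i in range(n)
        ]
        # Stop once no node changes by more than tol: a stable pattern.
        if max(abs(u - a) for u, a in zip(updated, activation)) < tol:
            return updated
        activation = updated
    raise RuntimeError("activation did not settle within max_iter")


# Illustrative three-document network; document 0 is the focus document.
example_strength = [
    [0.0, 0.6, 0.0],
    [0.6, 0.0, 0.2],
    [0.0, 0.2, 0.0],
]
example_result = spread_activation(example_strength, [1.0, 0.0, 0.0])
```

With these illustrative values the activation settles near 3.2, 2.0 and 0.4, so document 1 ranks as more relevant to the focus document than document 2. The iteration only settles when the combined decay and spread weights shrink activation at each pass; other weights may require normalizing the strength matrix first.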
7 Claims
1. A method for predicting documents relevant to a focus document within a collection of linked documents comprising the steps of:
a) generating initial activation information, said initial activation information indicating a set of focus documents in said collection of linked documents;
b) generating citation information from the documents in said collection of linked documents, said citation information indicating a strength of association between documents in said collection of linked documents;
c) generating link probability information from usage data for said collection of linked documents, said link probability information indicating a distribution of the number of documents a user will access in said collection of linked documents;
d) performing a spreading activation operation based on said initial activation information, citation information and said probability information based on a network representation of said collection of linked documents; and
e) extracting said document relevance information resulting from said spreading activation step when a stable pattern of activation across all nodes of said network representation of said collection of linked documents is reached.
4. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps predicting documents relevant to a focus document within a collection of linked documents, said method comprising the steps of:
a) generating initial activation information, said initial activation information indicating a set of focus documents in said collection of linked documents;
b) generating citation information from the documents in said collection of linked documents, said citation information indicating a strength of association between documents in said collection of linked documents;
c) generating link probability information from usage data for said collection of linked documents, said link probability information indicating a distribution of the number of documents a user will access in said collection of linked documents;
d) performing a spreading activation operation based on said initial activation information, citation information and said probability information based on a network representation of said collection of linked documents; and
f) extracting said document relevance information resulting from said spreading activation step when a stable pattern of activation across all nodes of said network representation of said collection of linked documents is reached.
5. A system for predicting documents relevant to a focus document in a linked collection of documents, said system comprising:
means for obtaining initial activation information, said initial activation information indicating a set of focus documents in said collection of linked documents;
means for obtaining raw data for said linked collection of documents, said raw data including citation data amongst said documents in said linked collection of documents;
means for creating a citation map for said linked collection of documents from said raw data;
means for generating link probability information from usage data for said collection of linked documents, said link probability information indicating a distribution of the number of documents a user will access in said collection of linked documents; and
means for predicting a set of documents relevant to said focus document using a spreading activation technique on a network representation of said collection of linked documents, said citation map and said link probability information.
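As an illustration of how a citation map might be derived from raw link data, the sketch below counts, for each pair of documents, how many other documents cite both of them. Using co-citation counts as the strength measure is an assumption made here for illustration; the claims only require a citation map created from the raw data. The function name `cocitation_map` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_map(links):
    """Count how often each pair of documents is cited together.

    links: dict mapping a citing document to the documents it cites.
    Returns a dict mapping (doc_a, doc_b), with doc_a < doc_b, to the
    number of documents that cite both.
    """
    strength = defaultdict(int)
    for cited in links.values():
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(cited)), 2):
            strength[(a, b)] += 1
    return dict(strength)


# Hypothetical raw data: three pages and the documents each one links to.
example_links = {
    "p1": ["a", "b"],
    "p2": ["a", "b", "c"],
    "p3": ["b", "c"],
}
example_map = cocitation_map(example_links)
```

For this sample data the pairs (a, b) and (b, c) are each cited together twice and (a, c) once. Such counts could then be arranged as a matrix and used as the strength of association in a spreading activation step.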
Specification