ORGANISING AND STORING DOCUMENTS
Abstract
A data handling device has access to a store of existing metadata pertaining to existing documents having associated metadata terms. It selects metadata assigned to documents deemed to be of interest to a user and analyses that metadata to generate statistical data on the co-occurrence of pairs of terms in the metadata of one and the same document. When a fresh document is received, it is analysed to assign to it a set of terms and to determine, for each term, a measure of its strength of association with the document. A score is then generated for the document, for each term of the set, the score being a monotonically increasing function of (a) the strength of association with the document and (b) the relative frequency of co-occurrence of that term and another term that occurs in the set. The score represents the relevance of the document to the user and can be used (following comparison with a threshold, or with the scores of other such documents) to determine whether the document is to be reported to the user and/or retrieved.
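By way of illustration only, the steps summarised above might be sketched in Python as follows. The function names are hypothetical, and two choices are assumptions rather than anything stated in this text: the relative frequency of co-occurrence v_hj is taken to be the fraction of selected documents containing term h that also contain term j, and the score for term h is taken to be the sum over the other terms j of n_j multiplied by v_hj, which is one simple function that increases monotonically with both quantities for non-negative inputs.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_frequencies(selected_metadata):
    """Build v[(h, j)] from the metadata of documents of interest.

    selected_metadata: iterable of term lists, one list per document.
    Assumption: v[(h, j)] = (documents with both h and j) / (documents with h).
    """
    term_counts = Counter()
    pair_counts = Counter()
    for terms in selected_metadata:
        unique = set(terms)  # count each term once per document
        term_counts.update(unique)
        pair_counts.update(permutations(unique, 2))  # ordered pairs (h, j), h != j
    return {(h, j): c / term_counts[h] for (h, j), c in pair_counts.items()}


def score_terms(strengths, v):
    """Score each term h of a fresh document.

    strengths: dict mapping term j to its strength of association n_j.
    Assumed form: score[h] = sum over j != h of n_j * v[(h, j)].
    """
    return {
        h: sum(n_j * v.get((h, j), 0.0) for j, n_j in strengths.items() if j != h)
        for h in strengths
    }


def is_relevant(scores, threshold):
    """Assumed decision rule: report the document if the summed term scores reach the threshold."""
    return sum(scores.values()) >= threshold


# Example: metadata of three documents of interest, then one fresh document.
v = cooccurrence_frequencies([["a", "b"], ["a", "b", "c"], ["a"]])
scores = score_terms({"a": 2.0, "b": 1.0}, v)
```

Summing the term scores into a single document score is only one option; the text equally allows ranking the fresh document against the scores of other documents instead of applying a fixed threshold.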
12 Claims
1. A method of organising documents, the documents having associated metadata terms, the method comprising:
- providing access to a store of existing metadata;
- selecting from the existing metadata items assigned to documents deemed to be of interest to a user and generating for each of one or more terms occurring in the selected metadata values indicative of the frequency of co-occurrence of that term with a respective other term in the metadata of one and the same document;
- analysing a fresh document to assign to it a set of terms and determine for each a measure (n_j) of their strength of association with the document; and
- determining, for the fresh document, for each term (h) of the set a score that is a monotonically increasing function of a) the strength of association (n_j) with the document and of b) the relative frequency of co-occurrence (v_hj), in the selected existing metadata, of that term and another term (j) that occurs in the set.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A data handling device for organising documents, the documents having associated metadata terms, the device comprising:
- means providing access to a store of existing metadata;
- means operable to select from the existing metadata items assigned to documents deemed to be of interest to a user and to generate for each of one or more terms occurring in the selected metadata values indicative of the frequency of co-occurrence of that term with a respective other term in the metadata of one and the same document;
- means for analysing a fresh document to assign to it a set of terms and determine for each a measure (n_j) of their strength of association with the document; and
- means operable to determine, for the fresh document, for each term (h) of the set a score that is a monotonically increasing function of a) the strength of association (n_j) with the document and of b) the relative frequency of co-occurrence (v_hj), in the selected existing metadata, of that term and another term (j) that occurs in the set.
Dependent claims: 9, 10, 11, 12
Specification