ORGANISING AND STORING DOCUMENTS
First Claim
1. A data handling device for organising and storing documents for subsequent retrieval, the documents having associated metadata terms, the device comprising:
means providing access to a store of existing metadata;
means operable to analyse the existing metadata to generate statistical data as to the co-occurrence of pairs of terms in the metadata of one and the same document;
means for analysing a fresh document to assign to it a set of terms and determine for each a measure of their strength of association with the document;
means operable to determine for each term of the set a score that is a monotonically increasing function of (a) the strength of association with the document and of (b) the relative frequency of co-occurrence, in the existing metadata, of that term and another term that occurs in the set;
means operable to select, as metadata for the fresh document, a subset of the terms in the set having the highest scores.
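The statistical analysis recited in the claim can be illustrated with a minimal sketch. It assumes the store of existing metadata is simply an iterable of per-document term lists, and it assumes one particular normalisation for "relative frequency of co-occurrence" (the fraction of documents carrying one term that also carry the other). The function names and that normalisation are illustrative assumptions, not taken from the claim.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(metadata_store):
    """Count terms and unordered term pairs across per-document metadata.

    `metadata_store` is assumed to be an iterable of term lists,
    one list per existing document (a hypothetical representation).
    """
    term_counts = Counter()
    pair_counts = Counter()
    for terms in metadata_store:
        unique = sorted(set(terms))
        term_counts.update(unique)
        # Each unordered pair is counted once per document containing both terms.
        pair_counts.update(combinations(unique, 2))
    return term_counts, pair_counts


def relative_cooccurrence(term, other, term_counts, pair_counts):
    """Fraction of documents tagged with `term` that are also tagged with `other`.

    This normalisation is an assumption made for illustration only.
    """
    if term_counts[term] == 0:
        return 0.0
    pair = tuple(sorted((term, other)))
    return pair_counts[pair] / term_counts[term]
```

Note that under this assumed normalisation the measure is asymmetric: a rare term that always appears alongside a common one scores 1.0 in one direction and less in the other.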
Abstract
A data handling device has access to a store of existing metadata pertaining to existing documents having associated metadata terms. It analyses the metadata to generate statistical data as to the co-occurrence of pairs of terms in the metadata of one and the same document. When a fresh document is received, it is analysed to assign to it a set of terms and determine for each a measure of their strength of association with the document. Then, for each term of the set, a score is generated that is a monotonically increasing function of (a) the strength of association with the document and of (b) the relative frequency of co-occurrence of that term and another term that occurs in the set; metadata for the fresh document are then selected as the subset of the terms in the set having the highest scores.
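The scoring and selection steps described in the abstract can be sketched as follows. This is a sketch under stated assumptions: the abstract only requires the score to increase monotonically with both the strength of association and the relative co-occurrence frequency, so the specific formula used here (strength multiplied by one plus the summed co-occurrence with the other candidate terms) is an illustrative choice. The names `score_terms`, `select_metadata` and the `cooccurrence` callback are hypothetical.

```python
def score_terms(strengths, cooccurrence):
    """Score each candidate term of a fresh document.

    `strengths` maps term -> non-negative strength of association.
    `cooccurrence(term, other)` returns a non-negative relative
    co-occurrence frequency derived from the existing metadata.
    The formula below is an assumed example of a function that is
    monotonically increasing in both quantities.
    """
    scores = {}
    for term, strength in strengths.items():
        support = sum(
            cooccurrence(term, other) for other in strengths if other != term
        )
        scores[term] = strength * (1.0 + support)
    return scores


def select_metadata(strengths, cooccurrence, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = score_terms(strengths, cooccurrence)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

With this choice, a term with a slightly lower strength of association can still be selected ahead of a stronger but isolated term, provided it frequently co-occurs in the existing metadata with other candidate terms for the same document.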
18 Claims
1. A data handling device for organising and storing documents for subsequent retrieval, the documents having associated metadata terms, the device comprising:
means providing access to a store of existing metadata;
means operable to analyse the existing metadata to generate statistical data as to the co-occurrence of pairs of terms in the metadata of one and the same document;
means for analysing a fresh document to assign to it a set of terms and determine for each a measure of their strength of association with the document;
means operable to determine for each term of the set a score that is a monotonically increasing function of (a) the strength of association with the document and of (b) the relative frequency of co-occurrence, in the existing metadata, of that term and another term that occurs in the set;
means operable to select, as metadata for the fresh document, a subset of the terms in the set having the highest scores.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method of organising and storing documents for subsequent retrieval, the documents having associated metadata terms, the method comprising:
providing access to a store of existing metadata;
analysing the existing metadata to generate statistical data as to the co-occurrence of pairs of terms in the metadata of one and the same document;
analysing a fresh document to assign to it a set of terms and determine for each a measure of their strength of association with the document;
determining for each term of the set a score that is a monotonically increasing function of (a) the strength of association with the document and of (b) the relative frequency of co-occurrence, in the existing metadata, of that term and another term that occurs in the set; and
selecting, as metadata for the fresh document, a subset of the terms in the set having the highest scores.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification