Method and apparatus of metadata generation
Abstract
A method of generating metadata is provided including providing (401) a plurality of source texts (100), processing the plurality of source texts (100) to extract primary metadata in the form of a plurality of sets of words (104, 106), and comparing (407) each of the source texts (100) with each of the sets of words (104, 106). The method includes using a clustering program to extract the sets of words (104, 106) from the source texts (100). The step of comparing is carried out by Latent Semantic Analysis to compare the similarity of meaning of each source text (100) with each set of words (104, 106) obtained by the clustering program. The comparison obtains a measure of the extent to which each source text (100) is representative of a set of words (104, 106).
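The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the sets of words have already been produced by some clustering step, uses raw term counts with no weighting, computes a small truncated SVD by power iteration instead of calling a numerical library, and takes cosine similarity in the reduced space as the "measure" of representativeness. All function names (`lsa_basis`, `project`, `representativeness`) and the choice of `k=2` latent dimensions are hypothetical.

```python
import math
from collections import Counter

def bag(words, index):
    """Term-count vector over a fixed vocabulary; unknown words are ignored."""
    vec = [0.0] * len(index)
    for word, count in Counter(words).items():
        if word in index:
            vec[index[word]] = float(count)
    return vec

def top_eigenpairs(gram, k, iters=200):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration with deflation."""
    n = len(gram)
    g = [row[:] for row in gram]
    pairs = []
    for _component in range(k):
        v = [1.0 + 0.01 * i for i in range(n)]  # fixed start keeps the sketch deterministic
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(g[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return pairs

def lsa_basis(texts, k=2):
    """Build a term-document matrix and return (vocabulary index, k latent term axes)."""
    docs = [t.lower().split() for t in texts]
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    cols = [bag(d, index) for d in docs]  # one term vector per document
    n = len(cols)
    gram = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]
    basis = []
    for lam, v in top_eigenpairs(gram, k):
        if lam <= 1e-12:
            continue  # skip numerically empty dimensions
        sigma = math.sqrt(lam)
        # Left singular vector u = A v / sigma, expressed over the vocabulary.
        basis.append([sum(cols[j][t] * v[j] for j in range(n)) / sigma
                      for t in range(len(vocab))])
    return index, basis

def project(words, index, basis):
    """Map a bag of words into the latent space spanned by the basis."""
    q = bag(words, index)
    return [sum(u_t * q_t for u_t, q_t in zip(u, q)) for u in basis]

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0

def representativeness(texts, word_sets, k=2):
    """scores[i][j]: how representative source text i is of word set j."""
    index, basis = lsa_basis(texts, k)
    set_vecs = [project(sorted(ws), index, basis) for ws in word_sets]
    return [[cosine(project(t.lower().split(), index, basis), sv) for sv in set_vecs]
            for t in texts]

if __name__ == "__main__":
    texts = ["cat kitten pet cat", "kitten pet cat", "pet cat kitten purr",
             "bank loan money bank", "loan money bank interest"]
    word_sets = [{"cat", "kitten", "pet"}, {"bank", "loan", "money"}]
    for text, row in zip(texts, representativeness(texts, word_sets)):
        print(text, [round(score, 3) for score in row])
```

Each row of the result holds one score per set of words, so every source text is compared with every set, as the abstract describes. A production system would more likely use a tested SVD routine and a weighting scheme such as tf-idf; the power iteration here only keeps the sketch dependency-free.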
17 Claims
1. A method of generating metadata comprising the steps of:
providing a plurality of source texts;
processing the plurality of source texts to extract primary metadata in the form of a plurality of sets of words;
comparing a source text with each of the sets of words to obtain a measure of the extent to which the source text is representative of a set of words.
Dependent claims: 2-14, 16
15. An apparatus for generating metadata comprising:
means for providing a plurality of source texts;
means for processing the source texts to extract primary metadata in the form of a plurality of sets of words;
means for comparing a source text with each of the sets of words to obtain a measure of the extent to which the source text is representative of a set of words.
17. A computer program product stored on a computer readable storage medium, comprising computer readable program code means for performing the steps of:
providing a plurality of source texts;
processing the plurality of source texts to extract primary metadata in the form of a plurality of sets of words;
comparing a source text with each of the sets of words to obtain a measure of the extent to which the source text is representative of a set of words.
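The processing step recited in the independent claims, extracting primary metadata as sets of words, can be sketched as follows. The claims do not specify a clustering algorithm, so this is only an illustrative stand-in: a greedy single-pass clustering on Jaccard similarity, followed by keeping the words that occur in more than half of a cluster's texts. The stopword list, the thresholds, and the function names (`cluster_texts`, `extract_word_sets`) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "for", "on"}

def tokens(text):
    """Lower-cased, whitespace-split words with stopwords removed."""
    return {w for w in text.lower().split() if w not in STOPWORDS}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_texts(texts, threshold=0.3):
    """Greedy clustering: a text joins the first cluster whose seed text is similar enough."""
    clusters = []  # each cluster is a list of indices into texts
    for i, text in enumerate(texts):
        for members in clusters:
            if jaccard(tokens(text), tokens(texts[members[0]])) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # no similar cluster found, start a new one
    return clusters

def extract_word_sets(texts, clusters, min_share=0.5):
    """One set of words per cluster: words found in more than min_share of its texts."""
    sets = []
    for members in clusters:
        counts = Counter(w for i in members for w in tokens(texts[i]))
        sets.append({w for w, c in counts.items() if c / len(members) > min_share})
    return sets

if __name__ == "__main__":
    texts = ["the cat and the kitten purr", "a kitten is a young cat",
             "the bank approved the loan", "interest on the bank loan"]
    clusters = cluster_texts(texts)
    print(clusters)
    print(extract_word_sets(texts, clusters))
```

The resulting sets of words are the kind of input the comparing step expects: each source text can then be scored against every set to measure how representative it is.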
Specification