METHOD AND APPARATUS FOR CALCULATING TOPICAL CATEGORIZATION OF ELECTRONIC DOCUMENTS IN A COLLECTION
First Claim
1. A computer implemented method for calculating topical categorization of electronic documents in a collection, comprising:
processor application of a metric to categorize semantic distance between two sections of a document or between two documents;
said processor executing a topic algorithm using the categorization provided by said metric to automatically determine topic boundaries;
said processor extracting topics based upon said topic boundaries; and
said processor comparing said extracted topics for similarity with topics in other documents for organizational and research purposes.
Abstract
A computer implemented method calculates topical categorization of electronic documents in a collection. A processor applies a metric to categorize semantic distance between two sections of a document or between two documents. The processor executes a topic algorithm using the categorization provided by the metric to determine topic boundaries. Topics are extracted based upon the topic boundaries, and the extracted topics are compared for similarity with topics in other documents for organizational and research purposes.
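The four steps described in the abstract (distance metric, boundary detection, topic extraction, cross-document comparison) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the metric is plain cosine distance over word counts, a boundary is declared wherever adjacent sections exceed a fixed distance threshold, and the function names and threshold values (`semantic_distance`, `topic_boundaries`, `extract_topics`, `similar_topics`, 0.9, 0.5) are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a stand-in for whatever features a real metric uses."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def semantic_distance(a, b):
    """Cosine distance between two word-count vectors (assumed metric, not the patent's)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty section as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def topic_boundaries(sections, threshold=0.9):
    """Indices i where section i is far enough from section i-1 to start a new topic."""
    vectors = [bag_of_words(s) for s in sections]
    return [i for i in range(1, len(vectors))
            if semantic_distance(vectors[i - 1], vectors[i]) > threshold]


def extract_topics(sections, boundaries):
    """Merge the sections between consecutive boundaries into one vector per topic."""
    edges = [0] + list(boundaries) + [len(sections)]
    return [bag_of_words(" ".join(sections[start:end]))
            for start, end in zip(edges, edges[1:])]


def similar_topics(topics_a, topics_b, threshold=0.5):
    """Pairs (i, j) where topic i of one document is close to topic j of another."""
    return [(i, j)
            for i, ta in enumerate(topics_a)
            for j, tb in enumerate(topics_b)
            if semantic_distance(ta, tb) < threshold]


# Toy documents, each given as a list of sections (here, single sentences).
doc_a = [
    "The engine burns fuel to turn the crankshaft.",
    "Fuel injection controls how much fuel the engine burns.",
    "Bread dough needs yeast, flour and water.",
    "Knead the dough, then let the yeast work.",
]
doc_b = ["Yeast makes bread dough rise.", "Flour and water form the dough."]

bounds_a = topic_boundaries(doc_a)
topics_a = extract_topics(doc_a, bounds_a)
topics_b = extract_topics(doc_b, topic_boundaries(doc_b))
matches = similar_topics(topics_a, topics_b)
print(bounds_a, matches)
```

On these toy documents the sketch places one boundary in `doc_a`, between the engine sentences and the baking sentences, and pairs the baking topic with the single topic found in `doc_b`. A practical system would likely need stop-word handling and a less brittle metric than raw word overlap.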
28 Claims
1. A computer implemented method for calculating topical categorization of electronic documents in a collection, comprising:
processor application of a metric to categorize semantic distance between two sections of a document or between two documents;
said processor executing a topic algorithm using the categorization provided by said metric to automatically determine topic boundaries;
said processor extracting topics based upon said topic boundaries; and
said processor comparing said extracted topics for similarity with topics in other documents for organizational and research purposes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
25. An apparatus for calculating topical categorization of electronic documents in a collection, comprising:
a processor configured for applying a metric to categorize semantic distance between two sections of a document or between two documents;
said processor configured for executing a topic algorithm using the categorization provided by said metric to automatically determine topic boundaries;
said processor configured for extracting topics based upon said topic boundaries; and
said processor configured for comparing said extracted topics for similarity with topics in other documents for organizational and research purposes.
Dependent claims: 26, 27, 28
Specification