Theme-based system and method for classifying documents
First Claim
1. A method for classifying a document using a classification system comprising the steps of:
defining a plurality of classes;
identifying source documents of each of said plurality of classes;
generating a classification theme score for each of said classes from the source documents for each of the plurality of classes;
entering an unclassified document into the system;
generating an unclassified document theme score corresponding to said unclassified document;
classifying the unclassified document into one of said plurality of classes when the unclassified document theme score is substantially equal to the classification theme score;
reclassifying a plurality of classes into a plurality of new classes by:
identifying source documents for each of said plurality of new classes;
generating a respective plurality of new class theme scores for each of said plurality of new classes;
reclassifying documents within said plurality of classes into the plurality of new classes when a classified document theme score is substantially similar to one of the respective new class theme scores;
storing the reclassified documents in a document storage memory.
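The steps recited in claim 1 can be sketched in Python as follows. This is a minimal illustration only: the claim does not say how a theme score is computed or what "substantially equal" means, so the term-frequency vector, the cosine comparison, the 0.2 threshold, and every function name below are assumptions made for the example.

```python
import math
from collections import Counter

def theme_score(texts):
    # Assumed theme score: relative term frequencies over the given texts.
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

def cosine(a, b):
    # Assumed comparison between two theme scores (sparse vectors).
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def assign(text, class_scores, threshold=0.2):
    # "Substantially equal" is modeled as similarity >= threshold (assumption).
    doc = theme_score([text])
    best = max(class_scores, key=lambda c: cosine(doc, class_scores[c]))
    return best if cosine(doc, class_scores[best]) >= threshold else None

def reclassify(classified, new_sources, threshold=0.2):
    # Score the new classes from their source documents, then move every
    # already-classified document into the best matching new class.
    new_scores = {c: theme_score(docs) for c, docs in new_sources.items()}
    storage = {}  # stands in for the document storage memory
    for docs in classified.values():
        for text in docs:
            # Documents matching no new class are stored under the key None.
            storage.setdefault(assign(text, new_scores, threshold), []).append(text)
    return storage

classified = {"vehicles": ["diesel engine piston wear", "brake rotor caliper noise"]}
new_sources = {
    "engines": ["piston engine combustion diesel"],
    "brakes": ["brake caliper rotor pad"],
}
storage = reclassify(classified, new_sources)
```

In this toy run the single "vehicles" class is split so that the first document is stored under "engines" and the second under "brakes".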
Abstract
A classification system (10) having a controller (12), a document storage memory (14), and a document input (16) is used to classify documents (20). The controller (12) is programmed to generate a theme score for each of a plurality of predefined classes from the source documents of that class. A theme score is also generated for the unclassified document. The unclassified document theme score is compared with the theme scores for the various classes, and the unclassified document is classified into the class having the nearest theme score.
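The arrangement described in the abstract might be modeled roughly as below. The class name `ClassificationSystem`, the term-frequency theme score, and the Euclidean distance used to find the "nearest" score are all hypothetical choices for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

class ClassificationSystem:
    """Toy stand-in for system (10): controller logic plus storage memory (14)."""

    def __init__(self, source_documents):
        # One theme score per predefined class, built from its source documents.
        self.class_scores = {c: self._score(d) for c, d in source_documents.items()}
        # Document storage memory (14): classified documents grouped by class.
        self.storage = {c: [] for c in source_documents}

    @staticmethod
    def _score(texts):
        # Assumed theme score: relative term frequencies.
        counts = Counter(w for t in texts for w in t.lower().split())
        total = sum(counts.values())
        return {w: n / total for w, n in counts.items()}

    @staticmethod
    def _distance(a, b):
        # Assumed notion of "nearest": Euclidean distance between sparse vectors.
        keys = set(a) | set(b)
        return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

    def enter(self, text):
        # Document input (16): score the document and file it under the
        # class whose theme score is nearest.
        doc = self._score([text])
        nearest = min(self.class_scores,
                      key=lambda c: self._distance(doc, self.class_scores[c]))
        self.storage[nearest].append(text)
        return nearest

system = ClassificationSystem({
    "engines": ["piston cylinder combustion engine"],
    "brakes": ["brake pad rotor caliper"],
})
label = system.enter("brake rotor pad wear")  # "brakes"
```

Unlike a threshold test, picking the nearest score always assigns some class, even to a document that resembles none of the source documents.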
7 Claims
2. A method for classifying a document using a classification system comprising the steps of:
establishing a plurality of classes and a plurality of subclasses;
identifying source documents of each of said plurality of classes and said plurality of subclasses;
generating a classification theme score for each class and each subclass in response to the source documents;
entering an unclassified document into the system;
generating an unclassified theme score for the unclassified document; and
classifying the document into one of said plurality of classes when the unclassified document theme score is substantially equal to the classification theme score;
reclassifying a plurality of classes into a plurality of new classes by:
identifying source documents for each of said plurality of new classes;
generating respective new class theme scores;
reclassifying documents with said plurality of classes into the plurality of new classes when the classified document theme score is equal to one of the respective new class theme scores;
storing the reclassified documents in a document storage memory.
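One way to picture the class and subclass scoring of claim 2 is the two-stage sketch below. It assumes, purely for illustration, that a class's source documents are the union of its subclasses' source documents, that a theme score is a term-frequency vector, and that scores are compared by histogram overlap; the names `build_scores`, `overlap`, and `classify` are hypothetical.

```python
from collections import Counter

def theme_score(texts):
    # Assumed theme score: relative term frequencies.
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

def overlap(a, b):
    # Assumed similarity: shared probability mass of two theme scores.
    return sum(min(v, b.get(w, 0.0)) for w, v in a.items())

def build_scores(taxonomy):
    # taxonomy maps class -> {subclass -> list of source documents}.
    class_scores, subclass_scores = {}, {}
    for cls, subs in taxonomy.items():
        class_scores[cls] = theme_score([d for docs in subs.values() for d in docs])
        subclass_scores[cls] = {s: theme_score(docs) for s, docs in subs.items()}
    return class_scores, subclass_scores

def classify(text, class_scores, subclass_scores):
    # Stage 1: pick the best class. Stage 2: pick the best subclass within it.
    doc = theme_score([text])
    cls = max(class_scores, key=lambda c: overlap(doc, class_scores[c]))
    sub = max(subclass_scores[cls], key=lambda s: overlap(doc, subclass_scores[cls][s]))
    return cls, sub

taxonomy = {
    "powertrain": {"engines": ["piston cylinder combustion"],
                   "transmissions": ["gear clutch shaft"]},
    "chassis": {"brakes": ["brake rotor caliper"],
                "suspension": ["spring damper strut"]},
}
class_scores, subclass_scores = build_scores(taxonomy)
result = classify("clutch and gear noise", class_scores, subclass_scores)
# result is ("powertrain", "transmissions")
```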
3. A system for classifying documents comprising:
a document input for entering an unclassified document;
a document storage memory;
a controller coupled to said document input and document storage memory, said controller programmed to classify documents into a plurality of classes by identifying source documents of each of said plurality of classes, generating a classification theme score categorizing documents into said classes, generating an unclassified theme score for the unclassified document, and classifying the document into one of said plurality of classes when the unclassified document theme score is substantially similar to the classification theme score, said controller further programmed to identify source documents for each of a plurality of new classes;
generate respective new classification theme scores;
reclassify documents with said plurality of classes into the plurality of new classes when the classified document theme score is substantially similar to the new class theme score; and
store the reclassified documents in the document storage memory.
Dependent claims: 4, 5, 6, 7
Specification