APPARATUS FOR CLASSIFYING OR DISAMBIGUATING DATA
First Claim
1. A computer processing apparatus for classifying a document, comprising:
means for accessing a database structure providing a plurality of different subject matter categories, the database containing a classified vocabulary consisting of terms in all of the different subject matter categories with each term being classified in accordance with the subject matter category structure of the database;
means for receiving in computer-readable form a text document to be classified;
processor means operable to compare terms appearing in the text document with the terms in the classified vocabulary and to determine from the comparison the category for the document; and
means for supplying a signal carrying data representing the text document and data associating the text document with the determined category.
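The steps recited in claim 1 (access a classified vocabulary, receive a text document, compare its terms with the vocabulary, and output the document together with its determined category) can be illustrated with a minimal Python sketch. The claim does not state how the category is determined from the comparison; counting matched terms per category and taking the most frequent one is an assumption made here for illustration. The vocabulary contents and the names `CLASSIFIED_VOCABULARY`, `classify_document` and `supply_record` are likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical classified vocabulary: each term is assigned to one
# subject matter category. The entries are illustrative only.
CLASSIFIED_VOCABULARY = {
    "artery": "medicine",
    "dosage": "medicine",
    "surgeon": "medicine",
    "plaintiff": "law",
    "statute": "law",
    "tort": "law",
}


def classify_document(text, vocabulary):
    """Compare the document's terms with the classified vocabulary.

    Returns (category, counts). The category is the one with the most
    matching terms (an assumed decision rule), or None if nothing matches.
    """
    terms = re.findall(r"[a-z]+", text.lower())
    counts = Counter(vocabulary[t] for t in terms if t in vocabulary)
    if not counts:
        return None, counts
    # Iterate categories in sorted order so ties resolve alphabetically.
    category = max(sorted(counts), key=counts.get)
    return category, counts


def supply_record(text, vocabulary):
    """Bundle the document with its determined category, standing in for
    the 'signal carrying data' of the final claim element."""
    category, _ = classify_document(text, vocabulary)
    return {"document": text, "category": category}


record = supply_record(
    "The surgeon adjusted the dosage before the plaintiff arrived.",
    CLASSIFIED_VOCABULARY,
)
print(record["category"])
```

A document with no vocabulary terms yields a category of `None` in this sketch; how the claimed apparatus treats that case is not stated in the text above.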
Abstract
A computing system has a data storage device (4, 5, 6) for storing a database consisting of a classified vocabulary of terms. A processor (1) of the apparatus is arranged to associate each term with one of a number of different categories of data and to associate all terms falling within the same category with a common code identifying a collocation of terms that exemplify that category so that terms in different categories are associated with different codes and can be disambiguated. The processor (1) is arranged to write, directly or indirectly, a classified vocabulary consisting of the terms together with the associated code onto a computer-readable storage medium (RDD2) or to supply an electrical signal via, for example, a MODEM (10) or a LAN/WAN (11). The database may be used in classification of documents, spelling checking of documents and refining of keyword search results.
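As a rough illustration of the classified vocabulary described in the abstract, the following Python sketch gives every term in a category the same code and then serializes the result so that it could be written to a storage medium or sent over a network. The code values (`F01`, `G01`), the JSON format and all function names are assumptions for this sketch and are not taken from the patent.

```python
import json

# Hypothetical codes, one per category; the format is an assumption.
CATEGORY_CODES = {"finance": "F01", "geography": "G01"}


def build_classified_vocabulary(terms_by_category, category_codes):
    """Associate every term in a category with that category's common code."""
    entries = []
    for category, terms in terms_by_category.items():
        code = category_codes[category]
        for term in terms:
            entries.append({"term": term, "code": code})
    return entries


def codes_for_term(entries, term):
    """Return the sorted codes under which a term appears. More than one
    code means the term is ambiguous between categories."""
    return sorted({e["code"] for e in entries if e["term"] == term})


def serialize_vocabulary(entries):
    """Serialize the vocabulary, e.g. for writing to a file or a socket."""
    return json.dumps(entries)


entries = build_classified_vocabulary(
    {"finance": ["bank", "loan"], "geography": ["bank", "river"]},
    CATEGORY_CODES,
)
print(codes_for_term(entries, "bank"))
print(codes_for_term(entries, "river"))
```

Here "bank" appears under two codes, so the code attached to a given occurrence is what distinguishes the financial sense from the geographical one.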
Specification