Clustering system for optical character reader
First Claim
1. A clustering system for classifying data into a predetermined number of classes, comprising:
first memory means for storing feature vectors of said data;
second memory means for storing said predetermined number and representative vectors of said classes, said representative vectors being previously provided;
cosine calculating means for calculating the cosine of each of said feature vectors stored in said first memory means and each of said representative vectors stored in said second memory means;
classification means for assigning each of said feature vectors into one of said classes which indicates the largest cosine value;
third memory means for storing said largest cosine value for each of said feature vectors;
total sum vector calculating means for calculating, for each of said classes, a total sum vector from feature vectors of one of said classes, using a weight which is the largest cosine value for each of said feature vectors stored in said third memory means, and for storing said weighted total sum vector into said second means, thereby updating the contents of said second memory means; and
convergence judging means for controlling said cosine calculating means, classification means and total sum vector calculating means to operate repeatedly, until no feature vectors are exchanged between the classes as a result of the classification by said classification means.
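The claim elements can be read as one iteration of an update rule. The notation below is introduced here for illustration only and does not appear in the patent: $x_i$ is a feature vector, $r_k$ the representative vector of class $k$, $c(i)$ the class assigned to $x_i$, and $w_i$ the largest cosine value stored for $x_i$. The cosine is assumed to be the usual normalized inner product.

```latex
\begin{align}
\cos(x_i, r_k) &= \frac{x_i \cdot r_k}{\lVert x_i \rVert \, \lVert r_k \rVert}
  && \text{(cosine calculating means)} \\
c(i) &= \arg\max_{k} \, \cos(x_i, r_k)
  && \text{(classification means)} \\
w_i &= \cos\bigl(x_i, r_{c(i)}\bigr)
  && \text{(third memory means)} \\
r_k &\leftarrow \sum_{i \,:\, c(i) = k} w_i \, x_i
  && \text{(total sum vector calculating means)}
\end{align}
```

These steps repeat until $c(i)$ is unchanged for every $i$. Because the cosine does not depend on the length of $r_k$, the unnormalized weighted sum can serve as the new representative vector without affecting the next classification.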
Abstract
In a clustering system for an optical character reader, feature vectors of the character image data to be classified are stored in a first memory. The number of classes and previously provided representative vectors for the respective classes are stored in a second memory. A cosine calculating unit calculates the cosine between each feature vector stored in the first memory and each representative vector stored in the second memory. Each feature vector is then assigned to the class whose representative vector yields the largest cosine value, and that largest cosine value is stored in a third memory. For each class, a total sum vector calculating unit calculates a total sum vector from the feature vectors belonging to that class, using the largest cosine values stored in the third memory as weights. The resulting weighted total sum vector is stored in the second memory as the new representative vector for the class, thereby updating the contents of the second memory. The cosine calculating unit and the total sum vector calculating unit are controlled to operate repeatedly until a convergence judging unit judges that no feature vectors are exchanged between the classes during the classification.
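The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `cosine` and `cluster`, the `max_iter` safety cap, and the handling of zero vectors and empty classes are all choices made here and are not specified in the source.

```python
import math


def cosine(u, v):
    """Cosine of the angle between u and v (assumed 0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(features, representatives, max_iter=100):
    """Cosine-weighted clustering into len(representatives) classes.

    features        -- list of feature vectors ("first memory")
    representatives -- initial representative vectors ("second memory")
    max_iter        -- hypothetical safety cap, not part of the source
    Returns (labels, representatives).
    """
    reps = [list(r) for r in representatives]
    dim = len(reps[0])
    labels = [None] * len(features)
    weights = [0.0] * len(features)  # largest cosine values ("third memory")

    for _ in range(max_iter):
        # Classification: assign each vector to the class with the largest cosine.
        new_labels = []
        for i, x in enumerate(features):
            cosines = [cosine(x, r) for r in reps]
            best = max(range(len(reps)), key=cosines.__getitem__)
            new_labels.append(best)
            weights[i] = cosines[best]

        # Update: the weighted total sum vector becomes the new representative.
        for k in range(len(reps)):
            members = [i for i, lab in enumerate(new_labels) if lab == k]
            if not members:
                continue  # assumption: an empty class keeps its old vector
            reps[k] = [
                sum(weights[i] * features[i][d] for i in members)
                for d in range(dim)
            ]

        # Convergence: stop once no vector has changed class.
        if new_labels == labels:
            break
        labels = new_labels

    return labels, reps


if __name__ == "__main__":
    data = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    seeds = [[1.0, 0.2], [0.2, 1.0]]
    labels, reps = cluster(data, seeds)
    print(labels)
```

Weighting each member by its cosine means vectors that match the current representative poorly contribute less to the next one, which is the distinguishing step compared with an unweighted mean.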
4 Claims
2. An optical character reader comprising a clustering system for classifying character image data into a predetermined number of font classes, said clustering system comprising:
first memory means for storing feature vectors of said character image data;
second memory means for storing said predetermined number and representative vectors of said font classes, said representative vectors being previously provided;
cosine calculating means for calculating the cosine of each of said feature vectors stored in said first memory means and each of said representative vectors stored in said second memory means;
classification means for assigning each of said feature vectors into one of said classes which indicates the largest cosine value;
third memory means for storing said largest cosine value for each of said feature vectors;
total sum vector calculating means for calculating, for each of said classes, a total sum vector from feature vectors of one of said classes, using a weight which is the largest cosine value for each of said feature vectors stored in said third memory means, and for storing said weighted total sum vector into said second means, thereby updating the contents of said second memory means; and
convergence judging means for controlling said cosine calculating means, classification means and total sum vector calculating means to operate repeatedly, until no feature vectors are exchanged between the classes as a result of the classification by said classification means.
Dependent claims: 3, 4
Specification