Automatic clustering method
First Claim
1. An automatic pattern recognition method to determine a category of input data points of unknown category, comprising:
performing a first learning step of dividing sample data points into classes, which includes
    generating sample data points whose categories are known,
    generating a plurality of standard patterns in an n-dimensional space at arbitrary positions near a center of sample data points distribution, which standard patterns correspond to an arbitrary plurality of classes of the sample data points,
    calculating the distances between individual sample data points and each of the standard patterns to determine the nearest standard pattern for each sample data point,
    temporarily classifying each sample data point as belonging to the class corresponding to the nearest standard pattern,
    calculating the summation of differences between each standard pattern and the corresponding sample data points for each dimension for each class,
    moving the standard patterns in a direction represented by the summation of differences for each class,
    temporarily classifying each sample data point as belonging to the class corresponding to the nearest moved standard pattern,
    recalculating the distances between each sample data point and each moved standard pattern to determine the nearest standard pattern for each sample data point,
    moving the standard patterns in a direction represented by the summation of differences,
    repeating the preceding three steps until the summation of differences between each standard pattern and the corresponding sample data points is smaller in each dimension than a set specific value, and
    determining a final position of each standard pattern and sample data points belonging to the class represented by the corresponding standard pattern;
performing a second learning step, when the sample data points belonging to one class do not belong to the same category, which includes
    dividing the one class into a plurality of subclasses, and
    repeating said step of dividing for the sample data points for each remaining class and subclass that has sample data points of more than one category;
performing a third learning step of relating the standard patterns, classes and subclasses obtained in the first learning step and second learning step to each other in a tree-structure representation and storing the tree-structure representation in memory;
inputting data points of unknown category; and
determining recognition/nonrecognition of the input data points of unknown category based on correspondence/lack of correspondence between the input data points and the stored tree-structure representation.
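As a rough illustration of the first learning step recited above, the following Python sketch moves each standard pattern along the per-dimension summation of differences until every sum falls below a threshold. The function name `first_learning_step`, the parameters `rate`, `eps`, `max_iter` and `init_patterns`, the use of Euclidean distance, and the scaling of each move by `rate` divided by the class size are all assumptions made for this example. The claim fixes only the direction of the move, not its size.

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two points of equal dimension (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(point, patterns):
    """Index of the standard pattern closest to the given point."""
    return min(range(len(patterns)), key=lambda k: distance(point, patterns[k]))


def first_learning_step(samples, n_classes, init_patterns=None,
                        rate=0.5, eps=1e-6, max_iter=1000, seed=0):
    """Hypothetical sketch: returns (final standard patterns, class label per sample)."""
    dim = len(samples[0])
    if init_patterns is None:
        # Arbitrary starting positions near the center of the sample distribution.
        rng = random.Random(seed)
        center = [sum(p[d] for p in samples) / len(samples) for d in range(dim)]
        patterns = [[c + rng.uniform(-0.1, 0.1) for c in center]
                    for _ in range(n_classes)]
    else:
        patterns = [list(p) for p in init_patterns]

    for _ in range(max_iter):
        # Temporarily classify each sample by its nearest standard pattern.
        labels = [nearest(p, patterns) for p in samples]
        converged = True
        for k in range(n_classes):
            members = [p for p, lab in zip(samples, labels) if lab == k]
            if not members:
                continue  # an empty class keeps its pattern where it is
            # Summation of differences, per dimension, for this class.
            diff = [sum(p[d] - patterns[k][d] for p in members) for d in range(dim)]
            if any(abs(x) >= eps for x in diff):
                converged = False
                # Assumed step size: a fraction of the mean difference.
                patterns[k] = [patterns[k][d] + rate * diff[d] / len(members)
                               for d in range(dim)]
        if converged:
            break

    labels = [nearest(p, patterns) for p in samples]
    return patterns, labels
```

With two well-separated groups of points and two starting patterns placed close to the overall center, the patterns in this sketch drift apart until each one sits at the mean of its own group.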
Abstract
An automatic pattern recognition method has a short processing time, can be applied to nonlinear separation problems, and can perform similarity calculations. The method divides a plurality of sample data of known categories into a plurality of classes; when the sample data in a divided class are not all in the same category, repeats dividing the sample data into subclasses until the sample data in each subclass have only one category; expresses the relationship between classes and subclasses in a tree-structure representation and determines the standard pattern for each class and subclass from the sample data contained there; and finds which of the tree-structured classes is nearest to input data of unknown category by calculating the distance to the standard pattern of each class, and then, when the class has subclasses, performs a similar check until the lowest-level subclass is reached, to determine the subclass the input data is closest to. The category of the lowest-level subclass is taken as the category of the input data.
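The tree construction and the top-down nearest-pattern search described in the abstract might be sketched in Python as follows. This is only an illustration under stated assumptions: `split` is a simplified stand-in for the first learning step (a plain mean-update loop started near the center), each node's standard pattern is taken to be the mean of its samples, and the names `build`, `classify`, `max_depth`, the dictionary layout, and the majority-category fallback are hypothetical.

```python
from collections import Counter


def dist2(p, q):
    """Squared Euclidean distance (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def split(points, k=2, iters=50):
    """Simplified stand-in for the first learning step: returns a class index per point."""
    c = centroid(points)
    # Deterministic starting patterns placed very close to the center (assumption).
    patterns = [[x + 1e-3 * (j - (k - 1) / 2) for x in c] for j in range(k)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, patterns[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                patterns[j] = centroid(members)
    return [min(range(k), key=lambda j: dist2(p, patterns[j])) for p in points]


def build(points, cats, k=2, depth=0, max_depth=20):
    """Recursively divide until every leaf holds samples of one category."""
    node = {
        "pattern": centroid(points),
        "category": Counter(cats).most_common(1)[0][0],
        "children": [],
    }
    if len(set(cats)) == 1 or depth >= max_depth:
        return node
    labels = split(points, k)
    groups = [[i for i, lab in enumerate(labels) if lab == j] for j in range(k)]
    groups = [g for g in groups if g]
    if len(groups) < 2:
        # The split could not separate the samples; fall back to the majority category.
        return node
    for g in groups:
        node["children"].append(
            build([points[i] for i in g], [cats[i] for i in g], k, depth + 1, max_depth)
        )
    return node


def classify(tree, x):
    """Descend to the nearest child at each level; return the leaf's category."""
    node = tree
    while node["children"]:
        node = min(node["children"], key=lambda ch: dist2(x, ch["pattern"]))
    return node["category"]


# Usage: categories A, B, A along a line cannot be separated by a single threshold.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
cats = ["A", "A", "B", "B", "A", "A"]
tree = build(points, cats)
print(classify(tree, (0.5,)), classify(tree, (12.0,)), classify(tree, (19.0,)))
```

The guard for a failed split and the depth limit are not in the source. They are there only so that the sketch terminates when identical points carry different categories.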
6 Claims
1. An automatic pattern recognition method to determine a category of input data points of unknown category, comprising:
performing a first learning step of dividing sample data points into classes, which includes
    generating sample data points whose categories are known,
    generating a plurality of standard patterns in an n-dimensional space at arbitrary positions near a center of sample data points distribution, which standard patterns correspond to an arbitrary plurality of classes of the sample data points,
    calculating the distances between individual sample data points and each of the standard patterns to determine the nearest standard pattern for each sample data point,
    temporarily classifying each sample data point as belonging to the class corresponding to the nearest standard pattern,
    calculating the summation of differences between each standard pattern and the corresponding sample data points for each dimension for each class,
    moving the standard patterns in a direction represented by the summation of differences for each class,
    temporarily classifying each sample data point as belonging to the class corresponding to the nearest moved standard pattern,
    recalculating the distances between each sample data point and each moved standard pattern to determine the nearest standard pattern for each sample data point,
    moving the standard patterns in a direction represented by the summation of differences,
    repeating the preceding three steps until the summation of differences between each standard pattern and the corresponding sample data points is smaller in each dimension than a set specific value, and
    determining a final position of each standard pattern and sample data points belonging to the class represented by the corresponding standard pattern;
performing a second learning step, when the sample data points belonging to one class do not belong to the same category, which includes
    dividing the one class into a plurality of subclasses, and
    repeating said step of dividing for the sample data points for each remaining class and subclass that has sample data points of more than one category;
performing a third learning step of relating the standard patterns, classes and subclasses obtained in the first learning step and second learning step to each other in a tree-structure representation and storing the tree-structure representation in memory;
inputting data points of unknown category; and
determining recognition/nonrecognition of the input data points of unknown category based on correspondence/lack of correspondence between the input data points and the stored tree-structure representation.
Dependent claims: 2, 3
4. An automatic pattern recognition method to determine a category of input data points of unknown category, comprising:
performing a first learning step of dividing sample data into classes, which includes
    generating sample data points whose categories are known,
    generating a plurality of standard patterns in an n-dimensional space at arbitrary positions near a center of sample data points distribution, which standard patterns correspond to an arbitrary plurality of classes of the sample data points,
    calculating the distances between individual sample data points and each of the standard patterns to determine the nearest standard pattern for each sample data point,
    temporarily classifying each sample data point as belonging to the class corresponding to the nearest standard pattern,
    adjusting the positions of the standard patterns if the difference between each standard pattern and center of distribution of the corresponding sample data points is not smaller in each dimension than a set specific value;
    repeating the preceding three steps until the difference between each standard pattern and center of distribution of the corresponding sample data points is smaller in each dimension than a set specific value, and
    determining a final position of each standard pattern and sample data points belonging to the class represented by the corresponding standard pattern;
performing a second learning step, when the sample data points belonging to one class do not belong to the same category, which includes
    dividing the one class into a plurality of subclasses, and
    repeating said step of dividing for the sample data points of each remaining class and subclass that has sample data points of more than one category;
performing a third learning step of relating the standard patterns, classes and subclasses obtained in the first learning step and second learning step to each other in a tree-structure representation and storing the tree-structure representation in memory;
inputting data points of unknown category; and
determining recognition/nonrecognition of the input data points of unknown category based on correspondence/lack of correspondence between the input data points and the stored tree-structure representation.
Dependent claims: 5, 6
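Claim 4 states its stopping test in terms of the difference between each standard pattern and the center of distribution of its class. One way to read the "adjusting" step is sketched below in Python. The claim does not say how a pattern is adjusted, so moving it directly onto the class center is an assumption, as are the function name `adjust_patterns`, the parameter `eps`, and the returned `moved` flag.

```python
def adjust_patterns(samples, labels, patterns, eps=1e-3):
    """Hypothetical sketch: move a pattern only if it is off-center in some dimension.

    Returns (new_patterns, moved), where moved is True if any pattern changed.
    """
    new_patterns = []
    moved = False
    for k, pattern in enumerate(patterns):
        members = [p for p, lab in zip(samples, labels) if lab == k]
        if not members:
            new_patterns.append(list(pattern))  # empty class: leave pattern alone
            continue
        # Center of distribution of the samples currently assigned to class k.
        center = [sum(p[d] for p in members) / len(members)
                  for d in range(len(pattern))]
        # Per-dimension test against the "set specific value".
        if any(abs(c - x) >= eps for c, x in zip(center, pattern)):
            new_patterns.append(center)  # assumed adjustment: jump to the center
            moved = True
        else:
            new_patterns.append(list(pattern))
    return new_patterns, moved
```

An outer loop would reclassify the samples and call this function again until `moved` comes back `False`, which corresponds to every pattern being within the threshold of its class center in each dimension.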
Specification