Heuristic method of classification
Abstract
The invention concerns heuristic algorithms for the classification of Objects. A first learning algorithm comprises a genetic algorithm that is used to abstract a data stream associated with each Object and a pattern recognition algorithm that is used to classify the Objects and measure the fitness of the chromosomes of the genetic algorithm. The learning algorithm is applied to a training data set. The learning algorithm generates a classifying algorithm, which is used to classify or categorize unknown Objects. The invention is useful for classifying texts and medical samples, for predicting the behavior of one financial market based on price changes in others, and for monitoring the state of complex process facilities to detect impending failures.
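The interplay described in the abstract, where a genetic algorithm picks which positions of each data stream to keep and a pattern recognition step scores that choice, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names (`fitness`, `evolve`), the nearest-centroid rule standing in for the pattern recognition algorithm, the crossover and mutation scheme, and the toy data are hypothetical and are not taken from the specification.

```python
import random

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fitness(chromosome, streams, labels):
    """Fraction of training Objects classified correctly by a nearest-centroid
    rule (an assumed stand-in for the pattern recognition algorithm) when each
    data stream is abstracted to the positions listed in `chromosome`."""
    vectors = [[s[i] for i in chromosome] for s in streams]
    cents = {
        lab: centroid([v for v, l in zip(vectors, labels) if l == lab])
        for lab in sorted(set(labels))
    }
    hits = sum(
        min(cents, key=lambda c: sq_dist(v, cents[c])) == lab
        for v, lab in zip(vectors, labels)
    )
    return hits / len(labels)

def evolve(streams, labels, n_genes=2, pop_size=12, generations=20, seed=0):
    """Toy genetic algorithm: a chromosome is a sorted tuple of stream positions.
    Requires pop_size >= 4 and n_genes <= len(streams[0])."""
    rng = random.Random(seed)
    length = len(streams[0])
    pop = [tuple(sorted(rng.sample(range(length), n_genes)))
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: fitness(c, streams, labels),
                        reverse=True)
        parents = ranked[: pop_size // 2]          # fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: front part of one parent, back part of the other.
            genes = set(a[: n_genes // 2]) | set(b[n_genes // 2:])
            if rng.random() < 0.3:                 # mutation: drop one gene
                genes.discard(rng.choice(sorted(genes)))
            while len(genes) < n_genes:            # refill with random positions
                genes.add(rng.randrange(length))
            children.append(tuple(sorted(genes)))
        pop = parents + children
    return max(pop, key=lambda c: fitness(c, streams, labels))

# Hypothetical training set: only position 3 separates the two states cleanly.
streams = [
    [0.9, 0.1, 0.5, 1.0],
    [0.2, 0.8, 0.4, 1.2],
    [0.8, 0.2, 0.5, 3.0],
    [0.1, 0.9, 0.6, 3.1],
]
labels = ["first", "first", "second", "second"]
best = evolve(streams, labels)
print(best, fitness(best, streams, labels))
```

The surviving chromosome, together with the centroids computed from it, plays the role of the generated classifying algorithm in this sketch; a real implementation would score fitness on held-out data rather than on the training set itself.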
32 Claims
1. A method for creating a model for classifying a biological sample as being of a first state or a second state different than the first state, comprising:
obtaining a data string derived from each biological sample of a set known to be of the first state and a set known to be of the second state;
selecting data elements from each data string using an evolutionary algorithm;
determining the locations of a first set of vectors and a second set of vectors in a vector space, each vector of the first set of vectors corresponding to data elements derived from a biological sample known to be of the first state, each vector of the second set of vectors corresponding to data elements derived from a biological sample known to be of the second state; and
identifying a model acceptable for classifying biological samples containing at least one cluster disposed within the vector space, the cluster containing at least one of the vectors of the first set of vectors and being associated with the first state for purposes of classifying a biological sample.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
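The last two steps of claim 1, locating vectors in a vector space and identifying a cluster associated with a state, can be pictured with a short sketch. It assumes fixed-radius clusters built greedily and labelled by majority vote; the names `build_clusters` and `classify`, the radius value, and the sample vectors are hypothetical, and the claim itself does not prescribe this clustering rule.

```python
import math
from collections import Counter

def build_clusters(vectors, states, radius):
    """Greedy fixed-radius clustering (an assumed, simplified rule): each vector
    joins the first cluster whose center lies within `radius`, otherwise it
    founds a new cluster. Each cluster is then associated with the state that
    most of its member vectors came from."""
    clusters = []
    for v, s in zip(vectors, states):
        for c in clusters:
            if math.dist(v, c["center"]) <= radius:
                c["members"].append(v)
                c["states"].append(s)
                n = len(c["members"])
                # Recompute the center as the mean of all members.
                c["center"] = [sum(col) / n for col in zip(*c["members"])]
                break
        else:
            clusters.append({"center": list(v), "members": [v], "states": [s]})
    for c in clusters:
        c["state"] = Counter(c["states"]).most_common(1)[0][0]
    return clusters

def classify(vector, clusters, radius):
    """Return the state of the nearest cluster if the vector falls inside it,
    or None when the vector lies outside every cluster."""
    nearest = min(clusters, key=lambda c: math.dist(vector, c["center"]))
    if math.dist(vector, nearest["center"]) <= radius:
        return nearest["state"]
    return None

# Hypothetical vectors already reduced to two selected data elements each.
vectors = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
states = ["first", "first", "second", "second"]
model = build_clusters(vectors, states, radius=1.0)
print(classify((0.1, 0.1), model, radius=1.0))
```

Returning `None` for a vector that falls outside every cluster is a design choice of this sketch: it separates "unlike anything in the training set" from a forced assignment to the nearest state.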
9. A method of creating a classifying pattern for objects using a plurality of data strings, each data string associated with one of a plurality of objects to be classified, comprising:
selecting a set of data elements from each data string using a learning algorithm;
classifying the set of data elements using a classifying algorithm; and
repeating the selecting and classifying with a different set of data elements selected from each data string until a classifying pattern is created that is acceptable to classify the objects.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
29. A method of constructing a model configured to classify objects as being of one of at least a first state and a second state different than the first state, comprising:
receiving a plurality of data strings, each data string being derived from an object known to be of the first state or the second state;
selecting a first set of variables that correspond with data in each of the plurality of data strings;
calculating a vector for each of the plurality of data strings using the first set of variables;
finding a location in a first vector space of each of at least two data clusters that best fit the vectors calculated using the first set of variables;
providing the locations in the first vector space of the at least two data clusters;
determining a variability for the at least two data clusters that best fit the vectors calculated using the first set of variables;
determining whether the variability of the at least two data clusters that best fit the vectors calculated using the first set of variables is within an acceptable tolerance;
if it is determined that the variability of the at least two data clusters that best fit the vectors calculated using the first set of variables is not within the acceptable tolerance,
using an evolutionary algorithm to select a second set of variables different than the first set of variables,
calculating a vector for each of the plurality of data strings using the second set of variables,
finding a location in a second vector space of each of at least two data clusters that best fit the vectors calculated using the second set of variables,
determining a variability for the at least two data clusters that best fit the vectors calculated using the second set of variables,
determining whether the variability for the at least two data clusters that best fit the vectors calculated using the second set of variables is within the acceptable tolerance, and
if it is determined that the variability of the at least two data clusters that best fit the vectors calculated using the second set of variables is within the acceptable tolerance, providing the locations in the second vector space of the at least two data clusters that best fit the vectors calculated using the second set of variables.
Dependent claims: 30, 31, 32
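The tolerance test in claim 29 amounts to a loop: compute cluster locations for a set of variables, measure their variability, and select a different set of variables when the variability is too large. A minimal sketch follows, under several assumptions that are not in the claim: per-state centroids stand in for the "best fit" clusters, variability is the mean squared distance of each vector to its cluster location, and a single-point mutation with acceptance of non-worse candidates stands in for the evolutionary algorithm. All function names and the data are hypothetical.

```python
import random

def vectors_for(data_strings, variables):
    """Calculate one vector per data string from the selected variables."""
    return [[d[i] for i in variables] for d in data_strings]

def cluster_locations(vectors, states):
    """Assumed stand-in for best-fit clusters: one centroid per known state."""
    locs = {}
    for s in sorted(set(states)):
        members = [v for v, t in zip(vectors, states) if t == s]
        locs[s] = [sum(col) / len(members) for col in zip(*members)]
    return locs

def variability(vectors, states, locs):
    """Mean squared distance of each vector to its own cluster location."""
    total = sum(
        sum((x - c) ** 2 for x, c in zip(v, locs[s]))
        for v, s in zip(vectors, states)
    )
    return total / len(vectors)

def build_model(data_strings, states, first_variables, tolerance,
                max_rounds=200, seed=0):
    """Return (variables, cluster locations) once variability is within
    tolerance, or None if no acceptable set is found in max_rounds."""
    rng = random.Random(seed)
    length = len(data_strings[0])

    def score(vs):
        vecs = vectors_for(data_strings, vs)
        locs = cluster_locations(vecs, states)
        return variability(vecs, states, locs), locs

    variables = tuple(first_variables)
    var, locs = score(variables)
    for _ in range(max_rounds):
        if var <= tolerance:
            break
        unused = [i for i in range(length) if i not in variables]
        if not unused:
            break
        # Mutation: swap one selected variable for an unused one.
        candidate = list(variables)
        candidate[rng.randrange(len(candidate))] = rng.choice(unused)
        cand_var, cand_locs = score(candidate)
        if cand_var <= var:                      # keep non-worse candidates
            variables, var, locs = tuple(candidate), cand_var, cand_locs
    return (variables, locs) if var <= tolerance else None

# Hypothetical data strings: position 0 is noisy, positions 1 and 2 are tight.
data_strings = [
    [0.0, 5.0, 1.0],
    [4.0, 5.1, 1.1],
    [1.0, 9.0, 3.0],
    [5.0, 9.1, 3.1],
]
states = ["first", "first", "second", "second"]
result = build_model(data_strings, states, first_variables=(0,), tolerance=0.01)
print(result)
```

Low variability alone does not guarantee that the clusters for different states are well separated, so a fuller implementation would likely combine this check with a classification-accuracy measure.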
Specification