Methods of identifying biological patterns using multiple data sets
Abstract
Systems and methods are disclosed for enhancing knowledge discovery from data using multiple learning machines in general and multiple support vector machines in particular. Training data for a learning machine is pre-processed in order to add meaning thereto. Multiple support vector machines, each comprising distinct kernels, are trained with the pre-processed training data and are tested with test data that is pre-processed in the same manner. The test outputs from the multiple support vector machines are compared in order to determine which of the test outputs, if any, represents an optimal solution. Selection of one or more kernels may be adjusted, and one or more support vector machines may be retrained and retested. Optimal solutions based on distinct input data sets may be combined to form a new input data set to be input into one or more additional support vector machines. The methods, systems and devices of the present invention comprise use of support vector machines for the identification of patterns that are important for medical diagnosis, prognosis and treatment. Such patterns may be found in many different data sets. The present invention also comprises methods and compositions for the treatment and diagnosis of medical conditions.
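The abstract says that test data is pre-processed in the same manner as the training data, and the claims describe the pre-processing step as adding dimensionality to each data point. The following is a minimal sketch of that idea only. The derived values (mean and range of each point) and the names `add_dimensions`, `preprocess` and `TRANSFORMS` are hypothetical choices made for illustration; the text here does not say which dimensions are added.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]

# Hypothetical derived features; the source does not specify which are used.
TRANSFORMS: Sequence[Callable[[Point], float]] = (
    lambda p: sum(p) / len(p),  # mean of the measurements in the point
    lambda p: max(p) - min(p),  # range of the measurements in the point
)


def add_dimensions(point: Point, transforms=TRANSFORMS) -> List[float]:
    """Return the original coordinates followed by one derived value per transform."""
    return list(point) + [t(point) for t in transforms]


def preprocess(data_set: Sequence[Point], transforms=TRANSFORMS) -> List[List[float]]:
    """Apply the same dimension-adding step to every point in a data set."""
    return [add_dimensions(p, transforms) for p in data_set]


# Toy values, not real biological measurements.
train = [[1.0, 2.0, 3.0], [4.0, 4.0, 10.0]]
test = [[0.0, 2.0, 4.0]]

# Training and test data go through the identical function, so both end up
# with the same number of dimensions in the same order.
train_pp = preprocess(train)
test_pp = preprocess(test)
print(train_pp)
print(test_pp)
```

Routing both data sets through one function is a simple way to keep the test pre-processing identical to the training pre-processing.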
133 Citations
17 Claims
1. A method for enhancing knowledge discovery using multiple support vector machines comprising:
pre-processing a first training biological data set and a second training biological data set in order to add dimensionality to each of a plurality of training biological data points;
training one or more first support vector machines using the first pre-processed training biological data set, each of the first support vector machines comprising different kernels;
training one or more second support vector machines using the second pre-processed training data set, each of the second support vector machines comprising different kernels;
pre-processing a first test biological data set in the same manner as was the first training biological data set and pre-processing a second test biological data set in the same manner as was the second training biological data set;
testing each of the first trained support vector machines using the first pre-processed test biological data set and testing each of the second trained support vector machines using the second pre-processed test biological data set;
in response to receiving a first test output from each of the first trained support vector machines, comparing each of the first test outputs with each other to determine which if any of the first test outputs is a first optimal solution;
in response to receiving a second test output from each of the second trained support vector machines, comparing each of the second test outputs with each other to determine which if any of the second test outputs is a second optimal solution;
combining the first optimal solution with the second optimal solution to create a new input data set to be input into one or more additional support vector machines.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computer-readable medium with computer-executable instructions for performing a method for:
pre-processing a first training biological data set and a second training biological data set in order to add dimensionality to each of a plurality of training biological data points;
training one or more first support vector machines using the first pre-processed training biological data set, each of the first support vector machines comprising different kernels;
training one or more second support vector machines using the second pre-processed training data set, each of the second support vector machines comprising different kernels;
pre-processing a first test biological data set in the same manner as was the first training biological data set and pre-processing a second test biological data set in the same manner as was the second training biological data set;
testing each of the first trained support vector machines using the first pre-processed test biological data set and testing each of the second trained support vector machines using the second pre-processed test biological data set;
in response to receiving a first test output from each of the first trained support vector machines, comparing each of the first test outputs with each other to determine which if any of the first test outputs is a first optimal solution;
in response to receiving a second test output from each of the second trained support vector machines, comparing each of the second test outputs with each other to determine which if any of the second test outputs is a second optimal solution;
combining the first optimal solution with the second optimal solution to create a new input data set to be input into one or more additional support vector machines.
Dependent claims: 14, 15, 16, 17
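The training, testing, comparison and combination steps recited in claims 1 and 13 can be sketched roughly as follows, using scikit-learn's `SVC` as the support vector machine. This is an illustrative sketch under several assumptions, not the patented implementation: the data is synthetic rather than biological, the candidate kernels are limited to three, "optimal" is taken to mean highest test accuracy, each test output is represented by the SVM's decision values, and the pre-processing step is omitted for brevity. The helper `select_best_svm` is a hypothetical name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

KERNELS = ("linear", "poly", "rbf")  # assumed candidate kernels


def select_best_svm(X_train, y_train, X_test, y_test, kernels=KERNELS):
    """Train one SVM per kernel, test each, and return the best (kernel, model, accuracy)."""
    candidates = []
    for kernel in kernels:
        model = SVC(kernel=kernel).fit(X_train, y_train)
        candidates.append((kernel, model, model.score(X_test, y_test)))
    # Assumption: the "optimal solution" is the output with the highest test accuracy.
    return max(candidates, key=lambda c: c[2])


# Synthetic stand-in for two data sets that describe the same samples:
# columns 0-5 play the role of the first data set, columns 6-11 the second.
X, y = make_classification(n_samples=200, n_features=12, n_informative=6,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)
first, second = slice(0, 6), slice(6, 12)

# Train, test and compare the SVMs separately for each data set.
kernel_1, svm_1, acc_1 = select_best_svm(
    X_train[:, first], y_train, X_test[:, first], y_test)
kernel_2, svm_2, acc_2 = select_best_svm(
    X_train[:, second], y_train, X_test[:, second], y_test)

# Combine the two selected test outputs into a new two-column input data set.
new_input = np.column_stack([
    svm_1.decision_function(X_test[:, first]),
    svm_2.decision_function(X_test[:, second]),
])

# Input the new data set into an additional SVM.
additional_svm = SVC(kernel="linear").fit(new_input, y_test)

print("first data set: ", kernel_1, acc_1)
print("second data set:", kernel_2, acc_2)
print("new input shape:", new_input.shape)
```

How the additional support vector machine is trained and evaluated is not specified in the claims quoted here; the sketch simply fits it on the combined outputs, and a real evaluation would hold out separate data for that stage.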
Specification