DATA MINING PLATFORM FOR BIOINFORMATICS AND OTHER KNOWLEDGE DISCOVERY
First Claim
1. A computer-implemented data mining platform for generating an output comprising knowledge from analysis of a plurality of heterogeneous data types, the platform comprising:
a plurality of modules, each module adapted for processing one data type of the plurality of heterogeneous data types, each module comprising an input data source, a data analysis engine, a data output and a web server connection for each of the input data source, the data analysis engine and the data output;
a web server connected to the web server connection for communicating with each of the input data source, the data analysis engine and the data output and for providing means for monitoring one or more of the input data source, the data analysis engine, and the data output; and
a combined data analysis engine in communication with the web server for combining the data output from the plurality of modules to generate a single output representing results obtained from analyzing the plurality of heterogeneous data types.
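The arrangement recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `Module`, `Hub` and `combine` are hypothetical, an in-process registry stands in for the web server, the gene identifiers and scores are made-up sample values, and averaging scores across modules is only one assumed way a combined data analysis engine might merge outputs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Module:
    """One module: input data source -> data analysis engine -> data output."""
    data_type: str
    input_source: Callable[[], List[Any]]                      # hypothetical: returns raw records
    analysis_engine: Callable[[List[Any]], Dict[str, float]]   # hypothetical: scores the records
    output: Dict[str, float] = field(default_factory=dict)
    status: str = "idle"

    def run(self) -> None:
        self.status = "running"
        self.output = self.analysis_engine(self.input_source())
        self.status = "done"


class Hub:
    """In-process stand-in (assumption) for the web server that reaches every module."""

    def __init__(self) -> None:
        self.modules: Dict[str, Module] = {}

    def register(self, module: Module) -> None:
        self.modules[module.data_type] = module

    def monitor(self) -> Dict[str, str]:
        # Report the current status of each registered module.
        return {name: m.status for name, m in self.modules.items()}

    def outputs(self) -> Dict[str, Dict[str, float]]:
        return {name: m.output for name, m in self.modules.items()}


def combine(outputs: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Assumed combination rule: average each key's score over the modules that scored it."""
    collected: Dict[str, List[float]] = {}
    for scores in outputs.values():
        for key, value in scores.items():
            collected.setdefault(key, []).append(value)
    return {key: sum(vals) / len(vals) for key, vals in collected.items()}


# Usage with made-up sample data for two heterogeneous data types.
hub = Hub()
hub.register(Module("microarray", lambda: [("GENE_A", 1.0), ("GENE_B", 0.25)], dict))
hub.register(Module("literature", lambda: [("GENE_A", 0.5)], dict))
for module in hub.modules.values():
    module.run()
combined = combine(hub.outputs())
```

Here each module keeps its own data type and engine, and only the hub sees all outputs, which mirrors the separation between per-type modules and the single combined output in the claim.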
Abstract
The data mining platform comprises a plurality of system modules, each formed from a plurality of components. Each module has an input data component, a data analysis engine for processing the input data, an output data component for outputting the results of the data analysis, and a web server to access and monitor the other modules within the unit and to provide communication to other units. Each module processes a different type of data, for example, a first module processes microarray (gene expression) data while a second module processes biomedical literature on the Internet for information supporting relationships between genes and diseases and gene functionality. In the preferred embodiment, the data analysis engine is a kernel-based learning machine, and in particular, one or more support vector machines (SVMs). The data analysis engine includes a pre-processing function for feature selection, for reducing the amount of data to be processed by selecting the optimum number of attributes, or “features”, relevant to the information to be discovered.
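The feature-selection pre-processing step mentioned in the abstract can be sketched as follows. This is an assumption-laden illustration, not the method of the specification: it swaps in a simple filter that ranks each feature by the separation of its class means relative to the class spreads, and the number of features to keep is passed in by hand rather than optimized. The function names `rank_features` and `select_features` and the sample values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def rank_features(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[int]:
    """Return feature indices, best first, scored by
    |mean_pos - mean_neg| / (std_pos + std_neg). Labels are +1 / -1."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        spread = (pstdev(p) + pstdev(n)) or 1e-12  # guard against zero spread
        scores.append(abs(mean(p) - mean(n)) / spread)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def select_features(X: Sequence[Sequence[float]], keep: Sequence[int]) -> List[List[float]]:
    """Keep only the chosen feature columns, in the given order."""
    return [[row[j] for j in keep] for row in X]


# Made-up sample: 4 samples, 3 features; only feature 0 separates the classes well.
X = [
    [5.0, 1.0, 0.2],
    [5.2, 0.9, 0.8],
    [1.0, 1.1, 0.3],
    [1.2, 1.0, 0.7],
]
y = [1, 1, -1, -1]

top = rank_features(X, y)[:2]     # number kept (2) is an arbitrary choice here
reduced = select_features(X, top)
```

The reduced matrix would then be handed to the data analysis engine, for example a support vector machine, so that the learner sees fewer and more relevant attributes.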
86 Citations
9 Claims
Specification