Data mining method and system using regression clustering
Abstract
A method and a system are provided which regressively cluster datapoints from a plurality of data sources without transferring data between the plurality of data sources. In addition, a method and a system are provided which mine data from a dataset by iteratively applying a regression algorithm and a K-Harmonic Means performance function on a set number of functions derived from the dataset.
30 Claims
1. A processor-based method, comprising:
selecting a set number of functions correlating variable parameters of a dataset; and
clustering the dataset by iteratively applying a regression algorithm and a K-Harmonic Means performance function on the set number of functions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
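
For orientation, a K-Harmonic Means performance function for regression clustering can be written as below. This is one formulation from the K-Harmonic Means literature and is an assumption here, since the claim itself gives no formula. The symbols are illustrative: $N$ datapoints $(x_i, y_i)$, $K$ functions $f_k$, and an exponent $p$.

```latex
\mathrm{Perf}\bigl(\{(x_i, y_i)\}_{i=1}^{N}, \{f_k\}_{k=1}^{K}\bigr)
  = \sum_{i=1}^{N} \frac{K}{\displaystyle\sum_{k=1}^{K} \frac{1}{\lVert f_k(x_i) - y_i \rVert^{p}}}
```

Each summand is the harmonic average of the $p$-th powers of the distances from one datapoint to the $K$ functions, so a datapoint that lies close to any single function contributes little to the total.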

9. A storage medium comprising program instructions executable by a processor for:
selecting a set number of functions correlating variable parameters of a dataset;
determining distances between datapoints of the dataset and values correlated with the set number of functions;
calculating harmonic averages of the distances;
regressing the set number of functions using datapoint probability and weighting factors associated with the determined distances;
repeating said determining and calculating for the regressed set of functions;
computing a change in harmonic averages for the set number of functions prior to and subsequent to said regressing; and
reiterating said regressing, repeating and computing upon determining the change in harmonic averages is greater than a predetermined value.
Dependent claims: 10, 11, 12, 13, 14
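
The steps recited in claim 9 can be sketched as a short loop. The following is a minimal illustration, not the patented implementation. It assumes straight-line functions y = a*x + b, a squared-error regression, and probability and weighting factors taken from one common K-Harmonic Means formulation with exponent p. The names `fit_weighted_line` and `rc_khm`, and the parameters `p`, `tol`, `max_iter` and `eps`, are hypothetical.

```python
def fit_weighted_line(xs, ys, ws):
    """Weighted least squares for y = a*x + b, solved in closed form."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = sw * sxx - sx * sx
    if abs(det) < 1e-12:
        return 0.0, sy / sw  # degenerate case: fall back to a constant fit
    return (sw * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det


def rc_khm(xs, ys, lines, p=2.0, tol=1e-9, max_iter=200, eps=1e-8):
    """Alternate distance/harmonic-average evaluation with weighted regression.

    lines is a list of (a, b) pairs; the formulas are assumptions, see lead-in.
    """
    lines = list(lines)
    prev = None
    for _ in range(max_iter):
        # Determine distances between each datapoint and each function
        # (floored at eps to avoid division by zero).
        d = [[max(abs(a * x + b - y), eps) for a, b in lines]
             for x, y in zip(xs, ys)]
        # Sum of harmonic averages of the p-th powers of the distances.
        perf = sum(len(lines) / sum(dk ** -p for dk in row) for row in d)
        # Stop once the change in harmonic averages is no longer
        # greater than the predetermined value tol.
        if prev is not None and abs(prev - perf) <= tol:
            break
        prev = perf
        # Regress every function with probability times weighting factor.
        new_lines = []
        for k in range(len(lines)):
            ws = []
            for row in d:
                s_p = sum(dk ** -p for dk in row)
                s_pq = sum(dk ** -(p + 2) for dk in row)
                prob = row[k] ** -(p + 2) / s_pq   # datapoint probability
                weight = s_pq / s_p ** 2           # weighting factor
                ws.append(prob * weight)
            new_lines.append(fit_weighted_line(xs, ys, ws))
        lines = new_lines
    return lines, perf
```

A possible call on points drawn from two lines, starting from a rough guess:

```python
xs = list(range(10)) * 2
ys = [2 * x + 1 for x in range(10)] + [-x + 5 for x in range(10)]
lines, perf = rc_khm(xs, ys, [(1.5, 0.5), (-0.5, 4.0)])
```

The soft weights let every datapoint influence every function, which is the property usually cited as making K-Harmonic Means less sensitive to initialization than hard assignment.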

15. A system, comprising:
an input port configured to receive data; and
a processor configured to:
regress functions correlating variable parameters of a set of the data;
cluster the functions using a K-Harmonic Mean performance function; and
repeat said regress and cluster sequentially.
Dependent claims: 16, 17

18. A system, comprising:
a plurality of data sources; and
a means for regressively clustering datapoints from the plurality of data sources without transferring data between the plurality of data sources.
Dependent claims: 19, 20, 21, 22, 23

24. A system, comprising:
a plurality of data sources each having a processor configured to access datapoints within the respective data source; and
a central station coupled to the plurality of data sources and comprising a processor, wherein the processors of the central station and plurality of data sources are collectively configured to mine the datapoints of the data sources as a whole without transferring all of the datapoints between the data sources and the central station.
Dependent claims: 25, 26, 27

28. A processor-based method for mining data, comprising:
independently applying a regression clustering algorithm to a plurality of distributed datasets;
developing matrices from probability and weighting factors computed from the regression clustering algorithm, wherein the matrices individually represent the distributed datasets without including all datapoints within the datasets;
determining global coefficient vectors from a composite of the matrices; and
multiplying functions correlating similar variable parameters of the distributed datasets by the global coefficient vectors.
Dependent claims: 29, 30
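
One way to picture the distributed method of claim 28 is with weighted normal-equation sums. In the sketch below, each data source reduces its datapoints to the entries of small matrices, and a central step adds those matrices and solves for the coefficients. This construction is an assumption made for illustration; the claim does not specify the matrices. The sketch again assumes line functions with basis (x, 1) and the same hypothetical K-Harmonic Means weights with exponent p. The names `local_summary`, `global_coefficients` and `predict` are hypothetical.

```python
def local_summary(xs, ys, lines, p=2.0, eps=1e-8):
    """Run at one data source.

    For each function k, return the entries (sxx, sx, sw, sxy, sy) of
    A_k = X^T W_k X and b_k = X^T W_k y for the basis (x, 1). Only these
    sums leave the site, not the datapoints themselves.
    """
    summary = [[0.0] * 5 for _ in lines]
    for x, y in zip(xs, ys):
        d = [max(abs(a * x + b - y), eps) for a, b in lines]
        s_p = sum(dk ** -p for dk in d)
        for k, dk in enumerate(d):
            w = dk ** -(p + 2) / s_p ** 2  # probability times weighting factor
            for j, term in enumerate((x * x, x, 1.0, x * y, y)):
                summary[k][j] += w * term
    return summary


def global_coefficients(summaries):
    """Run centrally: add the per-site matrices and solve each 2x2 system.

    Assumes the combined system is non-singular for every function.
    """
    coeffs = []
    for k in range(len(summaries[0])):
        sxx, sx, sw, sxy, sy = (sum(s[k][j] for s in summaries)
                                for j in range(5))
        det = sxx * sw - sx * sx
        coeffs.append(((sxy * sw - sx * sy) / det,
                       (sxx * sy - sx * sxy) / det))
    return coeffs


def predict(coeff, x):
    """Multiply the basis functions (x, 1) by a global coefficient vector."""
    a, b = coeff
    return a * x + b * 1.0
```

Because the sums are additive, combining the summaries of several sites gives the same coefficients (up to rounding) as running the regression step on the pooled data, which is why the datapoints never need to be transferred in this sketch.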
Specification