Computer-Implemented Systems And Methods For Variable Clustering In Large Data Sets
Abstract
Computer-implemented systems and methods are provided for creating a cluster structure from a data set containing input variables. Global clusters are created within a first stage by computing a similarity matrix from the data set. A global cluster structure and a sub-cluster structure are created within a second stage using a latent variable clustering technique. The cluster structure output is generated by combining the created global cluster structure and the created sub-cluster structure.
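The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything concrete in it is an assumption made for illustration: Pearson correlation as the similarity measure, threshold linkage for the first stage, the first principal component of a cluster as its latent variable, a seed-based split as a simplified stand-in for a full latent variable clustering procedure, the cut-offs 0.5 and 0.7, and all function names.

```python
import math


def correlation_matrix(columns):
    """Similarity matrix: Pearson correlation between every pair of variables.
    Assumes no variable is constant."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [x - mean for x in col]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def global_clusters(sim, threshold=0.5):
    """Stage 1 (assumed rule): variables linked by |similarity| >= threshold
    end up in the same global cluster (connected components)."""
    parent = list(range(len(sim)))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(len(sim)):
        for j in range(i + 1, len(sim)):
            if abs(sim[i][j]) >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(sim)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def first_component(sim, members, iterations=200):
    """Latent variable of a cluster: leading eigenvalue and eigenvector of its
    similarity submatrix, found by power iteration."""
    sub = [[sim[i][j] for j in members] for i in members]
    vec = [1.0 + 0.1 * k for k in range(len(members))]
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(r * v for r, v in zip(row, vec)) for row in sub]
        value = math.sqrt(sum(x * x for x in nxt))
        if value == 0.0:
            break
        vec = [x / value for x in nxt]
    return value, vec


def latent_split(sim, members, min_explained=0.7):
    """Split a cluster until its latent variable explains enough variance."""
    value, _ = first_component(sim, members)
    if len(members) < 2 or value / len(members) >= min_explained:
        return [members]
    # Seed two children with the least similar pair, then assign by similarity.
    a, b = min(((i, j) for i in members for j in members if i < j),
               key=lambda p: abs(sim[p[0]][p[1]]))
    left = [m for m in members if m != b and abs(sim[m][a]) >= abs(sim[m][b])]
    right = [m for m in members if m not in left]
    return (latent_split(sim, left, min_explained)
            + latent_split(sim, right, min_explained))


def two_stage_clusters(columns, threshold=0.5, min_explained=0.7):
    sim = correlation_matrix(columns)
    stage1 = global_clusters(sim, threshold)
    # Stage 2a: sub-cluster structure inside each global cluster.
    sub_structure = [latent_split(sim, g, min_explained) for g in stage1]
    # Stage 2b: global cluster structure, built from the correlations between
    # the global clusters' latent variables (components).
    comps = [first_component(sim, g) for g in stage1]

    def comp_corr(s, t):
        (ls, ws), (lt, wt) = comps[s], comps[t]
        cross = sum(ws[i] * sim[u][v] * wt[j]
                    for i, u in enumerate(stage1[s])
                    for j, v in enumerate(stage1[t]))
        return cross / math.sqrt(ls * lt)

    k = len(stage1)
    comp_sim = [[comp_corr(s, t) for t in range(k)] for s in range(k)]
    global_structure = latent_split(comp_sim, list(range(k)), min_explained)
    # Combine: nest each global cluster's sub-clusters under its group.
    return [[sub_structure[g] for g in group] for group in global_structure]
```

Calling `two_stage_clusters(columns)` with one list of observations per input variable returns a nested list: groups of global clusters, each global cluster holding its sub-clusters of variable indices. Working from the similarity matrix alone in the second stage keeps the sketch short; a production implementation would likely reassign variables iteratively rather than split once per level.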
20 Claims
1. A computer-implemented method for creating a cluster structure from a data set containing a plurality of input variables, comprising:
creating global clusters, within a first stage, by computing a similarity matrix from the data set;
creating, within a second stage, both a global cluster structure and a sub-cluster structure;
wherein the global cluster structure and the sub-cluster structure are created using a latent variable clustering technique; and
forming, as output to a computer-readable data store, the cluster structure by combining the created global cluster structure and the created sub-cluster structure.
(Dependent claims: 2-18.)
19. A computer-implemented system for creating a cluster structure from a data set containing a plurality of input variables, comprising:
first instructions configured to execute on a data processor for creating global clusters, within a first stage, by computing a similarity matrix from the data set;
second instructions configured to execute on a data processor for creating, within a second stage, both a global cluster structure and a sub-cluster structure;
wherein the global cluster structure and the sub-cluster structure are created using a latent variable clustering technique; and
a computer-readable data store for storing the cluster structure that has been generated by combining the created global cluster structure and the created sub-cluster structure.
20. A computer-readable storage medium having stored thereon cluster data structures that are created based upon a data set containing a plurality of input variables, the data structure comprising:
a first data structure containing the data set;
a second data structure containing one or more global clusters created by computing a distance matrix from the data set;
a third data structure containing a global cluster structure;
a fourth data structure containing a sub-cluster structure; and
a fifth data structure containing a cluster structure formed by combining the global cluster structure and the sub-cluster structure;
wherein the global cluster structure and the sub-cluster structure are created using a latent variable clustering technique.
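As a rough illustration of how the five data structures recited in claim 20 could be laid out in memory, the following Python sketch groups them in one container. The class name, field names, and the nested-list encoding of clusters are all hypothetical; the claim does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterDataStore:
    """Hypothetical container mirroring the five data structures of claim 20."""
    data_set: List[List[float]]            # first: one list of values per input variable
    global_clusters: List[List[int]]       # second: variable indices per global cluster
    global_structure: List[List[int]]      # third: groups of global-cluster indices
    sub_structure: List[List[List[int]]]   # fourth: sub-clusters inside each global cluster
    cluster_structure: List[List[List[List[int]]]] = field(default_factory=list)  # fifth

    def combine(self):
        """Fill the fifth structure by nesting sub-clusters under the global structure."""
        self.cluster_structure = [
            [self.sub_structure[g] for g in group] for group in self.global_structure
        ]
        return self.cluster_structure
```

In this encoding, `combine()` is the only step that touches both the global cluster structure and the sub-cluster structure, which matches the claim's description of the fifth structure as the combination of the third and fourth.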
Specification