Neural network based cluster visualization that computes pairwise distances between centroid locations, and determines a projected centroid location in a multidimensional space
First Claim
1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to:
receive data that includes a plurality of observations with a plurality of data points defined for each observation, wherein each data point of the plurality of data points is associated with a variable to define a plurality of variables;
compute a first plurality of centroid locations for a first set of clusters of a predetermined number of clusters by executing a clustering algorithm with a first portion of the received data and a first input parameter;
compute a second plurality of centroid locations for a second set of clusters of the predetermined number of clusters by executing the clustering algorithm with a second portion of the received data and a second input parameter, wherein the first portion is different from the second portion or the first input parameter is different from the second input parameter, wherein each centroid location of the first plurality of centroid locations and of the second plurality of centroid locations includes a plurality of coordinate values, wherein each coordinate value relates to a single variable of the plurality of variables;
compute distances pairwise between each centroid location of the computed first plurality of centroid locations and each centroid location of the computed second plurality of centroid locations;
select an optimum pairing between the computed first plurality of centroid locations and the computed second plurality of centroid locations based on a minimum distance of the computed pairwise distances;
associate each pair of the selected optimum pairing with a different cluster of a set of composite clusters;
create noised centroid location data by adding a noise value to the computed first plurality of centroid locations and the computed second plurality of centroid locations;
train a multi-layer neural network with the created noised centroid location data;
determine a projected centroid location in a multidimensional space for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations as values of hidden units of a middle layer of the trained multi-layer neural network; and
present a graph for display that indicates the determined, projected centroid location for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations with a different label indicating each pair of the selected optimum pairing, wherein a number of the hidden units of the middle layer defines a number of dimensions of the graph.
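The clustering, pairwise-distance, and pairing steps of the claim can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: plain Lloyd's k-means stands in for "a clustering algorithm", a random seed stands in for the "input parameter", Euclidean distance is used, and the "optimum pairing" is read as the one-to-one matching with the smallest summed distance, found by brute force over permutations. The names `kmeans` and `optimum_pairing` are hypothetical.

```python
import itertools
import math
import random


def kmeans(points, k, seed, iterations=20):
    """Plain Lloyd's k-means; `seed` plays the role of the input parameter."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def optimum_pairing(first, second):
    """Return (i, j) pairs minimizing the summed pairwise distance, plus the distance matrix."""
    dist = [[math.dist(a, b) for b in second] for a in first]
    best = min(
        itertools.permutations(range(len(second))),
        key=lambda perm: sum(dist[i][j] for i, j in enumerate(perm)),
    )
    return list(enumerate(best)), dist


# Synthetic observations with two variables each (illustrative data only).
data_rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [(cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
        for cx, cy in centers for _ in range(30)]

first = kmeans(data, k=3, seed=1)    # first input parameter
second = kmeans(data, k=3, seed=2)   # second, different input parameter
pairs, dist = optimum_pairing(first, second)

# Each pair becomes one composite cluster, keyed by a label.
composite = {label: pair for label, pair in enumerate(pairs)}
```

Brute force over permutations is only practical for a small number of clusters; for larger counts an assignment solver such as the Hungarian algorithm would be the usual substitute.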
Abstract
A computing device presents a cluster visualization based on a neural network computation. First centroid locations are computed for first clusters. Second centroid locations are computed for second clusters. Each centroid location includes a plurality of coordinate values where each coordinate value relates to a single variable of a plurality of variables. Distances are computed pairwise between each centroid location. An optimum pairing is selected based on a minimum distance of the computed pairwise distances where each pair is associated with a different cluster of a set of composite clusters. Noised centroid location data is created. A multi-layer neural network is trained with the noised centroid location data. A projected centroid location is determined in a multidimensional space for each centroid location as values of hidden units of a middle layer of the multi-layer neural network. A graph is presented for display that indicates the determined, projected centroid locations.
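The noising, training, and projection steps summarized in the abstract can be sketched as a small autoencoder. This is a hedged illustration, not the patented implementation: the network shape (input, one tanh hidden layer of two units, linear output), the Gaussian noise, the learning rate, and the choice to reconstruct each noised sample are all assumptions, and the centroid values are made up. `make_noised` and `TinyAutoencoder` are hypothetical names.

```python
import math
import random


def make_noised(centroids, copies, sigma, rng):
    """Replicate each centroid `copies` times, adding Gaussian noise per coordinate."""
    return [[x + rng.gauss(0.0, sigma) for x in c]
            for c in centroids for _ in range(copies)]


class TinyAutoencoder:
    """d -> h (tanh) -> d network; the h hidden units form the middle layer."""

    def __init__(self, d, h, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(h)]
        self.b1 = [0.0] * h
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(h)] for _ in range(d)]
        self.b2 = [0.0] * d

    def encode(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, z):
        return [sum(w * zi for w, zi in zip(row, z)) + b
                for row, b in zip(self.w2, self.b2)]

    def train_step(self, x, lr):
        """One stochastic gradient step on 0.5 * squared reconstruction error."""
        z = self.encode(x)
        y = self.decode(z)
        err = [yi - xi for yi, xi in zip(y, x)]
        # Backpropagate to the hidden units before w2 is modified.
        dz = [sum(err[k] * self.w2[k][j] for k in range(len(err))) * (1 - z[j] ** 2)
              for j in range(len(z))]
        for k, e in enumerate(err):
            for j, zj in enumerate(z):
                self.w2[k][j] -= lr * e * zj
            self.b2[k] -= lr * e
        for j, g in enumerate(dz):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * g * xi
            self.b1[j] -= lr * g
        return sum(e * e for e in err)


rng = random.Random(0)
# Four illustrative centroids over four variables (two from each clustering run).
centroids = [[0.10, 0.20, 0.90, 0.40], [0.80, 0.10, 0.20, 0.70],
             [0.15, 0.25, 0.85, 0.35], [0.75, 0.15, 0.25, 0.65]]
noised = make_noised(centroids, copies=20, sigma=0.02, rng=rng)

net = TinyAutoencoder(d=4, h=2, rng=rng)
for epoch in range(200):
    rng.shuffle(noised)
    for x in noised:
        net.train_step(x, lr=0.05)

# Projected location of each centroid = values of the middle-layer hidden units.
projected = [net.encode(c) for c in centroids]
```

With two hidden units each centroid maps to a point in a plane, so the resulting graph would be two-dimensional; three hidden units would give a three-dimensional graph.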
30 Claims
1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to:
receive data that includes a plurality of observations with a plurality of data points defined for each observation, wherein each data point of the plurality of data points is associated with a variable to define a plurality of variables;
compute a first plurality of centroid locations for a first set of clusters of a predetermined number of clusters by executing a clustering algorithm with a first portion of the received data and a first input parameter;
compute a second plurality of centroid locations for a second set of clusters of the predetermined number of clusters by executing the clustering algorithm with a second portion of the received data and a second input parameter, wherein the first portion is different from the second portion or the first input parameter is different from the second input parameter, wherein each centroid location of the first plurality of centroid locations and of the second plurality of centroid locations includes a plurality of coordinate values, wherein each coordinate value relates to a single variable of the plurality of variables;
compute distances pairwise between each centroid location of the computed first plurality of centroid locations and each centroid location of the computed second plurality of centroid locations;
select an optimum pairing between the computed first plurality of centroid locations and the computed second plurality of centroid locations based on a minimum distance of the computed pairwise distances;
associate each pair of the selected optimum pairing with a different cluster of a set of composite clusters;
create noised centroid location data by adding a noise value to the computed first plurality of centroid locations and the computed second plurality of centroid locations;
train a multi-layer neural network with the created noised centroid location data;
determine a projected centroid location in a multidimensional space for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations as values of hidden units of a middle layer of the trained multi-layer neural network; and
present a graph for display that indicates the determined, projected centroid location for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations with a different label indicating each pair of the selected optimum pairing, wherein a number of the hidden units of the middle layer defines a number of dimensions of the graph.
Dependent claims: 2-19
20. A computing device comprising:
a processor; and
a non-transitory computer-readable medium operably coupled to the processor, the computer-readable medium having computer-readable instructions stored thereon that, when executed by the processor, cause the computing device to
receive data that includes a plurality of observations with a plurality of data points defined for each observation, wherein each data point of the plurality of data points is associated with a variable to define a plurality of variables;
compute a first plurality of centroid locations for a first set of clusters of a predetermined number of clusters by executing a clustering algorithm with a first portion of the received data and a first input parameter;
compute a second plurality of centroid locations for a second set of clusters of the predetermined number of clusters by executing the clustering algorithm with a second portion of the received data and a second input parameter, wherein the first portion is different from the second portion or the first input parameter is different from the second input parameter, wherein each centroid location of the first plurality of centroid locations and of the second plurality of centroid locations includes a plurality of coordinate values, wherein each coordinate value relates to a single variable of the plurality of variables;
compute distances pairwise between each centroid location of the computed first plurality of centroid locations and each centroid location of the computed second plurality of centroid locations;
select an optimum pairing between the computed first plurality of centroid locations and the computed second plurality of centroid locations based on a minimum distance of the computed pairwise distances;
associate each pair of the selected optimum pairing with a different cluster of a set of composite clusters;
create noised centroid location data by adding a noise value to the computed first plurality of centroid locations and the computed second plurality of centroid locations;
train a multi-layer neural network with the created noised centroid location data;
determine a projected centroid location in a multidimensional space for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations as values of hidden units of a middle layer of the trained multi-layer neural network; and
present a graph for display that indicates the determined, projected centroid location for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations with a different label indicating each pair of the selected optimum pairing, wherein a number of the hidden units of the middle layer defines a number of dimensions of the graph.
Dependent claims: 21-24
25. A method of presenting a cluster visualization based on a neural network computation, the method comprising:
receiving data that includes a plurality of observations with a plurality of data points defined for each observation, wherein each data point of the plurality of data points is associated with a variable to define a plurality of variables;
computing, by a computing device, a first plurality of centroid locations for a first set of clusters of a predetermined number of clusters by executing a clustering algorithm with a first portion of the received data and a first input parameter;
computing, by the computing device, a second plurality of centroid locations for a second set of clusters of the predetermined number of clusters by executing the clustering algorithm with a second portion of the received data and a second input parameter, wherein the first portion is different from the second portion or the first input parameter is different from the second input parameter, wherein each centroid location of the first plurality of centroid locations and of the second plurality of centroid locations includes a plurality of coordinate values, wherein each coordinate value relates to a single variable of the plurality of variables;
computing, by the computing device, distances pairwise between each centroid location of the computed first plurality of centroid locations and each centroid location of the computed second plurality of centroid locations;
selecting, by the computing device, an optimum pairing between the computed first plurality of centroid locations and the computed second plurality of centroid locations based on a minimum distance of the computed pairwise distances;
associating, by the computing device, each pair of the selected optimum pairing with a different cluster of a set of composite clusters;
creating, by the computing device, noised centroid location data by adding a noise value to the computed first plurality of centroid locations and the computed second plurality of centroid locations;
training, by the computing device, a multi-layer neural network with the created noised centroid location data;
determining, by the computing device, a projected centroid location in a multidimensional space for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations as values of hidden units of a middle layer of the trained multi-layer neural network; and
presenting, by the computing device, a graph for display that indicates the determined, projected centroid location for each of the computed first plurality of centroid locations and the computed second plurality of centroid locations with a different label indicating each pair of the selected optimum pairing, wherein a number of the hidden units of the middle layer defines a number of dimensions of the graph.
Dependent claims: 26-30
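The final presenting step ties each plotted point to the pairing computed earlier. The sketch below shows one way the plot records could be assembled so that both members of a pair share a label and the number of coordinates equals the number of middle-layer hidden units. The function name `graph_records`, the record fields, and the sample coordinates are hypothetical and chosen only for illustration; rendering is left to whatever plotting tool is available.

```python
def graph_records(projected_first, projected_second, pairs):
    """Build one record per projected centroid; paired centroids share a label."""
    records = []
    for label, (i, j) in enumerate(pairs):
        name = f"pair {label}"
        records.append({"label": name, "set": "first",
                        "coords": tuple(projected_first[i])})
        records.append({"label": name, "set": "second",
                        "coords": tuple(projected_second[j])})
    return records


# Illustrative values: two centroids per run, two hidden units in the middle layer.
pairs = [(0, 1), (1, 0)]                      # (index in first set, index in second set)
projected_first = [[0.20, -0.40], [0.70, 0.10]]
projected_second = [[0.68, 0.12], [0.22, -0.38]]

records = graph_records(projected_first, projected_second, pairs)
dims = len(records[0]["coords"])              # equals the number of hidden units
```

Each record can then be drawn as a point at `coords`, with color or marker chosen by `label`, so centroids from the two runs that were paired together appear under the same label.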
Specification