
Systems and methods for clustering data samples

  • US 9,152,703 B1
  • Filed: 02/28/2013
  • Issued: 10/06/2015
  • Est. Priority Date: 02/28/2013
  • Status: Active Grant
First Claim

1. A computer-implemented method for clustering data samples, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:

  • identifying a plurality of samples to cluster;

    identifying a plurality of candidate features for clustering the plurality of samples;

    identifying a plurality of candidate distance functions for clustering the plurality of samples;

    selecting a distance function from the plurality of candidate distance functions for clustering the plurality of samples at least in part by:

    selecting a set of features from the plurality of candidate features for clustering the plurality of samples based at least in part on determining that a result of clustering a training set of samples using the set of features and the distance function fits an expected clustering of the training set of samples more closely than an additional result of clustering the training set of samples using an alternative set of features from the plurality of candidate features and the distance function, according to a predefined clustering accuracy metric;

    determining that the result of clustering the training set of samples using the set of features and the distance function fits the expected clustering of the training set of samples more closely than a best result of clustering the training set of samples for each candidate distance function, aside from the distance function, within the plurality of candidate distance functions, according to the predefined clustering accuracy metric;

    clustering the plurality of samples using the set of features and the distance function.
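
The selection steps recited in the claim can be illustrated with a small sketch. This is not the patented implementation: the single-linkage clustering routine, the Rand index used as the "predefined clustering accuracy metric", the exhaustive search over feature subsets, and all names (`cluster`, `rand_index`, `select_model`, the `size` and `noise` features) are assumptions chosen only for illustration. For each candidate distance function, the sketch finds the feature set whose clustering of a training set best matches the expected clustering, then keeps the distance function whose best result scores highest.

```python
from itertools import combinations

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cluster(samples, features, dist, k):
    """Single-linkage agglomerative clustering down to k clusters (illustrative choice)."""
    points = [tuple(s[f] for f in features) for s in samples]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest minimum pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def rand_index(labels, expected):
    """Fraction of sample pairs on which two clusterings agree (assumed accuracy metric)."""
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum((labels[i] == labels[j]) == (expected[i] == expected[j])
                for i, j in pairs)
    return agree / len(pairs)

def best_features_for(training, expected, candidate_features, dist, k):
    """Pick the feature set whose clustering best fits the expected clustering."""
    best = None
    for size in range(1, len(candidate_features) + 1):
        for feats in combinations(candidate_features, size):
            score = rand_index(cluster(training, feats, dist, k), expected)
            if best is None or score > best[0]:
                best = (score, feats)
    return best

def select_model(training, expected, candidate_features, candidate_distances, k):
    """Pick the distance function whose best feature set scores highest.

    Ties are resolved in favor of the earlier candidate (an assumption of this sketch).
    """
    best = None
    for name, dist in candidate_distances.items():
        score, feats = best_features_for(training, expected, candidate_features, dist, k)
        if best is None or score > best[0]:
            best = (score, feats, name)
    return best

# Hypothetical training set: "size" separates the expected clusters, "noise" does not.
training = [
    {"size": 1.0, "noise": 9.0},
    {"size": 1.2, "noise": 0.5},
    {"size": 8.0, "noise": 8.5},
    {"size": 8.3, "noise": 0.1},
]
expected = [0, 0, 1, 1]
distances = {"euclidean": euclidean, "manhattan": manhattan}

score, features, distance_name = select_model(
    training, expected, ["size", "noise"], distances, k=2)

# Final step: cluster the samples with the selected features and distance function.
labels = cluster(training, features, distances[distance_name], k=2)
```

In this toy data the search settles on the single feature `size`, because including `noise` pulls apart samples that the expected clustering groups together. A real system would likely replace the exhaustive subset search with something cheaper, since the number of feature subsets grows exponentially with the number of candidate features.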

  • 7 Assignments