Grouping of data points in data analysis for graph generation

  • US 10,599,669 B2
  • Filed: 03/14/2016
  • Issued: 03/24/2020
  • Est. Priority Date: 01/14/2014
  • Status: Active Grant
First Claim

1. A non-transitory computer readable medium including executable instructions, the instructions being executable by a processor to perform a method, the method comprising:

    receiving input data;

    performing a similarity function on the input data to map the input data into a reference space to create reference data in the reference space, wherein the similarity function includes a distance function;

    identifying groupings of the reference data in the reference space using a resolution function;

    identifying nodes using a metric of the input data associated with groupings of the reference data, each node including at least some of the input data;

    building a first partition of subsets of the input data by hierarchical clustering creating a set of data trees, each subset of the first partition containing one or more nodes that are exclusive of other subsets of the first partition;

    computing a first subset score for each subset of the first partition using a scoring function;

    identifying a next partition from the hierarchical clustering including all of the nodes of the first partition, the next partition including at least one subset that includes all of the nodes of two or more subsets of the first partition, each particular subset of the next partition being related to one or more subsets of a previously generated partition if that particular subset shares membership of at least one node with the one or more subsets of the previously generated partition;

    computing a second subset score for each subset of the next partition using the scoring function;

    defining a max score for each particular subset of the next partition using a max score function, each max score being based on maximal subset scores of that particular subset of the next partition and at least the subsets of the first partition related to that particular subset;

    selecting output subsets from all subsets of the next partition and the previously generated partitions including the first partition, the output subsets together including all elements of the first partition, selection of each of the output subsets being made, at least in part, using a maximum score of previously computed subset scores, the maximum score being a largest score of all subset scores of the next partition and previously generated partitions including the first partition; and

    generating a visualization report including graphical objects indicating an output partition containing the output subsets, the output subsets of the output partition being associated with the nodes, each subset of the output partition containing nodes being exclusive of other subsets of the output partition.
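
The partition-scoring and selection steps of the claim can be illustrated with a short Python sketch. This is not the patented implementation, only one reading of the claim language under stated assumptions: the hierarchical clustering is given as a nested-tuple tree over node ids; the max score of a subset is taken as the larger of its own score and the sum of its children's max scores (the claim does not fix how related subset scores are combined, so the sum is an assumption); and the names `leaves`, `select_subsets`, `score`, and the sample `values` are hypothetical.

```python
from typing import Callable, FrozenSet, List, Tuple, Union

# A tree is either a leaf (a node id) or a merge of two subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]
Subset = FrozenSet[int]


def leaves(tree: Tree) -> Subset:
    """Return the set of node ids contained in a subtree."""
    if isinstance(tree, int):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def select_subsets(
    tree: Tree, score: Callable[[Subset], float]
) -> Tuple[float, List[Subset]]:
    """Return (max score, chosen subsets) for one data tree.

    Assumption: a merged subset is kept only if its own score is at
    least the summed max scores of the two subsets it merges.
    """
    members = leaves(tree)
    own = score(members)
    if isinstance(tree, int):
        # Leaves form the first partition: one node per subset.
        return own, [members]
    left_best, left_sets = select_subsets(tree[0], score)
    right_best, right_sets = select_subsets(tree[1], score)
    children_best = left_best + right_best
    if own >= children_best:
        return own, [members]
    return children_best, left_sets + right_sets


# Hypothetical per-node metric values, used only by the example score.
values = {0: 1.0, 1: 1.1, 2: 5.0, 3: 5.2}


def score(subset: Subset) -> float:
    """Example scoring function: reward size, penalize spread."""
    vals = [values[n] for n in subset]
    return len(subset) ** 2 - 4 * (max(vals) - min(vals))


tree: Tree = ((0, 1), (2, 3))
best, chosen = select_subsets(tree, score)
print([sorted(s) for s in chosen])  # [[0, 1], [2, 3]]
```

In this example the two tight pairs outscore both their singleton members and the full merge, so the output partition is the two pairs. The chosen subsets are mutually exclusive and together cover every node, matching the exclusivity and coverage conditions the claim places on the output partition. For a set of data trees, the same function would be applied to each tree and the chosen subsets concatenated.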

  • 5 Assignments