Fast algorithms and metrics for comparing hierarchical clustering information trees and numerical vectors

  • US 8,095,543 B1
  • Filed: 07/31/2008
  • Issued: 01/10/2012
  • Est. Priority Date: 07/31/2008
  • Status: Expired due to Fees
First Claim

1. A method for determining a similarity between two data sets, comprising:

  • determining a first list of data clusters for a first hierarchically-organized data set;

  • determining a second list of data clusters for a second hierarchically-organized data set;

  • removing a master cluster from consideration if the first and second data sets have all common elements;

  • determining a similarity between the first and second data sets by calculating a maximum flow between the first list of data clusters and the second list of data clusters;

  • determining a maximum number of redundant elements for the first and second data sets; and

  • dividing the maximum number of redundant elements by the maximum matching flow to arrive at a distance metric.
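
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and nearly every concrete detail is an assumption made for illustration: trees are nested tuples, a "data cluster" is the set of leaves under an internal node, the "master cluster" is the cluster holding every element, and the flow network gives each cluster a capacity equal to its size and each cross-list edge a capacity equal to the overlap of the two clusters. The function names `list_clusters`, `max_flow`, `cluster_flow` and `distance_metric` are hypothetical. The claim text here does not say how the maximum number of redundant elements is obtained, so the sketch takes that count as an input rather than guessing at it.

```python
from collections import deque


def list_clusters(tree):
    """Return (leaves, clusters) for a nested-tuple tree.

    Assumption: every internal node contributes one cluster, namely the
    frozenset of leaves beneath it.
    """
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, clusters = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = list_clusters(child)
        leaves |= child_leaves
        clusters.extend(child_clusters)
    clusters.append(leaves)
    return leaves, clusters


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts residual graph."""
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in capacity[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push flow through it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, capacity[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            capacity[u][v] -= bottleneck
            capacity[v][u] += bottleneck
            v = u
        total += bottleneck


def cluster_flow(tree_a, tree_b):
    """Maximum flow between the two cluster lists (assumed capacities)."""
    leaves_a, clusters_a = list_clusters(tree_a)
    leaves_b, clusters_b = list_clusters(tree_b)
    # Drop the master cluster when both data sets share all elements.
    if leaves_a == leaves_b:
        clusters_a = [c for c in clusters_a if c != leaves_a]
        clusters_b = [c for c in clusters_b if c != leaves_b]

    capacity = {"s": {}, "t": {}}

    def add_edge(u, v, cap):
        capacity.setdefault(u, {})[v] = cap
        capacity.setdefault(v, {}).setdefault(u, 0)  # reverse residual edge

    for i, a in enumerate(clusters_a):
        add_edge("s", ("a", i), len(a))
    for j, b in enumerate(clusters_b):
        add_edge(("b", j), "t", len(b))
    for i, a in enumerate(clusters_a):
        for j, b in enumerate(clusters_b):
            overlap = len(a & b)
            if overlap:
                add_edge(("a", i), ("b", j), overlap)
    return max_flow(capacity, "s", "t")


def distance_metric(max_redundant, flow):
    """Final step of the claim: redundant-element count divided by the flow."""
    if flow == 0:
        raise ValueError("flow is zero; the ratio is undefined")
    return max_redundant / flow
```

Under these assumptions, `cluster_flow(((1, 2), (3, 4)), ((1, 2), (3, 4)))` removes the shared master cluster `{1, 2, 3, 4}` and routes flow only between the two remaining pairs of clusters, while two trees with no elements in common yield a flow of zero. Other capacity choices are equally consistent with the claim wording and would give different values.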
