Accelerated discrete distribution clustering under wasserstein distance

US 10,013,477 B2
Filed: 09/30/2016
Issued: 07/03/2018
Est. Priority Date: 11/19/2012
Status: Active Grant

First Claim

Patent Images

1. A method of clustering data objects representative of images, videos, biological processes, genetic sequences, or documents using a scalable optimization technique for scaling computational cost of the clustering to a reduced level, comprising the steps of:

a) providing, at a computer processor including parallel processors, a set of data objects, each object being representative of an image, a video, a biological process, genetic sequence, or a document;

b) performing an initial data segmentation on the set of data objects to divide the data into trunks before performing a discrete distribution (D2) clustering operation, wherein one of the parallel processors performs the initial data segmentation step and distributes the data trunks to each of the parallel processors;

c) performing, by all the parallel processors, a discrete distribution (D2) clustering operation, including;

i) optimizing Wasserstein centroids by using a scalable and parallel optimization technique, each data object being handled on one of the parallel processors that contains the data object, local aggregated results based on each data trunk being communicated to all of the parallel processors that uses the local aggregated results to compute the Wasserstein centroids for all the data objects,ii) assigning each data object to the nearest Wasserstein centroid, the nearest Wasserstein centroid being a label of the data object, andiii) iterating i) and ii) until a single segmentation is achieved, the number of Wasserstein centroids is reduced to a predefined number, a predefined threshold based on the distances of the data objects to the Wasserstein centroids is satisfied, the number of changed labels of objects is less than a predefined number, or the number of iterations reaches a predefined number;

wherein the scalable optimization technique is a Bregman alternating direction method of multiplier (B-ADMM) or a ADMM method or a subgradient descent method, such that the computational cost of the clustering is reduced to a level which can be handled by each of the parallel processors,the sub gradient descent method being a method that optimizes for Wasserstein centroids using subgradients in optimal transport problems solved via linear programming,the ADMM method optimizing for Wasserstein centroids using updates of ADMM by decoupling one set of constraint in optimal transport into a distributed formulation, such that a number of submodules are solved from quadratic programming,the B-ADMM approach optimizing for Wasserstein centroids using updates of B-ADMM, which is a variant of ADMM, by decoupling two sets of constraints in an optimal transport into a distributed formulation, such that a number of submodules are solved in a closed formula; and

d) outputting information regarding a way in which the objects are clustered in terms of similarity.

View all claims

1 Assignment

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

Computationally efficient accelerated D2-clustering algorithms are disclosed for clustering discrete distributions under the Wasserstein distance with improved scalability. Three first-order methods include subgradient descent method with re-parametrization, alternating direction method of multipliers (ADMM), and a modified version of Bregman ADMM. The effects of the hyper-parameters on robustness, convergence, and speed of optimization are thoroughly examined. A parallel algorithm for the modified Bregman ADMM method is tested in a multi-core environment with adequate scaling efficiency subject to hundreds of CPUs, demonstrating the effectiveness of AD2-clustering.

Citations

5 Claims

1. A method of clustering data objects representative of images, videos, biological processes, genetic sequences, or documents using a scalable optimization technique for scaling computational cost of the clustering to a reduced level, comprising the steps of:
- a) providing, at a computer processor including parallel processors, a set of data objects, each object being representative of an image, a video, a biological process, genetic sequence, or a document;
  
  b) performing an initial data segmentation on the set of data objects to divide the data into trunks before performing a discrete distribution (D2) clustering operation, wherein one of the parallel processors performs the initial data segmentation step and distributes the data trunks to each of the parallel processors;
  
  c) performing, by all the parallel processors, a discrete distribution (D2) clustering operation, including;
  
  i) optimizing Wasserstein centroids by using a scalable and parallel optimization technique, each data object being handled on one of the parallel processors that contains the data object, local aggregated results based on each data trunk being communicated to all of the parallel processors that uses the local aggregated results to compute the Wasserstein centroids for all the data objects,ii) assigning each data object to the nearest Wasserstein centroid, the nearest Wasserstein centroid being a label of the data object, andiii) iterating i) and ii) until a single segmentation is achieved, the number of Wasserstein centroids is reduced to a predefined number, a predefined threshold based on the distances of the data objects to the Wasserstein centroids is satisfied, the number of changed labels of objects is less than a predefined number, or the number of iterations reaches a predefined number;
  
  wherein the scalable optimization technique is a Bregman alternating direction method of multiplier (B-ADMM) or a ADMM method or a subgradient descent method, such that the computational cost of the clustering is reduced to a level which can be handled by each of the parallel processors,the sub gradient descent method being a method that optimizes for Wasserstein centroids using subgradients in optimal transport problems solved via linear programming,the ADMM method optimizing for Wasserstein centroids using updates of ADMM by decoupling one set of constraint in optimal transport into a distributed formulation, such that a number of submodules are solved from quadratic programming,the B-ADMM approach optimizing for Wasserstein centroids using updates of B-ADMM, which is a variant of ADMM, by decoupling two sets of constraints in an optimal transport into a distributed formulation, such that a number of submodules are solved in a closed formula; and
  
  d) outputting information regarding a way in which the objects are clustered in terms of similarity.
- View Dependent Claims (2, 3, 4, 5)
- - 2. The method of claim 1, wherein the objects to be clustered are mathematically represented as bags of weighted vectors.
  - 3. The method of claim 1, wherein the objects to be clustered are mathematically represented as a histogram.
  - 4. The method of claim 1, wherein the segmentation of the data points is based upon adjacency.
  - 5. The method of claim 1, wherein the discrete distribution (D2) clustering operation is performed by physically separate parallel processors or separate cores of an integrated device.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Penn State Research Foundation (Pennsylvania State University)
Original Assignee
Penn State Research Foundation (Pennsylvania State University)
Inventors
Ye, Jianbo, Li, Jia, Wang, James Z.
Primary Examiner(s)
Thomas, Ashish K
Assistant Examiner(s)
Mahmood, Rezwanul

Application Number

US15/282,947
Publication Number

US 20170083608A1
Time in Patent Office

641 Days
Field of Search
US Class Current
CPC Class Codes

G06F 16/285   Clustering or classification

G06N 20/00   Machine learning

G06N 20/10   using kernel methods, e.g. ...

Accelerated discrete distribution clustering under wasserstein distance

First Claim

1 Assignment

0 Petitions

Accused Products

Abstract

Citations

5 Claims

Specification

Solutions

Use Cases

Quick Links

Accelerated discrete distribution clustering under wasserstein distance

First Claim

1 Assignment

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

Citations

5 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links