INFORMATION PROCESSING APPARATUS, CLUSTERING METHOD, AND RECORDING MEDIUM STORING CLUSTERING PROGRAM

US 20140149412A1
Filed: 10/28/2013
Published: 05/29/2014
Est. Priority Date: 11/26/2012
Status: Active Grant

Abstract

An information processing apparatus, a clustering method, and a clustering program stored on a recording medium, each of which determines an initial value of a model parameter of an input data set based on a model parameter of a reference data set that is similar to the input data set and was previously clustered, modifies the initial value so as to match the input data set, and obtains a clustering result of the input data set using the updated initial value of the model parameter.
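
The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a diagonal-covariance Gaussian mixture fitted by EM, a Euclidean nearest-neighbor search over a hypothetical per-data-set feature vector, and a simple moment-matching rule for the "modify the initial value" step. The names `adapt_initial_value`, `em_gmm`, and `cluster_with_reference`, and the `"feature"`/`"params"` keys, are invented for illustration.

```python
import numpy as np

def adapt_initial_value(x, weights, means, variances):
    """Assumed 'modify' step: shift and rescale the reference mixture so that
    its overall mean and variance match those of the input data."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    var = np.asarray(variances, dtype=float)
    mix_mean = w @ mu                              # overall mean of the mixture
    mix_var = w @ (var + mu**2) - mix_mean**2      # overall variance of the mixture
    scale = np.sqrt(x.var(axis=0) / mix_var)
    return w, (mu - mix_mean) * scale + x.mean(axis=0), var * scale**2

def em_gmm(x, weights, means, variances, n_iter=50):
    """EM for a diagonal-covariance Gaussian mixture, started from the given values."""
    x = np.asarray(x, dtype=float)
    w, mu, var = (np.array(a, dtype=float) for a in (weights, means, variances))
    for _ in range(n_iter):
        # E-step: responsibilities from log(weight * density), stabilized per row.
        diff = x[:, None, :] - mu[None, :, :]
        log_p = np.log(w)[None, :] - 0.5 * np.sum(
            diff**2 / var[None] + np.log(2 * np.pi * var[None]), axis=2)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (resp.T @ x) / nk[:, None]
        diff = x[:, None, :] - mu[None, :, :]
        var = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + 1e-6
    # Hard labels come from the responsibilities of the last E-step.
    return w, mu, var, resp.argmax(axis=1)

def cluster_with_reference(x, input_feature, references):
    """Pick the most similar reference, adapt its parameters, then run EM."""
    f = np.asarray(input_feature, dtype=float)
    dists = [np.linalg.norm(np.asarray(r["feature"], dtype=float) - f)
             for r in references]
    best = references[int(np.argmin(dists))]
    init = adapt_initial_value(x, *best["params"])
    return em_gmm(x, *init)
```

Here `references` is assumed to be a list of dictionaries, each holding a feature vector for a previously clustered data set and the `(weights, means, variances)` tuple that resulted from clustering it.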

17 Claims

1. An information processing apparatus that clusters an input data set, comprising:
- a memory to store, for each of a plurality of reference data sets that is previously clustered, a reference parameter used for clustering the reference data set, the reference parameter being a model parameter of mixture distribution; and
  
  a processor to search the memory to obtain the reference parameter of at least one reference data set that is similar to the input data set;
  
  determine an initial value of a model parameter of mixture distribution of the input data set, based on the reference parameter of the at least one reference data set;
  
  modify the initial value of the model parameter of mixture distribution of the input data set, so as to match a probability density distribution of the input data set to generate an updated initial value; and
  
  cluster the probability density distribution of the input data set on a feature space of the input data set, using the updated initial value of the model parameter of mixture distribution.
  - 2. The information processing apparatus of claim 1, wherein the processor obtains a plurality of reference parameters based on the degree of similarity between the input data set and the reference data set as the at least one reference data set that is similar to the input data set, and combines the plurality of reference parameters based on the degree of similarity to determine the initial value of the model parameter of mixture distribution of the input data set.
  - 3. The information processing apparatus of claim 2, wherein the processor calculates the degree of similarity based on a feature of the input data set and a feature of the reference data set, and searches the at least one reference data set based on the degree of similarity.
  - 4. The information processing apparatus of claim 3, wherein the feature used for calculating the degree of similarity between the input data set and the reference data set in searching the at least one reference data set, differs from a feature used for clustering the input data set.
  - 5. The information processing apparatus of claim 4, wherein the feature used for calculating the degree of similarity between the input data set and the reference data set includes at least one of a feature indicating the shape of a data point in the data set, and a feature indicating the texture of the data point in the data set.
  - 6. The information processing apparatus of claim 4, wherein the feature used for clustering the input data set includes a feature indicating the color of a data point in the data set.
  - 7. The information processing apparatus of claim 1, wherein the model of mixture distribution is a Gaussian mixture distribution model.
  - 8. The information processing apparatus of claim 1, wherein the input data set is an image, and each data point in the input data set is a pixel or a set of a plurality of pixels in the image.
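
Claim 2 recites combining several reference parameters based on the degree of similarity, but the claim text does not fix a formula. One possible reading, sketched below as an assumption, is to pool the components of the most similar reference mixtures into a single mixture and scale each reference's component weights by its normalized similarity. The function name `combine_reference_parameters` and the `top_n` parameter are hypothetical.

```python
import numpy as np

def combine_reference_parameters(references, similarities, top_n=2):
    """Pool the components of the top_n most similar reference mixtures.

    references:   list of (weights, means, variances) tuples, one per reference
                  data set, with means and variances shaped (k, d).
    similarities: one nonnegative score per reference (higher = more similar).
    Returns (weights, means, variances) of the pooled mixture; weights sum to 1.
    """
    sims = np.asarray(similarities, dtype=float)
    order = np.argsort(sims)[::-1][:top_n]      # indices of the best references
    share = sims[order] / sims[order].sum()     # normalized similarity shares
    weights = np.concatenate([
        s * np.asarray(references[j][0], dtype=float)
        for s, j in zip(share, order)
    ])
    means = np.vstack([np.asarray(references[j][1], dtype=float) for j in order])
    variances = np.vstack([np.asarray(references[j][2], dtype=float) for j in order])
    return weights, means, variances
```

Because the result is itself a valid set of mixture parameters, it could serve directly as the initial value that is subsequently modified to match the input data set. Averaging parameters component by component would be another option, but it requires a correspondence between components that this sketch does not assume.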

9. A method of clustering an input data set, comprising:
- storing, in a memory, for each of a plurality of reference data sets that is previously clustered, a reference parameter used for clustering the reference data set, the reference parameter being a model parameter of mixture distribution;
  
  searching the memory to obtain the reference parameter of at least one reference data set that is similar to the input data set;
  
  determining an initial value of a model parameter of mixture distribution of the input data set, based on the reference parameter of the at least one reference data set;
  
  modifying the initial value of the model parameter of mixture distribution of the input data set, so as to match a probability density distribution of the input data set to generate an updated initial value; and
  
  clustering the probability density distribution of the input data set on a feature space of the input data set, using the updated initial value of the model parameter of mixture distribution.
  - 10. The method of claim 9, wherein, when a plurality of reference parameters is obtained based on the degree of similarity between the input data set and the reference data set as the at least one reference data set that is similar to the input data set, the method further comprises:
    - combining the plurality of reference parameters based on the degree of similarity to determine the initial value of the model parameter of mixture distribution of the input data set.
  - 11. The method of claim 10, further comprising:
    - calculating the degree of similarity based on a feature of the input data set and a feature of the reference data set, wherein the searching searches the at least one reference data set based on the degree of similarity.
  - 12. The method of claim 11, wherein the feature used for calculating the degree of similarity between the input data set and the reference data set in searching the at least one reference data set, differs from a feature used for clustering the input data set.
  - 13. The method of claim 12, wherein the feature used for calculating the degree of similarity between the input data set and the reference data set includes at least one of a feature indicating the shape of a data point in the data set, and a feature indicating the texture of the data point in the data set.
  - 14. The method of claim 12, wherein the feature used for clustering the input data set includes a feature indicating the color of a data point in the data set.
  - 15. The method of claim 9, wherein the model of mixture distribution is a Gaussian mixture distribution model.
  - 16. The method of claim 9, wherein the input data set is an image, and each data point in the input data set is a pixel or a set of a plurality of pixels in the image.
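
Claims 12 to 14 and 16 distinguish the feature used for the similarity search (shape or texture) from the feature used for clustering (color), with pixels as data points. The sketch below illustrates that separation under stated assumptions: the texture feature is a gradient-orientation histogram and the similarity is histogram intersection, neither of which is specified in the claims. All function names are hypothetical.

```python
import numpy as np

def color_points(image):
    """Clustering feature: flatten an (h, w, 3) image into (h*w, 3) pixel colors."""
    return np.asarray(image, dtype=float).reshape(-1, 3)

def texture_descriptor(image, bins=8):
    """Search feature: magnitude-weighted histogram of gradient orientations."""
    gray = np.asarray(image, dtype=float).mean(axis=2)
    gy, gx = np.gradient(gray)                  # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation, folded modulo pi
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    # A flat image has no gradients; fall back to a uniform histogram.
    return hist / total if total > 0 else np.full(bins, 1.0 / bins)

def similarity(a, b):
    """Histogram intersection of two normalized histograms (1.0 = identical)."""
    return float(np.minimum(a, b).sum())
```

With this split, two images that share a stripe pattern but differ in color score as similar in the search step, while the clustering step still operates on their own pixel colors.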

17. A non-transitory recording medium storing a plurality of instructions which, when executed by a processor, cause the processor to perform a method of clustering an input data set, the method comprising:
- storing, in a memory, for each of a plurality of reference data sets that is previously clustered, a reference parameter used for clustering the reference data set, the reference parameter being a model parameter of mixture distribution;
  
  searching the memory to obtain the reference parameter of at least one reference data set that is similar to the input data set;
  
  determining an initial value of a model parameter of mixture distribution of the input data set, based on the reference parameter of the at least one reference data set;
  
  modifying the initial value of the model parameter of mixture distribution of the input data set, so as to match a probability density distribution of the input data set to generate an updated initial value; and
  
  clustering the probability density distribution of the input data set on a feature space of the input data set, using the updated initial value of the model parameter of mixture distribution.

Current Assignee: Ricoh Company Limited
Original Assignee: Ricoh Company Limited
Inventors: NAKAMURA, Satoshi

Granted Patent: US 9,519,660 B2
US Class Current: 707/737
CPC Class Codes

G06F 16/355   Class or cluster creation o...

G06F 16/583   using metadata automaticall...

G06F 18/23213   with fixed number of cluste...

G06F 18/24137   Distances to cluster centroïds

G06V 10/763   Non-hierarchical techniques...

G06V 10/764   using classification, e.g. ...

G06V 20/35   Categorising the entire sce...
