Clustering system, clustering method, clustering program and attribute estimation system using clustering system

  • US 20070219779A1
  • Filed: 06/22/2006
  • Published: 09/20/2007
  • Est. Priority Date: 03/20/2006
  • Status: Active Grant
First Claim

1. A clustering system that clusters a language model group including language models that correspond to a plurality of attribute values, each language model being associated with an attribute value showing a predetermined attribute of humans and having a plurality of entries including vocabularies appearing as speech uttered by or text written by one or more humans having attributes represented with the attribute values and data representing occurrence frequencies of the vocabularies, the clustering system comprising:

  • a union language model preparation unit that generates union data representing a union of vocabularies included in the language model group and prepares a union language model including the union of the vocabularies and occurrence frequencies of the vocabularies using the union data, the union language model being prepared for each language model included in the language model group, so as to prepare a union language model group; and

  • a clustering unit that performs clustering with respect to the union language model group based on a predetermined method, so as to classify the union language model group into a plurality of clusters and generates cluster data representing one or more of the union language models included in each cluster,

    wherein when the union language model preparation unit prepares a union language model for a certain language model, the union language model preparation unit records vocabularies included in the certain language model among the vocabularies included in the union data associated with occurrence frequencies of the vocabularies in the certain language model as entries in the union language model, and records vocabularies not included in the certain language model among the vocabularies included in the union data associated with data showing that an occurrence frequency is 0 as entries in the union language model.
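
The two units recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `build_union_models` and `kmeans`, the sample attribute values, and the sample frequencies are all hypothetical. The claim only requires clustering "based on a predetermined method", so the use of k-means here, seeded with the first `k` models, is an assumption made for illustration.

```python
from typing import Dict, List

LanguageModel = Dict[str, float]  # vocabulary -> occurrence frequency


def build_union_models(models: Dict[str, LanguageModel]) -> Dict[str, LanguageModel]:
    """Prepare one union language model per input model (hypothetical helper)."""
    # Union data: every vocabulary that appears in any model of the group.
    union_vocab = sorted(set().union(*(m.keys() for m in models.values())))
    # Vocabularies present in a model keep their frequency;
    # vocabularies absent from it are recorded with frequency 0.
    return {
        attr: {word: model.get(word, 0) for word in union_vocab}
        for attr, model in models.items()
    }


def kmeans(vectors: Dict[str, List[float]], k: int, iterations: int = 10) -> List[List[str]]:
    """Plain k-means, standing in for the unspecified 'predetermined method'."""
    names = list(vectors)
    # Assumption: seed the centroids with the first k vectors.
    centroids = [vectors[name][:] for name in names[:k]]
    assignment: Dict[str, int] = {}
    for _ in range(iterations):
        # Assign each model to the nearest centroid (squared Euclidean distance).
        for name in names:
            assignment[name] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(vectors[name], centroids[c])),
            )
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [vectors[n] for n in names if assignment[n] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Cluster data: the attribute values whose union models fall in each cluster.
    return [[n for n in names if assignment[n] == c] for c in range(k)]


# Illustrative language model group keyed by a made-up attribute value (age).
models = {
    "age 5": {"mommy": 5, "toy": 3},
    "age 6": {"mommy": 4, "toy": 4, "school": 1},
    "age 30": {"work": 6, "meeting": 2},
    "age 31": {"work": 5, "meeting": 3, "school": 1},
}

union = build_union_models(models)
# Every union model has the same vocabulary, so it maps to a fixed-length vector.
vectors = {attr: list(model.values()) for attr, model in union.items()}
clusters = kmeans(vectors, k=2)
```

Because every union language model shares the same set of entries, each model becomes a vector of equal length, which is what allows a distance-based method to compare models whose original vocabularies differ.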
