STANDARD-MODEL GENERATION FOR SPEECH RECOGNITION USING A REFERENCE MODEL

US 20090271201A1
Filed: 07/08/2009
Published: 10/29/2009
Est. Priority Date: 11/21/2002
Status: Abandoned Application

Abstract

A standard model creating apparatus which provides a high-precision standard model used for pattern recognition such as speech recognition, character recognition, or image recognition using a probability model based on a hidden Markov model, Bayesian theory, or linear discrimination analysis; intention interpretation using a probability model such as a Bayesian net; data-mining performed using a probability model; and so forth. The standard model creating apparatus includes a reference model preparing unit that prepares at least one reference model; a reference model storing unit that stores the reference model prepared by the reference model preparing unit; and a standard model creating unit that creates a standard model by calculating statistics of the standard model so as to maximize or locally maximize the probability or likelihood with respect to the reference model stored in the reference model storing unit.
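
To make the idea of calculating standard-model statistics from reference-model statistics concrete, consider the simplest case: a standard model that is a single one-dimensional Gaussian. Under that assumption, and with every reference model weighted equally, the likelihood-maximizing mean and variance are the pooled first and second moments of the reference mixtures. The sketch below illustrates only this special case; the function name `moment_match` and the `(weight, mean, variance)` tuple layout are hypothetical and do not come from the application.

```python
def moment_match(reference_models):
    """Fit one Gaussian to several 1-D Gaussian mixtures by matching moments.

    reference_models: list of mixtures, each a list of
    (weight, mean, variance) tuples whose weights sum to 1.
    Layout and name are illustrative assumptions.
    """
    n = len(reference_models)
    # Pooled mean: average of each mixture's mean.
    mean = sum(w * m for mix in reference_models for (w, m, _) in mix) / n
    # Pooled variance: within-component variance plus spread of component means.
    var = sum(w * (v + (m - mean) ** 2)
              for mix in reference_models for (w, m, v) in mix) / n
    return mean, var


# Two hypothetical reference models: a single Gaussian and a two-component mixture.
references = [
    [(1.0, -1.0, 1.0)],
    [(0.5, 1.0, 1.0), (0.5, 3.0, 1.0)],
]
standard_mean, standard_var = moment_match(references)
print(standard_mean, standard_var)
```

No speech data is needed for this calculation; only the stored statistics of the reference models are used.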

9 Claims

1. A standard model creating apparatus for creating a standard model which shows an acoustic characteristic having a specific attribute and is used for a speech recognition device included in an electronic apparatus used by a user, the standard model creating apparatus using a probability model that expresses a frequency parameter showing an acoustic characteristic as an output probability, the standard model creating apparatus comprising:
- a reference model storing unit configured to store a plurality of reference models which are probability models showing an acoustic characteristic having a specific attribute; and
  
  a standard model creating unit configured to create the standard model by calculating statistics of the standard model using statistics of the plurality of reference models stored in said reference model storing unit, wherein said standard model creating unit includes:
  
  a standard model structure determining unit configured to determine a structure of the standard model which is to be created, based on specification information regarding specifications of the electronic apparatus;
  
  an initial standard model creating unit configured to determine initial values of the statistics specifying the standard model whose structure has been determined; and
  
  a statistics estimating unit configured to estimate and calculate the statistics of the standard model so as to maximize or locally maximize a probability or a likelihood of the standard model, whose initial values have been determined, with respect to the plurality of reference models, wherein the plurality of reference models and the standard model are expressed using at least one Gaussian distribution, and said standard model structure determining unit is configured to determine a number of statistics of the standard model including at least a number of Gaussian mixture distributions as the structure of the standard model, based on the specification information indicating which of precision and speed is prioritized in speech recognition by the speech recognition device.
  - 2. The standard model creating apparatus according to claim 1, wherein the specification information indicates at least one of a type of an application program running on the electronic apparatus, and specifications of the electronic apparatus.
  - 3. The standard model creating apparatus according to claim 1, further comprising:
    - a specification information holding unit configured to store an application/specifications correspondence database showing a correspondence between an application program which uses the standard model and specifications of the standard model, wherein said standard model structure determining unit is configured to read specifications corresponding to an application program to be activated from the application/specifications correspondence database held by said specification information holding unit, and to determine the structure of the standard model based on the read specifications.
  - 4. The standard model creating apparatus according to claim 1, further comprising:
    - a specification information creating unit configured to create the specification information, wherein said standard model structure determining unit is configured to determine the structure of the standard model based on the created specification information.
  - 5. The standard model creating apparatus according to claim 1, wherein the standard model creating apparatus is connected to a terminal apparatus via a communication channel, and further comprises:
    - a specification information receiving unit configured to receive the specification information from the terminal apparatus, wherein said standard model structure determining unit is configured to determine the structure of the standard model based on the received specification information.
  - 8. The standard model creating apparatus according to claim 4, wherein said specification information creating unit is configured to create the specification information with an N1 number of the Gaussian distributions when an instruction that the electronic apparatus is to perform quick speech recognition is obtained from the user, and to create the specification information with an N2 (> N1) number of the Gaussian distributions when an instruction that the electronic apparatus is to perform precise speech recognition is obtained from the user, and said standard model structure determining unit is configured to determine the number of the Gaussian mixture distributions according to the specification information created by said specification information creating unit.
  - 9. The standard model creating apparatus according to claim 1, wherein said standard model structure determining unit is configured to determine a Gaussian mixture distribution having an $M_f$ ($M_f \geq 1$) number of mixture distributions as the structure of the standard model, and said statistics estimating unit is configured to calculate at least one of a mixture weighting coefficient $\omega_{f(m)}$ ($m = 1, 2, \dots, M_f$), a mean value $\mu_{f(m)}$ ($m = 1, 2, \dots, M_f$), and a variance $\sigma_{f(m)}^{2}$ ($m = 1, 2, \dots, M_f$), which are the statistics of the standard model $\sum_{m = 1}^{M_f} \omega_{f(m)} f(x; \mu_{f(m)}, \sigma_{f(m)}^{2})$ (where $f(x; \mu_{f(m)}, \sigma_{f(m)}^{2})$ ($m = 1, 2, \dots, M_f$) represents a Gaussian distribution, and $x$ represents input data) represented by the Gaussian mixture distribution, so as to maximize or locally maximize a likelihood $\log P = \sum_{i = 1}^{N_g} \int_{-\infty}^{\infty} \log \left[ \sum_{m = 1}^{M_f} \omega_{f(m)} f(x; \mu_{f(m)}, \sigma_{f(m)}^{2}) \right] \left\{ \sum_{l = 1}^{L_{g(i)}} \upsilon_{g(i,l)} g(x; \mu_{g(i,l)}, \sigma_{g(i,l)}^{2}) \right\} dx$ of the standard model, with respect to $N_g$ ($N_g \geq 2$) reference models $\sum_{l = 1}^{L_{g(i)}} \upsilon_{g(i,l)} g(x; \mu_{g(i,l)}, \sigma_{g(i,l)}^{2})$ ($i = 1, 2, \dots, N_g$) (where $g(x; \mu_{g(i,l)}, \sigma_{g(i,l)}^{2})$ ($i = 1, 2, \dots, N_g$; $l = 1, 2, \dots, L_{g(i)}$) represents a Gaussian distribution, $L_{g(i)}$ ($i = 1, 2, \dots, N_g$) represents the number of mixture distributions of each of the reference models, $\upsilon_{g(i,l)}$ ($l = 1, 2, \dots, L_{g(i)}$) represents a mixture weighting coefficient, $\mu_{g(i,l)}$ ($l = 1, 2, \dots, L_{g(i)}$) represents a mean value, and $\sigma_{g(i,l)}^{2}$ ($l = 1, 2, \dots, L_{g(i)}$) represents a variance).
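
The likelihood in claim 9 integrates the log-density of the standard model against each reference mixture and sums over the reference models. As a rough illustration, the sketch below evaluates that quantity for one-dimensional mixtures with a midpoint-rule integral over a finite interval. The integration bounds, step count, function names, and `(weight, mean, variance)` tuple layout are all assumptions made for this example and are not specified in the application.

```python
import math


def gauss(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, mixture):
    """Density of a mixture given as (weight, mean, variance) tuples."""
    return sum(w * gauss(x, m, v) for (w, m, v) in mixture)


def log_likelihood(standard, references, lo=-12.0, hi=12.0, steps=4800):
    """Approximate log P = sum_i integral log(standard(x)) * reference_i(x) dx.

    The interval [lo, hi] is assumed wide enough to hold nearly all
    of the reference probability mass.
    """
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx  # midpoint of the k-th cell
        f = mixture_pdf(x, standard)
        ref = sum(mixture_pdf(x, g) for g in references)
        if f > 0.0 and ref > 0.0:  # skip cells where a density underflows to zero
            total += math.log(f) * ref * dx
    return total


# Hypothetical reference models (N_g = 2) and two candidate standard models (M_f = 1).
references = [
    [(1.0, -1.0, 1.0)],
    [(0.5, 1.0, 1.0), (0.5, 3.0, 1.0)],
]
matched = log_likelihood([(1.0, 0.5, 3.75)], references)
shifted = log_likelihood([(1.0, 2.0, 3.75)], references)
print(matched, shifted)
```

The first candidate uses the pooled mean and variance of the two reference models, so it scores higher than the second candidate, whose mean has been moved away from the pooled mean.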

6. A method of creating a standard model which shows an acoustic characteristic having a specific attribute and is used for a speech recognition device included in an electronic apparatus used by a user, the method using a probability model that expresses a frequency parameter showing an acoustic characteristic as an output probability, the method comprising:
- a reference model reading step of reading a plurality of reference models from a reference model storing unit which is configured to store a plurality of reference models which are probability models showing an acoustic characteristic having a specific attribute; and
  
  a standard model creating step of creating the standard model by calculating statistics of the standard model using statistics of the plurality of reference models that have been read, wherein the standard model creating step includes:
  
  a standard model structure determining sub-step of determining a structure of the standard model which is to be created, based on specification information regarding specifications of the electronic apparatus;
  
  an initial standard model creating sub-step of determining initial values of the statistics specifying the standard model whose structure has been determined; and
  
  a statistics estimating sub-step of estimating and calculating the statistics of the standard model so as to maximize or locally maximize a probability or a likelihood of the standard model, whose initial values have been determined, with respect to the plurality of reference models, wherein the plurality of reference models and the standard model are expressed using at least one Gaussian distribution, and said standard model structure determining unit is configured to determine a number of statistics of the standard model including at least a number of Gaussian mixture distributions as the structure of the standard model, based on the specification information indicating which of precision and speed is prioritized in speech recognition by the speech recognition device.
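
The three sub-steps of the method (structure determination, initial values, statistics estimation) can be sketched end to end for one-dimensional mixtures. This is only an illustration under several assumptions: the mapping from priority to component count, the evenly spaced initial means, and the grid-weighted EM update used to locally maximize the likelihood are all choices made for this example, and the application does not necessarily use any of them. All function names are hypothetical.

```python
import math


def gauss(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, mixture):
    return sum(w * gauss(x, m, v) for (w, m, v) in mixture)


def determine_structure(priority):
    # Assumed mapping: fewer Gaussians when speed is prioritized, more for precision.
    return {"speed": 1, "precision": 3}[priority]


def initial_model(num_components, references):
    # Assumed initialization: equal weights, unit variance, means spread
    # evenly across the range of the reference means.
    means = [m for mix in references for (_, m, _) in mix]
    lo, hi = min(means), max(means)
    if num_components == 1:
        centers = [(lo + hi) / 2.0]
    else:
        centers = [lo + (hi - lo) * k / (num_components - 1)
                   for k in range(num_components)]
    return [(1.0 / num_components, c, 1.0) for c in centers]


def estimate(standard, references, lo=-12.0, hi=12.0, steps=600, iterations=50):
    # Grid-weighted EM: each grid point acts as a sample whose weight is the
    # pooled reference density there, so no speech data is required.
    dx = (hi - lo) / steps
    xs = [lo + (k + 0.5) * dx for k in range(steps)]
    ws = [sum(mixture_pdf(x, g) for g in references) * dx for x in xs]
    total = sum(ws)
    for _ in range(iterations):
        acc = [[0.0, 0.0, 0.0] for _ in standard]
        for x, w in zip(xs, ws):
            parts = [wt * gauss(x, m, v) for (wt, m, v) in standard]
            norm = sum(parts)
            if norm <= 0.0:
                continue
            for j, p in enumerate(parts):
                r = w * p / norm  # weighted responsibility of component j
                acc[j][0] += r
                acc[j][1] += r * x
                acc[j][2] += r * x * x
        updated = []
        for s0, s1, s2 in acc:
            s0 = max(s0, 1e-300)  # guard against an empty component
            mean = s1 / s0
            var = max(s2 / s0 - mean * mean, 1e-6)  # variance floor
            updated.append((s0 / total, mean, var))
        standard = updated
    return standard


def build_standard_model(priority, references):
    num_components = determine_structure(priority)         # structure sub-step
    initial = initial_model(num_components, references)    # initial-value sub-step
    return estimate(initial, references)                   # estimation sub-step


references = [
    [(1.0, -1.0, 1.0)],
    [(0.5, 1.0, 1.0), (0.5, 3.0, 1.0)],
]
fast = build_standard_model("speed", references)
precise = build_standard_model("precision", references)
print(fast)
print(precise)
```

With `"speed"` the result is a single Gaussian, which is cheaper to evaluate at recognition time; with `"precision"` the result keeps three components and can follow the shape of the pooled reference density more closely.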

7. A program stored on a computer-readable medium which when executed causes a standard model creating apparatus to perform steps for creating a standard model which shows an acoustic characteristic having a specific attribute and is used for a speech recognition device included in an electronic apparatus used by a user, the program using a probability model that expresses a frequency parameter showing an acoustic characteristic as an output probability, the steps comprising:
- a reference model reading step of reading a plurality of reference models from a reference model storing unit which is configured to store a plurality of reference models which are probability models showing an acoustic characteristic having a specific attribute; and
  
  a standard model creating step of creating the standard model by calculating statistics of the standard model using statistics of the plurality of reference models that have been read, wherein the standard model creating step includes:
  
  a standard model structure determining sub-step configured to determine a structure of the standard model which is to be created, based on specification information regarding specifications of the electronic apparatus;
  
  an initial standard model creating sub-step of determining initial values of the statistics specifying the standard model whose structure has been determined; and
  
  a statistics estimating sub-step of estimating and calculating the statistics of the standard model so as to maximize or locally maximize a probability or a likelihood of the standard model, whose initial values have been determined, with respect to the plurality of reference models, wherein the plurality of reference models and the standard model are expressed using at least one Gaussian distribution, and said standard model structure determining unit is configured to determine a number of statistics of the standard model including at least a number of Gaussian mixture distributions as the structure of the standard model, based on the specification information indicating which of precision and speed is prioritized in speech recognition by the speech recognition device.

Current Assignee: Shinichi Yoshizawa
Original Assignee: Shinichi Yoshizawa
Inventors: YOSHIZAWA, Shinichi
Application Number: US12/499,302
Publication Number: US 20090271201A1
US Class Current: 704/255
CPC Class Codes: G10L 15/06 Creation of reference templ...
