Acoustic model adaptation methods based on pronunciation variability analysis for enhancing the recognition of voice of non-native speaker and apparatus thereof

US 8,515,753 B2
Filed: 03/30/2007
Issued: 08/20/2013
Est. Priority Date: 03/31/2006
Status: Expired due to Fees

Abstract

An example embodiment of the present invention provides an acoustic model adaptation method that enhances recognition performance for a non-native speaker's speech. To adapt the acoustic models, pronunciation variations are first examined by analyzing the non-native speaker's speech. The acoustic models are then adapted, based on the variation pronunciations found in that speech, in the state-tying step of acoustic model training. Combining this adaptation method with a conventional acoustic model adaptation scheme yields further-enhanced recognition performance. The example embodiment enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.

8 Claims

1. An acoustic model adaptation method comprising:
- creating monophone-based acoustic models by using training data by a native speaker;
  
  expanding the monophone-based acoustic models to triphone-based acoustic models;
  
  performing a pronunciation variation analysis for examining a variation pronunciation of a non-native speaker through pronunciation analysis of a non-native speaker's speech by using the triphone-based acoustic models; and
  
  performing adaptation of an acoustic model by using the analyzed variation pronunciation so that the acoustic model may be adapted for the non-native speaker's speech,
  
  wherein the performing a pronunciation variation analysis includes, reducing a number of triphone-based acoustic models by using a state-tying scheme;
  
  creating a speech recognition system which has been trained by a native speaker's speech by increasing a mixture density of the state-tied triphone-based acoustic models;
  
  making the speech recognition system recognize a non-native speaker's speech, and then creating a monophone confusion matrix; and
  
  obtaining the variation pronunciation by analyzing the monophone confusion matrix.
- Dependent claims (2, 3, 4, 5):
- - 2. The acoustic model adaptation method as claimed in claim 1, wherein the expanding the monophone-based acoustic models to the triphone-based acoustic models comprises a training the acoustic models by using the training data by the native speaker wherein the training the acoustic models by using the native speaker, wherein the training the acoustic models comprises:
    - locating all triphone-based acoustic models having a central phone (b) of a triphone (a−b+c) in a parent node of a decision tree;
      
      locating each of the triphone-based acoustic models, which has been located in the parent node, in a corresponding terminal node through a decision questionnaire; and
      
      tying the triphone-based acoustic models located in terminal nodes as one representative acoustic model.
  - 3. The acoustic model adaptation method as claimed in claim 1, wherein, in the monophone confusion matrix, a row array includes pronunciations which must be recognized, and a column array includes pronunciations recognized from a non-native speaker's speech.
  - 4. The acoustic model adaptation method as claimed in claim 1, wherein the obtaining the variation pronunciation by analyzing the monophone confusion matrix is achieved by taking an element having a largest value among elements of the confusion matrix.
  - 5. An apparatus for recognizing a non-native speaker's speech by using acoustic models which have been created according to the acoustic model adaptation method claimed in claim 1.
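
The confusion-matrix analysis described in claims 1, 3, and 4 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the reference and recognized monophones have already been aligned into pairs (the alignment step is not described in the claims), and it reads "taking an element having a largest value" as a per-row maximum, reporting a variation only when that maximum is off the diagonal. The function names and the sample phone labels are hypothetical.

```python
from collections import defaultdict


def build_confusion_matrix(pairs):
    """Count (reference, recognized) monophone pairs.

    Rows are the pronunciations that must be recognized; columns are the
    pronunciations actually recognized from the non-native speech.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for reference, recognized in pairs:
        matrix[reference][recognized] += 1
    return matrix


def variation_pronunciations(matrix):
    """Assumed reading of claim 4: take the largest entry in each row and
    report it as a variation only if it differs from the reference phone."""
    variations = {}
    for reference, row in matrix.items():
        most_frequent = max(row, key=row.get)
        if most_frequent != reference:
            variations[reference] = most_frequent
    return variations


# Hypothetical aligned pairs: (phone that should be recognized, phone recognized).
example_pairs = (
    [("IH", "IY")] * 3 + [("IH", "IH")] + [("S", "S")] * 2 + [("S", "SH")]
)
example_matrix = build_confusion_matrix(example_pairs)
example_variations = variation_pronunciations(example_matrix)
```

With this sample data, the row for `IH` is dominated by the column `IY`, so `IY` is reported as the variation pronunciation of `IH`, while `S` is mostly recognized correctly and yields no variation.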

6. An acoustic model adaptation method comprising:
- creating monophone-based acoustic models by using training data by a native speaker;
  
  expanding the monophone-based acoustic models to triphone-based acoustic models;
  
  performing a pronunciation variation analysis for examining a variation pronunciation of a non-native speaker through pronunciation analysis of a non-native speaker's speech by using the triphone-based acoustic models; and
  
  performing adaptation of an acoustic model by using the analyzed variation pronunciation so that the acoustic model may be adapted for the non-native speaker's speech,
  
  wherein the performing adaption of an acoustic model by using the analyzed variation pronunciation so that the acoustic model may be adapted for the non-native speaker's speech includes, state-tying the created triphone-based acoustic models according to whether there is pronunciation variation by the non-native speaker; and
  
  increasing a mixture density of the state-tied triphone-based acoustic models.
- Dependent claims (7, 8):
- - 7. The acoustic model adaptation method as claimed in claim 6, wherein, when there is no pronunciation variation by the non-native speaker, a state-tying process, which has been used in creating the speech recognition system having been trained by the native speaker's speech, is used for state-tying the created triphone-based acoustic models.
  - 8. The acoustic model adaptation method as claimed in claim 6, wherein, when there is pronunciation variation by the non-native speaker, the state-tying the created triphone-based acoustic models comprises:
    - locating all triphone-based acoustic models each of which has a variation pronunciation (b′) by a non-native speaker as a central phone (b′) thereof, as well as all triphone-based acoustic models each of which has a monophone to be state-tied as a central phone (b) thereof, in a parent node; and
      
      disposing each of the triphone-based acoustic models, which have been located in the parent node, in a corresponding terminal node, and tying the triphone-based acoustic models as one representative acoustic model.
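
The pooling step in claims 7 and 8 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented procedure: triphone labels use the ASCII form `a-b+c`, the decision questionnaire is reduced to a fixed list of hypothetical yes/no questions about the left and right contexts (a real decision tree would normally select questions from the training data), and all names and phone labels are invented for the example.

```python
from collections import defaultdict


def central_phone(triphone):
    """Return b from a label of the form 'a-b+c'."""
    return triphone.split("-")[1].split("+")[0]


def parent_node(triphones, phone, variations):
    """Pool the triphones whose central phone is `phone` (b) and, when a
    variation pronunciation (b') exists, those whose central phone is b'."""
    centers = {phone}
    if phone in variations:
        centers.add(variations[phone])
    return [t for t in triphones if central_phone(t) in centers]


def tie_states(pool, questions):
    """Place each pooled triphone in a terminal node keyed by its answers to
    the questions; the members of one node share a representative model."""
    terminal_nodes = defaultdict(list)
    for triphone in pool:
        left, rest = triphone.split("-")
        right = rest.split("+")[1]
        answers = tuple(question(left, right) for question in questions)
        terminal_nodes[answers].append(triphone)
    return dict(terminal_nodes)


# Hypothetical questionnaire: is the left / right context a vowel?
VOWELS = {"AA", "IH", "IY"}
QUESTIONS = [lambda left, right: left in VOWELS,
             lambda left, right: right in VOWELS]

TRIPHONES = ["S-IH+T", "AA-IH+T", "S-IY+T", "S-AA+T"]
pool_with_variation = parent_node(TRIPHONES, "IH", {"IH": "IY"})
pool_without_variation = parent_node(TRIPHONES, "IH", {})
tied = tie_states(pool_with_variation, QUESTIONS)
```

When `IY` is registered as the variation pronunciation of `IH`, the parent node also receives `S-IY+T`, which then lands in the same terminal node as `S-IH+T` and is tied with it. Without a variation, only the triphones centered on `IH` are pooled, which corresponds to the native-speaker tying process of claim 7.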

Current Assignee: Gwangju Institute of Science and Technology
Original Assignee: Gwangju Institute of Science and Technology
Inventors: Kim, Hong Kook; Oh, Yoo Rhee; Yoon, Jae Sam
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US12/225,801
Publication Number: US 20090119105A1
Time in Patent Office: 2,335 Days
Field of Search: 704/255, 704/251, 704/278, 704/276, 704/270, 704/254, 704/249, 704/244, 704/240, 704/237, 704/233, 704/235, 704/200, 700/1, 370/259, 706/11
US Class Current: 704/244
CPC Class Codes: G10L 15/07 (adaptation to the speaker)
