Acoustic model adaptation methods based on pronunciation variability analysis for enhancing the recognition of voice of non-native speaker and apparatus thereof
First Claim
1. An acoustic model adaptation method comprising:
creating monophone-based acoustic models by using training data by a native speaker;
expanding the monophone-based acoustic models to triphone-based acoustic models;
performing a pronunciation variation analysis for examining a variation pronunciation of a non-native speaker through pronunciation analysis of a non-native speaker's speech by using the triphone-based acoustic models; and
performing adaptation of an acoustic model by using the analyzed variation pronunciation so that the acoustic model may be adapted for the non-native speaker's speech,
wherein the performing a pronunciation variation analysis includes,
reducing a number of triphone-based acoustic models by using a state-tying scheme;
creating a speech recognition system which has been trained by a native speaker's speech by increasing a mixture density of the state-tied triphone-based acoustic models;
making the speech recognition system recognize a non-native speaker's speech, and then creating a monophone confusion matrix; and
obtaining the variation pronunciation by analyzing the monophone confusion matrix.
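The last two steps of the claim (building a monophone confusion matrix and analyzing it) can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the `min_rate` threshold, and the toy aligned phone pairs are assumptions made for illustration, and the claim itself does not state which criterion is used to read variation pronunciations off the matrix.

```python
from collections import Counter, defaultdict


def confusion_matrix(pairs):
    """Count how often each reference monophone was recognized as each monophone.

    pairs: iterable of (reference_phone, recognized_phone) tuples, assumed to
    come from aligning recognizer output against the reference transcription.
    """
    matrix = defaultdict(Counter)
    for ref, hyp in pairs:
        matrix[ref][hyp] += 1
    return matrix


def variation_pronunciations(matrix, min_rate=0.3):
    """Return substitutions whose relative frequency reaches min_rate.

    min_rate is a hypothetical threshold chosen for this sketch.
    """
    variants = {}
    for ref, row in matrix.items():
        total = sum(row.values())
        for hyp, count in row.items():
            if hyp != ref and count / total >= min_rate:
                variants.setdefault(ref, []).append((hyp, count / total))
    return variants


# Toy data: a non-native speaker often produces "p" where "f" is expected.
aligned = [
    ("f", "p"), ("f", "p"), ("f", "f"),
    ("v", "b"), ("v", "v"), ("v", "v"), ("v", "v"),
    ("s", "s"),
]
matrix = confusion_matrix(aligned)
variants = variation_pronunciations(matrix, min_rate=0.5)
```

With this toy data only the substitution of "p" for "f" reaches the threshold; the single "b" for "v" confusion falls below it and is ignored.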
Abstract
The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. In order to adapt acoustic models, first, pronunciation variations are examined by analyzing a non-native speaker's speech. Thereafter, based on variation pronunciation of a non-native speaker's speech, acoustic models are adapted in a state-tying step during a training process of acoustic models. When the present invention for adapting acoustic models and a conventional acoustic model adaptation scheme are combined, more-enhanced recognition performance can be obtained. The example embodiment of the present invention enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
8 Claims
6. An acoustic model adaptation method comprising:
creating monophone-based acoustic models by using training data by a native speaker;
expanding the monophone-based acoustic models to triphone-based acoustic models;
performing a pronunciation variation analysis for examining a variation pronunciation of a non-native speaker through pronunciation analysis of a non-native speaker's speech by using the triphone-based acoustic models; and
performing adaptation of an acoustic model by using the analyzed variation pronunciation so that the acoustic model may be adapted for the non-native speaker's speech,
wherein the performing adaption of an acoustic model by using the analyzed variation pronunciation so that the acoustic model may be adapted for the non-native speaker's speech includes,
state-tying the created triphone-based acoustic models according to whether there is pronunciation variation by the non-native speaker; and
increasing a mixture density of the state-tied triphone-based acoustic models.
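The triphone expansion and the variation-aware state-tying recited in claim 6 can be pictured with a toy sketch. It rests on loud assumptions: the left-center+right label notation, the "sil" boundary padding, the `variants` mapping, and the function names are all hypothetical, and pooling triphones by center phone is a deliberately simplified stand-in for state-tying, not the scheme the patent actually uses.

```python
def to_triphones(phones):
    """Expand a monophone sequence into context-dependent labels.

    Labels use an assumed left-center+right notation, with "sil" padded at
    both utterance boundaries.
    """
    padded = ["sil"] + list(phones) + ["sil"]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def center(triphone):
    """Return the center phone of a left-center+right label."""
    return triphone.split("-")[1].split("+")[0]


def tie_by_variation(triphones, variants):
    """Group triphones into tying classes keyed by center phone.

    variants: hypothetical dict mapping a target phone to the phone that
    non-native speakers tend to produce instead. A target and its variant
    share one class, so their models would share parameters.
    """
    group = {}
    for target, produced in variants.items():
        key = tuple(sorted((target, produced)))
        group[target] = key
        group[produced] = key
    classes = {}
    for tri in triphones:
        c = center(tri)
        classes.setdefault(group.get(c, (c,)), set()).add(tri)
    return classes


triphones = to_triphones(["f", "ay", "v"]) + to_triphones(["p", "ay"])
classes = tie_by_variation(triphones, {"f": "p"})
```

Here the triphones centered on "f" and "p" fall into one shared class because "f" is marked as having a variation pronunciation, while "ay" and "v" keep classes of their own.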
Specification