Speech Recognition Based on a Multilingual Acoustic Model

US 20100131262A1
Filed: 11/25/2009
Published: 05/27/2010
Est. Priority Date: 11/27/2008
Status: Active Grant

7 Assignments

0 Petitions

Abstract

Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. Based on a criteria set, the processor replaces each of the probability distribution functions of the at least one second acoustic model with one of the probability distribution functions of the main acoustic model, and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model, to obtain at least one modified second acoustic model. The criteria set may be a distance measurement. The processor then combines the main acoustic model and the at least one modified second acoustic model to obtain the multilingual acoustic model.
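
The replacement and combination steps summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each model is assumed to consist of a codebook of diagonal-covariance Gaussians plus states that hold mixture weights over codebook indices, and the Euclidean distance between Gaussian means stands in for the criteria set. All names (`Gaussian`, `State`, `map_to_main_codebook`, `combine`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Gaussian:
    mean: tuple  # mean vector
    var: tuple   # diagonal covariance (assumption for this sketch)


@dataclass
class State:
    name: str
    weights: dict  # codebook index -> mixture weight


def mean_distance(a, b):
    # Assumed stand-in for the "criteria set": Euclidean distance between means.
    return math.dist(a.mean, b.mean)


def closest_index(g, codebook):
    # Index of the codebook Gaussian nearest to g.
    return min(range(len(codebook)), key=lambda i: mean_distance(g, codebook[i]))


def map_to_main_codebook(second_states, second_codebook, main_codebook):
    """Re-express each second-model state over the main codebook."""
    mapping = [closest_index(g, main_codebook) for g in second_codebook]
    modified = []
    for s in second_states:
        w = {}
        for idx, weight in s.weights.items():
            # Weights of Gaussians that map to the same main Gaussian are summed.
            w[mapping[idx]] = w.get(mapping[idx], 0.0) + weight
        modified.append(State(s.name, w))
    return modified


def combine(main_states, modified_second_states):
    # The combined model keeps only the main codebook and lists all states.
    return list(main_states) + list(modified_second_states)


main_codebook = [Gaussian((0.0, 0.0), (1.0, 1.0)), Gaussian((5.0, 5.0), (1.0, 1.0))]
second_codebook = [
    Gaussian((0.5, -0.2), (1.0, 1.0)),
    Gaussian((4.0, 6.0), (2.0, 2.0)),
    Gaussian((0.1, 0.3), (1.0, 1.0)),
]
main_states = [State("main_a", {0: 0.7, 1: 0.3})]
second_states = [State("second_a", {0: 0.5, 1: 0.25, 2: 0.25})]

modified = map_to_main_codebook(second_states, second_codebook, main_codebook)
multilingual = combine(main_states, modified)
```

In this sketch the second model's own Gaussians are discarded after mapping, so the combined model evaluates only the main codebook regardless of how many languages are added.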

30 Citations

40 Claims

1. A computer-implemented method for generating a multilingual acoustic model for use in a speech recognition system, comprising:
- providing to a processor from memory a main acoustic model including a set of probability distribution functions and a probabilistic state sequence model;
  
  providing to the processor from memory at least one second acoustic model including a set of probability distribution functions and a probabilistic state sequence model;
  
  in a computer process, replacing each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions from the main codebook and/or each state of the probabilistic state sequence model from the second acoustic model by a state of the probabilistic state sequence model from the main acoustic model based upon a criteria set to form a modified second acoustic model; and
  
  in a computer process, combining the main acoustic model and the at least one modified second acoustic model to form the multilingual acoustic model.
- Dependent claims (2, 3, 4, 10, 14, 16, 17, 18, 37, 39):
  - 2. A computer implemented method for generating a multilingual acoustic model according to claim 1 wherein the criteria set is a distance measurement.
  - 3. A computer-implemented method according to claim 1, wherein the probability distribution function is a Gaussian distribution.
  - 4. A computer implemented method according to claim 1, wherein the probabilistic state sequence model is a Hidden Markov Model.
  - 10. The computer implemented method according to claim 1, further comprising:
    - in a computer process, replacing each of the second probability distribution functions of the at least one second acoustic model by the respective closest one of the first probability distribution functions to obtain a first modified second acoustic model;
      
      in a computer process, replacing each of the second states of the second probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the first probabilistic state sequence model of the main acoustic model to obtain a second modified second acoustic model;
      
      in a computer process, weighting the first modified second acoustic model by a first weight;
      
      in a computer process, weighting the second modified second acoustic model by a second weight; and
      
      in a computer process, combining the first modified second acoustic model weighted by the first weight and the second modified second acoustic model weighted by the second weight and the main acoustic model to obtain the multilingual acoustic model.
  - 14. The computer-implemented method according to claim 1, wherein the criteria set is a distance measurement determined based on the Mahalanobis distance between first and second probability distribution functions.
  - 16. The computer implemented method according to claim 1, wherein the main acoustic model is modified by modifying the main acoustic model before combining it with the at least one modified second acoustic model to obtain the multilingual acoustic model, wherein the step of modifying the main acoustic model comprises adding at least one of the second probability distribution functions of the second acoustic model of the at least one second acoustic model to the main acoustic model.
  - 17. The computer implemented method according to claim 16, wherein a sub-set of the second probability distribution functions of the second acoustic model is added to the main acoustic model based on distances between the second and the first probability distribution functions.
  - 18. The computer implemented method according to claim 17, wherein the distances between the second and the first probability distribution functions are determined and at least one of the second probability distribution functions is added to the main acoustic model that exhibits a predetermined distance from one of the first probability distribution functions that is closest to this at least one of the second probability distribution functions.
  - 37. An electronic device that includes a multilingual acoustic model generated according to the method of claim 1.
  - 39. A speech recognition system that includes a multilingual acoustic model generated according to the method of claim 1.
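
Claim 14 names the Mahalanobis distance as the distance measurement. The claims do not spell out how a distance between two distributions is computed, so the sketch below makes an assumption: it measures the distance from the mean of a second-model Gaussian to each main-codebook Gaussian, scaled by that main Gaussian's diagonal covariance. The function names and the `(mean, var)` tuple layout are hypothetical.

```python
import math


def mahalanobis(x, mean, var):
    """Mahalanobis distance of point x from a Gaussian with diagonal covariance var."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))


def closest_main_gaussian(second_mean, main_codebook):
    """Index of the main-codebook Gaussian with the smallest Mahalanobis distance."""
    return min(
        range(len(main_codebook)),
        key=lambda i: mahalanobis(second_mean, *main_codebook[i]),
    )


# Each entry is (mean, diagonal variance); values are illustrative only.
main_codebook = [
    ((0.0, 0.0), (1.0, 1.0)),    # narrow Gaussian at the origin
    ((4.0, 0.0), (16.0, 16.0)),  # broad Gaussian further away
]
second_mean = (1.8, 0.0)

best = closest_main_gaussian(second_mean, main_codebook)
```

With these illustrative values the point is closer to the first Gaussian in plain Euclidean terms (1.8 versus 2.2), but the broad variance of the second Gaussian makes it the nearer one under the Mahalanobis distance, which is the practical difference between the two measures.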

5. A computer-implemented method for generating a speech recognizer comprising a multilingual acoustic model, comprising:
- providing to a processor from memory a main acoustic model including a set of probability distribution functions and a probabilistic state sequence model;
  
  providing to the processor from memory at least one second acoustic model including a set of probability distribution functions and a probabilistic state sequence model;
  
  in a computer process, determining mean vectors of states for the main acoustic model;
  
  in a computer process, determining new probabilistic state sequence models for the main acoustic model based on the determined mean vectors of states;
  
  in a computer process, replacing the second probabilistic state sequence model of the at least one second acoustic model by the closest new probabilistic state sequence model of the main acoustic model to obtain at least one modified second acoustic model; and
  
  in a computer process, combining the main acoustic model and the at least one modified second acoustic model to obtain the multilingual acoustic model.
- Dependent claims (6, 7, 8, 9, 11, 12, 13, 15, 38, 40):
  - 6. A computer-implemented method for generating a multilingual acoustic model according to claim 5 wherein the criteria set is a distance measurement.
  - 7. A computer-implemented method according to claim 5, wherein the probability distribution function is a Gaussian distribution.
  - 8. A computer implemented method according to claim 5, wherein the probabilistic state sequence model is a Hidden Markov Model.
  - 9. The computer-implemented method according to claim 5, wherein the at least one modified second acoustic model is obtained by:
    - in a computer process replacing each of the probability distribution functions of the at least one second acoustic model by the respective closest one of the probability distribution functions and/or each of the states of the second probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the probabilistic state sequence model of the main acoustic model to obtain the at least one modified second acoustic model.
  - 11. The computer-implemented method according to claim 9, further comprising:
    - in a computer process, replacing the second probabilistic state sequence model of the at least one second acoustic model by the closest probabilistic state sequence model of the main acoustic model to obtain a first modified second acoustic model;
      
      in a computer process, replacing each of the second probability distribution functions of the at least one second acoustic model by the respective closest one of the first probability distribution functions or replacing each of the second states of the second probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the first probabilistic state sequence model of the main acoustic model to obtain a second modified second acoustic model;
      
      in a computer process, weighting the first modified second acoustic model by a first weight;
      
      in a computer process, weighting the second modified second acoustic model by a second weight; and
      
      in a computer process, combining the first modified second acoustic model weighted by the first weight and the second modified second acoustic model weighted by the second weight and the main acoustic model to obtain the multilingual acoustic model.
  - 12. The computer implemented method according to claim 11, wherein the first weight is chosen between 0.4 and 0.6 and the second weight is between 0.4 and 0.6.
  - 13. The computer implemented method according to claim 9, further comprising:
    - in a computer process, replacing the second probabilistic state sequence model of the at least one second acoustic model by the closest probabilistic state sequence model of the main acoustic model to obtain a first modified second acoustic model;
      
      in a computer process, replacing each of the second probability distribution functions of the at least one second acoustic model by the respective closest one of the first probability distribution functions to obtain a second modified second acoustic model;
      
      in a computer process, replacing each of the second states of the second probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the first probabilistic state sequence model of the main acoustic model to obtain a third modified second acoustic model;
      
      in a computer process, weighting the first modified second acoustic model by a first weight;
      
      in a computer process, weighting the second modified second acoustic model by a second weight;
      
      in a computer process, weighting the third modified second acoustic model by a third weight; and
      
      in a computer process, combining the first modified second acoustic model weighted by the first weight, the second modified second acoustic model weighted by the second weight, the third modified second acoustic model weighted by the third weight and the main acoustic model to obtain the multilingual acoustic model.
  - 15. The computer implemented method according to claim 6, wherein the criteria set is a distance measurement that is determined based on Euclidean distances between first states of the first probabilistic state sequence model of the main acoustic model and second states of the second probabilistic state sequence model of the second acoustic model.
  - 38. An electronic device that includes a multilingual acoustic model generated according to the method of claim 5.
  - 40. A speech recognition system that includes a multilingual acoustic model generated according to the method of claim 5.
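
Claims 5 and 15 refer to mean vectors of states and to Euclidean distances between states. One possible reading, shown here purely as an assumption, is that a state's mean vector is the mixture-weighted average of its Gaussian means, and that each second-model state is matched to the main-model state whose mean vector is nearest. The helper names and data layout are hypothetical.

```python
import math


def state_mean(weights, codebook_means):
    """Mixture-weighted average of the Gaussian means used by a state."""
    dim = len(codebook_means[0])
    return tuple(
        sum(w * codebook_means[k][d] for k, w in weights.items())
        for d in range(dim)
    )


def closest_state(target_mean, candidate_means):
    """Name of the candidate state whose mean vector is nearest (Euclidean)."""
    return min(
        candidate_means,
        key=lambda name: math.dist(target_mean, candidate_means[name]),
    )


# Illustrative codebooks (means only) and states (codebook index -> weight).
main_codebook_means = [(0.0, 0.0), (4.0, 4.0)]
second_codebook_means = [(1.0, 1.0), (5.0, 3.0)]
main_states = {"m1": {0: 0.75, 1: 0.25}, "m2": {0: 0.25, 1: 0.75}}
second_states = {"s1": {0: 0.5, 1: 0.5}}

main_state_means = {
    name: state_mean(w, main_codebook_means) for name, w in main_states.items()
}
state_map = {
    name: closest_state(state_mean(w, second_codebook_means), main_state_means)
    for name, w in second_states.items()
}
```

Here `state_map` records, for every second-model state, which main-model state would replace it under this reading.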

19. A computer program product comprising a tangible computer readable medium having executable computer code thereon for generating a multilingual acoustic model, the computer code comprising:
- computer code for retrieving a main acoustic model including a plurality of probability distribution functions and a probabilistic state sequence model that includes first states;
  
  computer code for retrieving at least one second acoustic model including a plurality of second probability distribution functions and a second probabilistic state sequence model that includes states;
  
  computer code for replacing each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions of the main acoustic model and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model based on a criteria set to obtain at least one modified second acoustic model; and
  
  computer code for combining the main acoustic model and the at least one modified second acoustic model to obtain the multilingual acoustic model.
- Dependent claims (20, 21, 22):
  - 20. A computer program product according to claim 19 wherein the criteria set is a distance measurement.
  - 21. A computer program product according to claim 19, wherein the probability distribution function is a Gaussian distribution.
  - 22. A computer program product according to claim 19, wherein the probabilistic state sequence model is a Hidden Markov Model.

23. A computer program product comprising a tangible computer readable medium having executable computer code thereon for generating a speech recognizer comprising a multilingual acoustic model, the computer code comprising:
- computer code for retrieving a main acoustic model including first probability distribution functions and a probabilistic state sequence model having states;
  
  computer code for retrieving at least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states;
  
  computer code for determining mean vectors of states for the states of the probabilistic state sequence model of the main acoustic model;
  
  computer code for determining new probabilistic state sequence models for the main acoustic model based on the determined mean vectors of states;
  
  computer code for replacing the second probabilistic state sequence model of the at least one second acoustic model by a state sequence model of the main acoustic model based upon a criteria set to obtain at least one modified second acoustic model; and
  
  computer code for combining the main acoustic model and the at least one modified second acoustic model to obtain the multilingual acoustic model.
- Dependent claims (24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36):
  - 24. A computer program product according to claim 23 wherein the criteria set is a distance measurement.
  - 25. A computer program product according to claim 23, wherein the probability distribution function is a Gaussian distribution.
  - 26. A computer program product according to claim 23, wherein the probabilistic state sequence model is a Hidden Markov Model.
  - 27. The computer program product according to claim 23, wherein the at least one modified second acoustic model is obtained by:
    - computer code for replacing each of the probability distribution functions of the at least one second acoustic model by the respective closest one of the probability distribution functions and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the first probabilistic state sequence model of the main acoustic model to obtain the at least one modified second acoustic model.
  - 28. The computer program product according to claim 23, further comprising:
    - computer code for replacing each of the probability distribution functions of the at least one second acoustic model by the respective closest one of the first probability distribution functions to obtain a first modified second acoustic model;
      
      computer code for replacing each of the states of the probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the probabilistic state sequence model of the main acoustic model to obtain a second modified second acoustic model;
      
      computer code for weighting the first modified second acoustic model by a first weight;
      
      computer code for weighting the second modified second acoustic model by a second weight; and
      
      computer code for combining the first modified second acoustic model weighted by the first weight and the second modified second acoustic model weighted by the second weight and the main acoustic model to obtain the multilingual acoustic model.
  - 29. The computer program product according to claim 28, further comprising:
    - computer code for replacing the probabilistic state sequence model of the at least one second acoustic model by the closest probabilistic state sequence model of the main acoustic model to obtain a first modified second acoustic model;
      
      computer code for replacing each of the probability distribution functions of the at least one second acoustic model by the respective closest one of the probability distribution functions or replacing each of the states of the probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the probabilistic state sequence model of the main acoustic model to obtain a second modified second acoustic model;
      
      computer code for weighting the first modified second acoustic model by a first weight;
      
      computer code for weighting the second modified second acoustic model by a second weight; and
      
      computer code for combining the first modified second acoustic model weighted by the first weight and the second modified second acoustic model weighted by the second weight and the main acoustic model to obtain the multilingual acoustic model.
  - 30. The computer program product according to claim 29, wherein the first weight is chosen between 0.4 and 0.6 and the second weight is between 0.4 and 0.6.
  - 31. The computer program product according to claim 27, further comprising:
    - computer code for replacing the probabilistic state sequence model of the at least one second acoustic model by the closest probabilistic state sequence model of the main acoustic model to obtain a first modified second acoustic model;
      
      computer code for replacing each of the probability distribution functions of the at least one second acoustic model by the respective closest one of the probability distribution functions from the main acoustic model to obtain a second modified second acoustic model;
      
      computer code for replacing each of the states of the probabilistic state sequence model of the at least one second acoustic model with the respective closest state of the probabilistic state sequence model of the main acoustic model to obtain a third modified second acoustic model;
      
      computer code for weighting the first modified second acoustic model by a first weight;
      
      computer code for weighting the second modified second acoustic model by a second weight;
      
      computer code for weighting the third modified second acoustic model by a third weight; and
      
      computer code for combining the first modified second acoustic model weighted by the first weight, the second modified second acoustic model weighted by the second weight, the third modified second acoustic model weighted by the third weight and the main acoustic model to obtain the multilingual acoustic model.
  - 32. The computer program product according to claim 23, wherein the respective probability distribution functions are determined based on the Mahalanobis distance between the probability distribution functions from the main and second acoustic models.
  - 33. The computer program product according to claim 23, wherein the criteria set for selecting the probabilistic state sequence model of the main acoustic model is determined based on Euclidean distances between states of the probabilistic state sequence model of the main acoustic model and states of the probabilistic state sequence model of the second acoustic model.
  - 34. The computer program product according to claim 33, wherein the main acoustic model is modified by modifying the main acoustic model before combining it with the at least one modified second acoustic model to obtain the multilingual acoustic model, wherein the step of modifying the main acoustic model comprises adding at least one of the probability distribution functions of the second acoustic model to the main acoustic model.
  - 35. The computer program product according to claim 34, wherein a sub-set of the probability distribution functions of the second acoustic model is added to the main acoustic model based on distances between the probability distribution functions.
  - 36. The computer program product according to claim 35, wherein the distances between the probability distribution functions from the main and second acoustic models are determined and at least one of the probability distribution functions of the second acoustic model is added to the main acoustic model that exhibits a predetermined distance from one of the probability distribution functions of the main acoustic model that is closest to this at least one of the probability distribution functions of the second acoustic model.
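
Claims 16 to 18 and 34 to 36 describe extending the main acoustic model with selected probability distribution functions of the second model, chosen by their distance to the closest main-model function. The claims speak only of "a predetermined distance"; the sketch below assumes that a second-model Gaussian is added when its nearest main Gaussian is at least `threshold` away, and it compares against the original main codebook only. Euclidean distance between means, the function name, and the threshold value are all assumptions for illustration.

```python
import math


def extend_codebook(main_means, second_means, threshold):
    """Return the main codebook plus second-model Gaussians that are poorly covered.

    A second-model mean is added when its nearest main-codebook mean is at
    least `threshold` away (assumed selection rule).
    """
    extended = list(main_means)
    for m in second_means:
        nearest = min(math.dist(m, g) for g in main_means)
        if nearest >= threshold:
            extended.append(m)
    return extended


main_means = [(0.0, 0.0), (5.0, 5.0)]
second_means = [(0.5, 0.0), (10.0, 0.0), (5.0, 4.0)]

extended = extend_codebook(main_means, second_means, threshold=3.0)
```

Only the second-model Gaussian at `(10.0, 0.0)` is far enough from every main Gaussian to be added in this example; the other two would simply be replaced by their nearest main-codebook neighbors.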

Current Assignee
Cerence Operating Company (Cerence Inc.)
Original Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Inventors
Brueckner, Raymond; Gruhn, Rainer; Raab, Martin

Granted Patent

US 8,301,445 B2
US Class Current

704/8
CPC Class Codes

G10L 15/144 Training of HMMs

G10L 15/187 Phonemic context, e.g. pron...
