Adaptation of compressed acoustic models

US 7,499,857 B2
Filed: 05/15/2003
Issued: 03/03/2009
Est. Priority Date: 05/15/2003
Status: Expired due to Fees

Abstract

The present invention is used to adapt acoustic models, quantized in subspaces, using adaptation training data (such as speaker-dependent training data). The acoustic model is compressed into multi-dimensional subspaces. A codebook is generated for each subspace. An adaptation transform is estimated, and it is applied to codewords in the codebooks, rather than to the means themselves.
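The idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the toy quantizer (each distinct sub-vector becomes its own codeword, with no clustering), and the affine form of the transform are all assumptions made for illustration. The sketch shows only that the transform touches the codewords, while the table of codeword indices that stands in for the means is left unchanged.

```python
# Illustrative sketch only: names, shapes, and the affine transform are assumptions.

def split_into_subspaces(mean, dims_per_subspace):
    """Cut one mean vector into consecutive chunks, one per subspace."""
    return [tuple(mean[i:i + dims_per_subspace])
            for i in range(0, len(mean), dims_per_subspace)]


def build_codebooks(means, dims_per_subspace):
    """Build one codebook per subspace and index every mean into it.

    For simplicity each distinct sub-vector becomes its own codeword; a real
    quantizer would cluster sub-vectors into a smaller number of codewords.
    """
    n_sub = len(means[0]) // dims_per_subspace
    codebooks = [[] for _ in range(n_sub)]
    indices = []
    for mean in means:
        row = []
        for s, chunk in enumerate(split_into_subspaces(mean, dims_per_subspace)):
            if chunk not in codebooks[s]:
                codebooks[s].append(chunk)
            row.append(codebooks[s].index(chunk))
        indices.append(row)
    return codebooks, indices


def adapt_codeword(codeword, matrix, bias):
    """Apply an assumed affine transform A*c + b to one codeword."""
    return tuple(sum(a * c for a, c in zip(row, codeword)) + b
                 for row, b in zip(matrix, bias))


def adapt_codebooks(codebooks, transforms):
    """Transform every codeword; the means themselves are never visited."""
    return [[adapt_codeword(cw, A, b) for cw in book]
            for book, (A, b) in zip(codebooks, transforms)]


means = [(1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 5.0, 6.0), (7.0, 8.0, 3.0, 4.0)]
codebooks, indices = build_codebooks(means, dims_per_subspace=2)

# One hypothetical transform per subspace: scale by 2 and shift by 1.
transforms = [([[2.0, 0.0], [0.0, 2.0]], [1.0, 1.0])] * len(codebooks)
adapted = adapt_codebooks(codebooks, transforms)

# The index table is untouched; adapted means are read back through it.
adapted_means = [sum((adapted[s][i] for s, i in enumerate(row)), ())
                 for row in indices]
```

In this toy setup three four-dimensional means share four two-dimensional codewords, so four codewords are transformed and the three adapted means are read back through the unchanged index table.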

20 Claims

1. A method of adapting an acoustic model for use in a speech recognition engine, comprising:
- subspace coding the acoustic model by a computer to obtain a plurality of codebooks each including a plurality of codewords, the plurality of codebooks including at least one codebook per subspace; and
  
  adapting the codewords in the codebooks based on adaptation training data, by applying an adaptation transform to the codewords, regardless of whether the acoustic model is recomputed based on the adaptation training data.
  - 2. The method of claim 1 and further comprising:
    - prior to adapting, dividing the codewords in each codebook into a plurality of different classes.
  - 3. The method of claim 2 wherein adapting comprises:
    - applying the adaptation transform to the codewords based on which of the plurality of classes the codewords belong to.
  - 4. The method of claim 3 wherein dividing the codewords comprises:
    - building a regression tree corresponding to each codebook; and
      
      grouping the codewords in a given codebook into one of a plurality of regression classes by traversing a regression tree corresponding to the given codebook.
  - 5. The method of claim 4 wherein building a regression tree comprises:
    - building a linguistic regression tree.
  - 6. The method of claim 4 wherein building a regression tree comprises:
    - building a regression tree by iterative clustering of the codewords.
  - 7. The method of claim 3 wherein applying an adaptation transform comprises:
    - estimating an adaptation transform corresponding to each of the plurality of classes.
  - 8. The method of claim 1 wherein each codeword represents at least one Gaussian mean and wherein adapting a codeword includes adapting the Gaussian mean.
  - 9. The method of claim 8 wherein each codeword represents at least one Gaussian variance and wherein adapting a codeword includes adapting the Gaussian variance.
  - 10. The method of claim 1 wherein adapting comprises:
    - adapting the codewords based on speaker-dependent adaptation training data.

11. A computer implemented method of training an acoustic model in a speech recognizer, comprising:
- generating by the computer a subspace coded acoustic model having a plurality of codebooks, one codebook corresponding to each acoustic subspace into which the acoustic model is coded, each codebook having a plurality of codewords therein, each codeword representing at least one component of an acoustic characteristic of a modeled speech unit; and
  
  modifying the codewords based on adaptation training data without recomputing the acoustic model based on the adaptation training data.
  - 12. The computer implemented method of claim 11 wherein modifying comprises:
    - receiving the adaptation training data; and
      
      estimating a transform based on the adaptation training data.
  - 13. The computer implemented method of claim 12 wherein modifying comprises:
    - grouping the codewords in each codebook into one of a plurality of classes.
  - 14. The computer implemented method of claim 13 wherein estimating a transform comprises:
    - estimating a transform for each of the plurality of classes.
  - 15. The computer implemented method of claim 14 wherein grouping comprises:
    - building a regression tree corresponding to each codebook.
  - 16. The computer implemented method of claim 15 wherein grouping comprises:
    - traversing the regression tree to group the codewords in the corresponding codebook into regression classes.
  - 17. The computer implemented method of claim 16 wherein estimating a transform comprises:
    - estimating a transform for each regression class.
  - 18. The computer implemented method of claim 17 wherein modifying the codewords comprises:
    - applying a transform to a given codeword, the transform corresponding to a regression class in which the given codeword resides.

19. A computer storage medium storing instructions which, when executed, cause a computer to perform steps of:
- receiving a subspace coded acoustic model including a codebook corresponding to each subspace and a plurality of codewords in each codebook;
  
  receiving training data; and
  
  adapting the codewords in the codebooks based on the training data, by grouping the codewords in each codebook into classes, and adapting the codewords differently depending on a class to which the codewords belong.
  - 20. The computer storage medium of claim 19 wherein grouping the codewords comprises:
    - obtaining a regression tree for each codebook; and
      
      traversing the regression tree to divide the codewords in each codebook into regression classes.
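
The class-based adaptation recited in claims 2 to 7, 13 to 18, 19 and 20 can be sketched as follows. This is a hypothetical illustration, not the method of the specification: the `Node` layout, the single-dimension threshold split, and the per-class (scale, shift) transforms are assumptions chosen to keep the example short. It shows codewords of one codebook being grouped into regression classes by traversing a tree, and each codeword then being adapted with the transform of its own class.

```python
# Illustrative sketch only: tree layout, split rule, and transforms are assumptions.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Regression-tree node: internal nodes split, leaves name a class."""
    dim: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    regression_class: Optional[int] = None  # set only on leaves


def find_class(node: Node, codeword: Tuple[float, ...]) -> int:
    """Traverse the tree from the root to a leaf and return its class id."""
    while node.regression_class is None:
        node = node.left if codeword[node.dim] < node.threshold else node.right
    return node.regression_class


def adapt_codebook(codebook, tree, class_transforms):
    """Apply to each codeword the (scale, shift) pair of its regression class."""
    adapted = []
    for codeword in codebook:
        scale, shift = class_transforms[find_class(tree, codeword)]
        adapted.append(tuple(scale * c + shift for c in codeword))
    return adapted


# One tiny tree for one codebook: split on dimension 0 at 5.0.
tree = Node(dim=0, threshold=5.0,
            left=Node(regression_class=0),
            right=Node(regression_class=1))
codebook = [(1.0, 2.0), (7.0, 8.0), (3.0, 4.0)]

# Hypothetical per-class transforms; in practice these would be estimated
# from the adaptation training data.
class_transforms = {0: (1.0, 0.5), 1: (2.0, 0.0)}

classes = [find_class(tree, cw) for cw in codebook]
adapted = adapt_codebook(codebook, tree, class_transforms)
```

Here the first and third codewords fall in class 0 and are shifted by 0.5, while the second falls in class 1 and is scaled by 2. A full system would hold one such tree per codebook.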

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Gunawardana, Asela J.
Primary Examiner(s): Hudspeth; David R
Assistant Examiner(s): Sked; Matthew J
Application Number: US10/438,498
Publication Number: US 20040230424A1
Time in Patent Office: 2,119 Days
Field of Search: None
US Class Current: 704/255
CPC Class Codes:
- G10L 15/07: to the speaker
- G10L 15/144: Training of HMMs
- G10L 15/285: Memory allocation or algori...
