Bubble splitting for compact acoustic modeling

US 7,328,154 B2
Filed: 08/13/2003
Issued: 02/05/2008
Est. Priority Date: 08/13/2003
Status: Expired due to Fees

Abstract

An improved method is provided for constructing compact acoustic models for use in a speech recognizer. The method includes: partitioning speech data from a plurality of training speakers according to at least one speech related criteria (i.e., vocal tract length); grouping together the partitioned speech data from training speakers having a similar speech characteristic; and training an acoustic bubble model for each group using the speech data within the group.
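
The three steps in the abstract (partition, group, train) can be illustrated with a minimal sketch. It is not the patented implementation: the speaker records, the `tol` threshold, and the use of a single one-dimensional Gaussian in place of a real acoustic model are all assumptions made for illustration. The three ranges follow the around-one, less-than-one, and greater-than-one split described in claim 2.

```python
from collections import defaultdict
from statistics import fmean, pvariance

# Hypothetical training data: (speaker id, VTLN warping factor, 1-D feature values).
TRAINING = [
    ("spk1", 0.92, [1.0, 1.2, 0.8]),
    ("spk2", 0.94, [1.1, 0.9]),
    ("spk3", 1.00, [2.0, 2.2]),
    ("spk4", 1.07, [3.0, 3.1, 2.9]),
]

def vtln_bin(factor, tol=0.03):
    """Assumed partition rule: below one, around one, or above one."""
    if factor < 1.0 - tol:
        return "below_one"
    if factor > 1.0 + tol:
        return "above_one"
    return "around_one"

def build_bubbles(training):
    """Partition by VTLN bin, pool each group's data, fit one Gaussian per group."""
    groups = defaultdict(list)
    for _speaker, factor, features in training:
        groups[vtln_bin(factor)].extend(features)
    # Maximum likelihood estimates of a Gaussian: sample mean and population variance.
    return {name: (fmean(x), pvariance(x)) for name, x in groups.items()}

bubbles = build_bubbles(TRAINING)
```

Here `bubbles` maps each group name to a `(mean, variance)` pair. A real recognizer would presumably train HMM-based models per group instead, but the flow of data from speakers to groups to per-group models is the same.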

13 Claims

1. A method for constructing acoustic models for use in a speech recognizer, comprising:
- partitioning speech data from a plurality of training speakers according to at least one speech related criteria, wherein the step of partitioning speech data further comprises partitioning the speech data into male group data and female group data by labeling the speech data according to gender of the training speakers during training, and further partitioning the male group data by vocal tract length normalization factor for only the male group, and partitioning the female group data by vocal tract length normalization factor for only the female group;
  
  grouping together the partitioned speech data from training speakers having similar speech characteristics, including gender and gender-specific vocal tract length normalization factor; and
  
  training an acoustic bubble model for each group using the speech data within the group.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
  - 2. The method of claim 1 further comprises grouping together speech data for training speakers having a vocal tract length normalizing factor around one, grouping together speech data for training speakers having a vocal tract length normalizing factor less than one, and grouping together speech data for training speakers having a vocal tract length normalizing factor greater than one.
  - 3. The method of claim 1 wherein the step of grouping the partitioned speech data further comprises grouping the speech data such that speech data for a given speaker is placed in two or more groups of speech data.
  - 4. The method of claim 1 wherein the step of training an acoustic bubble model further comprises applying maximum likelihood estimation to each group of speech data.
  - 5. The method of claim 1 wherein the step of training an acoustic bubble model further comprises applying a maximum a posteriori (MAP) estimation to each group of speech data.
  - 6. The method of claim 1 wherein the step of training an acoustic bubble model further comprises applying maximum likelihood linear regression (MLLR) to each group of speech data.
  - 7. The method of claim 1 further comprises normalizing the acoustic bubble models, thereby yielding a set of compact acoustic bubble models.
  - 8. The method of claim 7 wherein the step of normalizing the acoustic bubble models further comprises performing speaker adaptive training on each of the acoustic bubble models.
  - 9. The method of claim 7 wherein the step of normalizing the acoustic bubble models further comprises performing inverse transform speaker adaptive training on each of the acoustic bubble models.
  - 10. The method of claim 7 wherein the step of normalizing the acoustic bubble models further comprises performing speaker-normalized training on each of the acoustic bubble models, including performing a training cycle comprising a normalization-training-accumulation phase storing accumulators that serve as input to a synchronization phase of the training cycle.
  - 11. The method of claim 7 wherein the step of normalizing the acoustic bubble models further comprises performing normalized speaker adaptive training on each of the acoustic bubble models in which a normalization step is added in both training and decoding procedures of speaker adaptive training.
  - 12. The method of claim 1 further comprises:
    - receiving an unknown speech utterance;
      
      selecting an acoustic bubble model which most closely correlates to the unknown speech utterance; and
      
      decoding the unknown speech utterance using the selected acoustic bubble model.
  - 13. The method of claim 12 wherein the step of selecting an acoustic model further comprises selecting an acoustic bubble model using the speech related criteria used to partition the speech data.
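
The selection step in claims 12 and 13 can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: the bubble keys, the Gaussian parameters, and the feature values are hypothetical, and "most closely correlates" is read here as "highest log-likelihood", which the claims do not specify. Decoding with the selected model is left out.

```python
import math

# Hypothetical bubble models keyed by (gender, VTLN range); each is a
# 1-D Gaussian (mean, variance) standing in for a full acoustic model.
BUBBLES = {
    ("male", "below_one"): (1.0, 0.5),
    ("male", "above_one"): (2.0, 0.5),
    ("female", "below_one"): (4.0, 0.5),
    ("female", "above_one"): (5.0, 0.5),
}

def log_likelihood(frames, model):
    """Sum of Gaussian log-densities of the frames under one bubble model."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )

def select_by_likelihood(frames, bubbles):
    """Assumed reading of 'most closely correlates': highest likelihood."""
    return max(bubbles, key=lambda key: log_likelihood(frames, bubbles[key]))

def select_by_criteria(gender, vtln_factor):
    """Claim 13 style: reuse the partitioning criteria as the lookup key."""
    return (gender, "below_one" if vtln_factor < 1.0 else "above_one")

utterance = [4.1, 3.8, 4.3]  # hypothetical feature values
chosen = select_by_likelihood(utterance, BUBBLES)
```

`select_by_likelihood` scores the utterance against every bubble, whereas `select_by_criteria` needs an estimate of the speaker's gender and warping factor but avoids evaluating all models.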

Current Assignee: Sovereign Peak Ventures, LLC (Dominion Harbor Enterprises, LLC)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Mutel, Ambroise; Rigazio, Luca; Nguyen, Patrick
Primary Examiner(s): Hudspeth; David
Assistant Examiner(s): Rider; Justin W.

Application Number: US10/639,974
Publication Number: US 20050038655A1
Time in Patent Office: 1,637 Days
Field of Search: 704/234, 704/245
US Class Current: 704/245
CPC Class Codes:
G10L 15/063   Training
G10L 15/144   Training of HMMs
G10L 2015/0631   Creating reference template...
G10L 2015/0638   Interactive procedures
