Method and System for Selectively Biased Linear Discriminant Analysis in Automatic Speech Recognition Systems
Abstract
A system and method are presented for selectively biased linear discriminant analysis in automatic speech recognition systems. Linear Discriminant Analysis (LDA) may be used to improve the discrimination between the hidden Markov model (HMM) tied-states in the acoustic feature space. The between-class and within-class covariance matrices may be biased based on the observed recognition errors of the tied-states, such as shared HMM states of the context dependent tri-phone acoustic model. The recognition errors may be obtained from a trained maximum-likelihood acoustic model utilizing the tied-states which may then be used as classes in the analysis.
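The biasing idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not state the biasing formula, so the weighting used here (each class's contribution scaled by `1 + error_rate`) is purely an assumption for illustration, and the names `biased_scatter` and `error_rate` are hypothetical.

```python
import numpy as np

def biased_scatter(features, labels, error_rate):
    """Return (S_b, S_w), each class's contribution scaled by a bias weight.

    ASSUMPTION: weight = 1 + error_rate[class]. The source text does not
    give the weighting, so this choice is illustrative only.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    dim = features.shape[1]
    global_mean = features.mean(axis=0)
    S_b = np.zeros((dim, dim))
    S_w = np.zeros((dim, dim))
    for c in np.unique(labels):
        x = features[labels == c]            # frames aligned to class c
        mean_c = x.mean(axis=0)
        weight = 1.0 + error_rate.get(c.item(), 0.0)
        centered = x - mean_c
        # Within-class scatter: spread of frames around their class mean.
        S_w += weight * (centered.T @ centered)
        # Between-class scatter: spread of class means around the global mean.
        diff = (mean_c - global_mean)[:, None]
        S_b += weight * len(x) * (diff @ diff.T)
    n = len(features)
    return S_b / n, S_w / n
```

With an empty `error_rate` mapping every weight is 1 and the function reduces to ordinary LDA scatter matrices, whose sum equals the total covariance of the data.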
32 Claims
1. A method for training an acoustic model using the maximum likelihood criteria, comprising the steps of:
a. performing a forced alignment of speech training data;
b. processing the training data and obtaining estimated scatter matrices, wherein said scatter matrices may comprise one or more of a between class scatter matrix and a within-class scatter matrix, from which mean vectors may be estimated;
c. biasing the between class scatter matrix and the within-class scatter matrix;
d. diagonalizing the between class scatter matrix and the within class scatter matrix and estimating eigen-vectors to produce transformed scatter matrices;
e. obtaining new discriminative features using the estimated vectors, wherein said vectors correspond to the highest discrimination in the new space;
f. training a new acoustic model based on said new discriminative features; and
g. saving said acoustic model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
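Steps (d) and (e) of claim 1 can be sketched as a simultaneous diagonalization followed by a projection. The sketch below whitens the within-class scatter and then eigen-decomposes the whitened between-class scatter, which is one standard way to diagonalize both matrices at once; the claim does not say which procedure is used, so this choice is an assumption. The function names `simultaneous_diagonalize` and `project` are hypothetical.

```python
import numpy as np

def simultaneous_diagonalize(S_b, S_w, n_keep):
    """Return (transform, eigenvalues) for the n_keep most discriminative axes.

    ASSUMPTION: S_w is symmetric positive definite, and whitening followed
    by an eigen-decomposition stands in for the unspecified procedure.
    """
    # Whiten S_w: after this step, whiten.T @ S_w @ whiten is the identity.
    w_vals, w_vecs = np.linalg.eigh(S_w)
    whiten = w_vecs / np.sqrt(w_vals)
    # Diagonalize S_b in the whitened space.
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ S_b @ whiten)
    # Keep the eigenvectors with the largest eigenvalues (highest discrimination).
    order = np.argsort(b_vals)[::-1][:n_keep]
    return whiten @ b_vecs[:, order], b_vals[order]

def project(features, transform):
    """Map feature vectors (one per row) into the discriminative space."""
    return np.asarray(features, dtype=float) @ transform
```

The returned transform diagonalizes both matrices: the within-class scatter becomes the identity and the between-class scatter becomes a diagonal matrix of the retained eigenvalues. The projected rows would then serve as the new features for training in step (f).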
16. A method for training an acoustic model, comprising the steps of:
a. performing a forced alignment of speech training data;
b. performing recognition on said training data and estimating error rates of each tied-state triphone;
c. processing the training data and obtaining one or more of an estimated scatter matrix from which a mean vector may be estimated;
d. biasing the one or more of an estimated scatter matrix;
e. performing diagonalization on one or more of an estimated scatter matrix and estimating a vector to produce one or more transformed scatter matrix;
f. obtaining new discriminative features using the transformed one or more of an estimated scatter matrix as a linear transformation of a vector;
g. training a new acoustic model; and
h. saving said acoustic model.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
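Step (b) of claim 16 can be sketched as a comparison between the forced-alignment labels and the recognizer output. The claim does not say how the error rate is measured, so the frame-by-frame comparison below is an assumption, and the function name `tied_state_error_rates` and the state labels in the usage note are hypothetical.

```python
from collections import Counter

def tied_state_error_rates(aligned, recognized):
    """Return {tied_state: fraction of its frames that were misrecognized}.

    ASSUMPTION: both inputs are per-frame tied-state labels of equal length,
    with `aligned` (from forced alignment) treated as the reference.
    """
    if len(aligned) != len(recognized):
        raise ValueError("label sequences must cover the same frames")
    totals = Counter(aligned)
    # Count, per reference state, the frames where the recognizer disagreed.
    errors = Counter(ref for ref, hyp in zip(aligned, recognized) if ref != hyp)
    return {state: errors[state] / totals[state] for state in totals}
```

For example, with reference labels `["s1", "s1", "s2", "s2"]` and recognized labels `["s1", "s2", "s2", "s2"]`, state `s1` gets an error rate of 0.5 and state `s2` gets 0.0. Rates of this kind are what the biasing in step (d) would consume.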
32. A system for training an acoustic model comprising:
a. means for performing a forced alignment of speech training data;
b. means for processing the training data and obtaining estimated scatter matrices, which may comprise one or more of a between class scatter matrix and a within-class scatter matrix, from which mean vectors may be estimated;
c. means for biasing the between class scatter matrix and the within-class scatter matrix;
d. means for diagonalizing the between class scatter matrix and the within class scatter matrix and estimating eigen-vectors to produce transformed scatter matrices;
e. means for obtaining new discriminative features using the transformed scatter matrices as a linear transformation of a super vector;
f. means for training a new acoustic model; and
g. means for saving said acoustic model.
Specification