Speech recognition method using speaker cluster models
First Claim
1. A speech recognition method comprising:
- receiving a speech signal;
- recognizing the speech signal using a speaker cluster model obtained in a training phase, wherein the speaker cluster model is a collection of a plurality of cluster-dependent models, and a score of each candidate is calculated according to a score function which is defined by taking the dependency among the cluster-dependent models into account; and
- obtaining a final recognition result according to a decision rule based on the score of each candidate, wherein the training phase comprises building an initialization model, and adjusting parameters of at least two cluster-dependent models of the initialization model by using a discriminative training method to obtain the speaker cluster model, wherein the discriminative training method is implemented by using a minimum classification error as a training criterion, a discriminant function of the discriminative training method being defined in the same manner as the score function, and the score function is defined as:
[Equation not reproduced in the source text.]
wherein g_i(X; Γ) is the score function, X is a feature vector sequence of the speech signal, Γ represents an entire parameter set of the speaker cluster model, N is the number of cluster-dependent models, parameter sets corresponding to the N cluster-dependent models are Λ_1, Λ_2, . . . , Λ_N, M is the number of candidates to be classified, h_i(X; Λ_n) is a log-likelihood function defined only on a parameter set Λ_n, ξ is a positive weighting number, and w_n(X) is a cluster weighting function that indicates the degree to which the nth cluster-dependent model is used for recognition.
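The claim's equation does not survive in this text, so the following is only a sketch under an assumed form, not the patent's definition. It assumes each candidate's score is a weighted log-sum-exp of its cluster-dependent log-likelihoods, with the cluster weights playing the role of w_n(X) and the positive weighting number acting as a sharpness scale. The function names, the `scale` parameter, and the arg-max decision rule are hypothetical choices made for illustration.

```python
import math


def candidate_score(log_likelihoods, cluster_weights, scale=1.0):
    """Assumed score: (1/scale) * log(sum_n w_n * exp(scale * h_n)).

    log_likelihoods: one log-likelihood per cluster-dependent model.
    cluster_weights: non-negative weight per cluster (hypothetical w_n(X)).
    scale: positive weighting number (assumed to sharpen the combination).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Work in the log domain; clusters with zero weight contribute nothing.
    terms = [math.log(w) + scale * h
             for h, w in zip(log_likelihoods, cluster_weights) if w > 0]
    if not terms:
        raise ValueError("at least one cluster weight must be positive")
    peak = max(terms)  # subtract the peak for numerical stability
    return (peak + math.log(sum(math.exp(t - peak) for t in terms))) / scale


def recognize(per_candidate_log_likelihoods, cluster_weights, scale=1.0):
    """Score every candidate and pick the highest (assumed decision rule)."""
    scores = [candidate_score(h, cluster_weights, scale)
              for h in per_candidate_log_likelihoods]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Illustrative values only: 3 candidates, 2 cluster-dependent models.
example_h = [[-10.0, -12.0], [-9.0, -15.0], [-11.0, -11.5]]
example_w = [0.5, 0.5]
best_index, all_scores = recognize(example_h, example_w)
```

Under this assumed form, putting all of the weight on one cluster reduces the score to that cluster's log-likelihood, while spreading the weight lets several cluster-dependent models contribute to the same candidate.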
Abstract
In speaker-independent speech recognition, between-speaker variability is one of the major sources of recognition errors. A speaker cluster model is used to manage recognition problems caused by between-speaker variability. In the training phase, the score function is used as the discriminant function, and the parameters of at least two cluster-dependent models are adjusted through a discriminative training method to improve speech recognition performance.
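The claim names minimum classification error as the training criterion. As a rough illustration, the sketch below computes the commonly used smoothed classification-error loss for one training utterance: the correct candidate's score is compared with a soft maximum over the competing candidates' scores, and the difference is passed through a sigmoid. This is the textbook formulation, not necessarily the patent's exact one; `smoothing` and `slope` are hypothetical parameter names.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def mce_loss(scores, correct_index, smoothing=1.0, slope=1.0):
    """Smoothed classification error for one utterance.

    scores: discriminant (score) function value for each candidate.
    correct_index: index of the correct candidate.
    smoothing: positive constant; larger values make the soft maximum
        over competitors approach the single best competitor.
    slope: positive steepness of the sigmoid.
    """
    rivals = [s for i, s in enumerate(scores) if i != correct_index]
    if not rivals:
        raise ValueError("need at least one competing candidate")
    peak = max(rivals)  # subtract the peak for numerical stability
    mean_exp = sum(math.exp(smoothing * (s - peak)) for s in rivals) / len(rivals)
    rival_score = peak + math.log(mean_exp) / smoothing
    # Positive when the competitors outscore the correct candidate.
    misclassification = rival_score - scores[correct_index]
    return _sigmoid(slope * misclassification)


# Illustrative values only.
loss_when_correct = mce_loss([-9.0, -12.0], correct_index=0)
loss_when_wrong = mce_loss([-9.0, -12.0], correct_index=1)
```

Because the loss is a smooth function of the scores, its gradient with respect to the parameters of each cluster-dependent model can be followed by gradient descent, which is how such a criterion is typically used to adjust several models jointly.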
4 Claims
Claims 2, 3, and 4 depend on claim 1.
Specification