Unsupervised incremental adaptation using maximum likelihood spectral transformation
Abstract
In a speech recognition system, a method of transforming speech feature vectors associated with speech data provided to the speech recognition system includes the steps of receiving likelihood of utterance information corresponding to a previous feature vector transformation, estimating one or more transformation parameters based, at least in part, on the likelihood of utterance information corresponding to a previous feature vector transformation, and transforming a current feature vector based on maximum likelihood criteria and/or the estimated transformation parameters, the transformation being performed in a linear spectral domain. The step of estimating the one or more transformation parameters includes the step of estimating convolutional noise N_i^α and additive noise N_i^β for each i-th component of a speech vector corresponding to the speech data provided to the speech recognition system.
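To make the abstract's per-component transformation concrete, the sketch below shows one way a feature vector could be corrected in the linear spectral domain. It rests on an assumption the text above does not state: that each observed component follows y_i = N_i^α · x_i + N_i^β, which is the usual reading since convolutional noise becomes multiplicative in the linear spectrum. The function name, the `floor` parameter, and the model itself are hypothetical, not taken from the patent.

```python
import numpy as np

def transform_linear_spectral(y, n_alpha, n_beta, floor=1e-10):
    """Invert an ASSUMED noise model y_i = n_alpha_i * x_i + n_beta_i.

    y       : observed linear-spectral feature vector (one value per component i)
    n_alpha : per-component convolutional (multiplicative) noise estimate
    n_beta  : per-component additive noise estimate
    floor   : hypothetical lower bound that keeps spectral values positive
    """
    y = np.asarray(y, dtype=float)
    n_alpha = np.asarray(n_alpha, dtype=float)
    n_beta = np.asarray(n_beta, dtype=float)
    # Subtract the additive term, then divide out the multiplicative term.
    x = (y - n_beta) / n_alpha
    # Spectral magnitudes cannot be negative, so clip at a small floor.
    return np.maximum(x, floor)

# Usage: corrupt a clean vector with known noise, then recover it.
clean = np.array([1.0, 2.0, 4.0])
alpha = np.array([2.0, 2.0, 2.0])
beta = np.array([0.5, 0.5, 0.5])
observed = alpha * clean + beta
recovered = transform_linear_spectral(observed, alpha, beta)
```

In this sketch the parameters are simply given. In the claimed method they are estimated from likelihood information produced by a previous transformation.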
25 Citations
15 Claims
1. A method of transforming speech feature vectors associated with speech data provided to a speech recognition system, the method comprising the steps of:
receiving likelihood of utterance information corresponding to a previous feature vector transformation;
estimating one or more transformation parameters as a function of the likelihood of utterance information corresponding to a previous feature vector transformation; and
transforming a current feature vector based on at least one of maximum likelihood criteria and the estimated one or more transformation parameters, the transformation being performed in a linear spectral domain;
wherein the step of estimating the one or more transformation parameters comprises the step of estimating convolutional noise N_i^α and additive noise N_i^β for each i-th component of a speech vector corresponding to the speech data provided to the speech recognition system.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An article of manufacture for transforming speech feature vectors associated with speech data provided to a speech recognition system, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
receiving likelihood of utterance information corresponding to a previous feature vector transformation;
estimating one or more transformation parameters as a function of the likelihood of utterance information corresponding to a previous feature vector transformation; and
transforming a current feature vector based on at least one of maximum likelihood criteria and the estimated transformation parameters, the transformation being performed in a linear spectral domain;
wherein the step of estimating the one or more transformation parameters comprises the step of estimating convolutional noise N_i^α and additive noise N_i^β for each i-th component of a speech vector corresponding to the speech data provided to the speech recognition system.
Dependent claims: 9, 10, 11, 12, 13, 14
15. Apparatus for transforming speech feature vectors associated with speech data provided to a speech recognition system, the apparatus comprising:
at least one processing device operative:
(i) to receive likelihood of utterance information corresponding to a previous feature vector transformation;
(ii) to estimate one or more transformation parameters based, at least in part, on the likelihood of utterance information corresponding to a previous feature vector transformation;
(iii) to transform a current feature vector based on at least one of maximum likelihood criteria and the estimated transformation parameters, the transformation being performed in a linear spectral domain; and
(iv) to estimate convolutional noise N_i^α and additive noise N_i^β for each i-th component of a speech vector corresponding to the speech data provided to the speech recognition system.
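The feedback loop recited in the independent claims (receive likelihood information from a previous transformation, estimate parameters from it, transform the current feature vector) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the noise model observed = α · clean + β, the diagonal Gaussian over log-spectral values standing in for the recognizer's utterance likelihood, the grid search used as the estimator, and all function names. The claims do not specify any of these.

```python
import numpy as np

def apply_transform(frames, n_alpha, n_beta, floor=1e-10):
    # ASSUMED model: observed = n_alpha * clean + n_beta, per spectral component.
    return np.maximum((frames - n_beta) / n_alpha, floor)

def log_likelihood(frames, mean, var):
    # Toy stand-in for the recognizer's utterance likelihood: a diagonal
    # Gaussian over log-spectral values. Returns one score per component.
    z = np.log(frames)
    return np.sum(-0.5 * (np.log(2 * np.pi * var) + (z - mean) ** 2 / var), axis=0)

def estimate_parameters(prev_frames, mean, var, alphas, betas):
    # Grid search: for each component i, keep the (alpha, beta) candidate
    # that scored the highest likelihood on the previous utterance.
    dim = prev_frames.shape[1]
    best_score = np.full(dim, -np.inf)
    best_alpha = np.ones(dim)
    best_beta = np.zeros(dim)
    for a in alphas:
        for b in betas:
            score = log_likelihood(apply_transform(prev_frames, a, b), mean, var)
            better = score > best_score
            best_score = np.where(better, score, best_score)
            best_alpha = np.where(better, a, best_alpha)
            best_beta = np.where(better, b, best_beta)
    return best_alpha, best_beta

def adapt_incrementally(utterances, mean, var, alphas, betas):
    # Each utterance is transformed with parameters estimated from the one
    # before it; the first utterance passes through unchanged.
    n_alpha = np.ones(mean.shape[0])
    n_beta = np.zeros(mean.shape[0])
    outputs = []
    for frames in utterances:
        outputs.append(apply_transform(frames, n_alpha, n_beta))
        n_alpha, n_beta = estimate_parameters(frames, mean, var, alphas, betas)
    return outputs

# Usage: two identical utterances distorted by a gain of 2 and no offset.
mean, var = np.zeros(2), np.ones(2)
clean = np.exp(np.array([[0.1, 0.1], [-0.1, -0.1]]))
noisy = 2.0 * clean
outs = adapt_incrementally([noisy, noisy], mean, var,
                           alphas=[0.5, 1.0, 2.0, 4.0], betas=[0.0, 0.1])
```

No transcripts are used anywhere in the loop, which is what makes the adaptation unsupervised. A real system would replace the grid search with whatever maximum likelihood estimator the specification describes.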
Specification