Speech recognition using discriminant features
First Claim
1. A method of facilitating speech recognition, said method comprising the steps of:
obtaining speech input data;
building a model for each feature of an original set of linguistic features, wherein the model reflects whether or not each feature is present;
ranking the linguistic features;
rebuilding the model for each of a preselected number N of the ranked linguistic features; and
compiling a confusion matrix for each feature of the original set of features subsequent to said step of building a model for each feature of an original set of features, wherein said compiling a confusion matrix comprises;
computing a score for each feature based on the likelihood of its presence in a frame of the speech input data, and calculating mutual information between truth and labels for each feature;
wherein the ranking comprises ranking the mutual information calculated in compiling the confusion matrix.
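The ranking step recited above can be sketched in a few lines. This is only an illustration, not the patented implementation: it assumes each feature's confusion matrix is a 2x2 table of frame counts indexed as [truth][label] (0 = absent, 1 = present), and the feature names and counts are invented for the example.

```python
import math

def mutual_information(confusion):
    """Mutual information (in bits) between truth and label for one feature.

    confusion[t][l] counts frames whose true value is t and whose assigned
    label is l (0 = absent, 1 = present). This layout is an assumption.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        return 0.0
    truth_marginal = [sum(row) / total for row in confusion]
    label_marginal = [sum(confusion[t][l] for t in range(2)) / total
                      for l in range(2)]
    mi = 0.0
    for t in range(2):
        for l in range(2):
            joint = confusion[t][l] / total
            if joint > 0:  # empty cells contribute nothing
                mi += joint * math.log2(
                    joint / (truth_marginal[t] * label_marginal[l]))
    return mi

def top_n_features(confusions, n):
    """Rank features by mutual information and keep the best n."""
    ranked = sorted(confusions,
                    key=lambda name: mutual_information(confusions[name]),
                    reverse=True)
    return ranked[:n]

# Hypothetical features with made-up counts, for illustration only.
confusions = {
    "voiced": [[450, 50], [40, 460]],    # labels mostly agree with truth
    "nasal": [[800, 100], [60, 40]],     # weak agreement
    "chance": [[250, 250], [250, 250]],  # labels independent of truth
}
selected = top_n_features(confusions, 2)
print(selected)
```

The models for the features in `selected` would then be the ones rebuilt; a feature whose labels are independent of the truth scores zero bits and falls to the bottom of the ranking.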
Abstract
Methods and arrangements for representing the speech waveform in terms of a set of abstract, linguistic distinctions in order to derive a set of discriminative features for use in a speech recognizer. By combining the distinctive feature representation with an original waveform representation, it is possible to achieve a reduction in word error rate of 33% on an automatic speech recognition task.
17 Claims
1. A method of facilitating speech recognition, said method comprising the steps of:
obtaining speech input data;
building a model for each feature of an original set of linguistic features, wherein the model reflects whether or not each feature is present;
ranking the linguistic features;
rebuilding the model for each of a preselected number N of the ranked linguistic features; and
compiling a confusion matrix for each feature of the original set of features subsequent to said step of building a model for each feature of an original set of features, wherein said compiling a confusion matrix comprises;
computing a score for each feature based on the likelihood of its presence in a frame of the speech input data, and calculating mutual information between truth and labels for each feature;
wherein the ranking comprises ranking the mutual information calculated in compiling the confusion matrix.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. An apparatus for facilitating speech recognition, said method comprising the steps of:
an input medium which obtains speech input data;
a first model builder which builds a model for each feature of an original set of linguistic features, wherein the model reflects whether or not each feature is present;
a ranking arrangement which ranks the linguistic features;
a second model builder which rebuilds the model for each of a preselected number N of the ranked linguistic features; and
a matrix compiler which compiles a confusion matrix for each feature of the original set of features subsequent to said step of building a model for each feature of an original set of features, wherein said matrix compiler is adapted to;
compute a score for each feature based on the likelihood of its presence in a frame of the speech input data, and calculate mutual information between truth and labels for each feature;
wherein said ranking arrangement is adapted to rank the mutual information calculated in compiling the confusion matrix.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
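As a rough illustration of what the matrix compiler does before any mutual information is calculated, the sketch below scores each frame for one feature and tallies a 2x2 confusion matrix. The claim does not say what form the model takes, so everything here is an assumption made for the example: a single one-dimensional Gaussian stands in for each of the "present" and "absent" models, the score is their log-likelihood ratio, a frame is labeled present when the score is positive, and the frame values are invented.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def presence_score(x, present, absent):
    """Log-likelihood ratio: positive when 'present' fits the frame better."""
    return log_gaussian(x, *present) - log_gaussian(x, *absent)

def compile_confusion(frames, truth, present, absent):
    """Tally confusion[t][l]: true value t versus assigned label l."""
    confusion = [[0, 0], [0, 0]]
    for x, t in zip(frames, truth):
        # Threshold of zero is an assumption, not taken from the claims.
        label = 1 if presence_score(x, present, absent) > 0 else 0
        confusion[t][label] += 1
    return confusion

# Hypothetical (mean, variance) pairs and frame values, for illustration only.
present = (2.0, 1.0)
absent = (-2.0, 1.0)
frames = [2.5, 1.0, -0.5, -3.0, 0.4, -1.2]
truth = [1, 1, 1, 0, 0, 0]

confusion = compile_confusion(frames, truth, present, absent)
print(confusion)
```

A real recognizer would score multidimensional frames with richer models; the point of the sketch is only the flow from per-frame score to label to confusion counts.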
17. A program storage device readable by computer, tangibly embodying a program of instructions executable by the computer to perform method steps for speech recognition, said method comprising the steps of:
obtaining speech input data;
building a model for each feature of an original set of linguistic features, wherein the model reflects whether or not each feature is present;
ranking the linguistic features;
rebuilding the model for each of a preselected number N of the ranked linguistic features; and
compiling a confusion matrix for each feature of the original set of features subsequent to said step of building a model for each feature of an original set of features, wherein said compiling a confusion matrix comprises;
computing a score for each feature based on the likelihood of its presence in a frame of the speech input data, and calculating mutual information between truth and labels for each feature;
wherein the ranking comprises ranking the mutual information calculated in compiling the confusion matrix.
Specification