Method and apparatus for utterance verification
Abstract
A method and apparatus for utterance verification are provided for verifying a recognized vocabulary output from speech recognition. The apparatus for utterance verification includes a reference score accumulator, a verification score generator and a decision device. A log-likelihood score obtained from speech recognition is calculated by taking a logarithm of the value of the probability of one of the feature vectors of an input speech conditioned on one of the states of each model vocabulary. A verification score is generated based on the log-likelihood scores. The verification score is compared with a predetermined threshold value so as to reject or accept the recognized vocabulary.
16 Claims
1. A method for utterance verification adapted to verify a recognized vocabulary, wherein the recognized vocabulary is obtained by performing speech recognition on a feature vector sequence according to an acoustic model and model vocabulary database, wherein the feature vector sequence comprises feature vectors of a plurality of frames, wherein the acoustic model and model vocabulary database comprises a plurality of model vocabularies, wherein each of the model vocabularies comprises a plurality of states, and wherein the method for utterance verification comprises:
calculating a maximum reference score for each of the model vocabularies according to a log-likelihood score obtained from speech recognition, wherein the log-likelihood score obtained from speech recognition is calculated by taking a logarithm on a value of a probability of one of the feature vectors of the frames conditioned on one of the states of each model vocabulary, and wherein the maximum reference score is a summation of the maximum value of log-likelihood scores of the feature vector of each frame conditioned on each state of a certain model vocabulary;
calculating a first verification score according to an optimal path score output during the speech recognition and the maximum reference score; and
comparing the first verification score with a first predetermined threshold value, so as to reject or accept the recognized vocabulary.
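The three steps of claim 1 can be illustrated with a minimal Python sketch. The claim says only that the first verification score is calculated "according to" the optimal path score and the maximum reference score; the frame-normalized difference used below, the rule of accepting when the score is at or above the threshold, and every function name and number are assumptions made for illustration, not the formula of the patent.

```python
from typing import Sequence


def max_reference_score(loglik: Sequence[Sequence[float]]) -> float:
    """Sum, over frames, of the best state log-likelihood in each frame.

    loglik[t][s] is log p(feature vector of frame t | state s), where s
    ranges over the states of ONE model vocabulary.
    """
    return sum(max(frame_scores) for frame_scores in loglik)


def first_verification_score(optimal_path_score: float,
                             reference_score: float,
                             num_frames: int) -> float:
    # ASSUMPTION: per-frame difference between the two scores.
    # The claim does not specify how the two scores are combined.
    return (optimal_path_score - reference_score) / num_frames


def accept(verification_score: float, threshold: float) -> bool:
    # ASSUMPTION: accept when the score reaches the threshold.
    return verification_score >= threshold


# Hypothetical log-likelihoods: 3 frames, 2 states of one model vocabulary.
loglik = [[-1.0, -3.0],
          [-2.5, -0.5],
          [-4.0, -1.5]]

reference = max_reference_score(loglik)          # -1.0 + -0.5 + -1.5 = -3.0
score = first_verification_score(-4.5, reference, len(loglik))  # -0.5
print(reference, score, accept(score, threshold=-1.0))
```

Under these assumptions the score is never positive when the optimal path score contains only log emission and log transition terms, because no path can score better in a frame than the best state of that frame. A score close to zero then means the recognized path stays close to the per-frame best states.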
3. A method for utterance verification adapted to verify a recognized vocabulary, wherein the recognized vocabulary is obtained by performing speech recognition on a feature vector sequence according to an acoustic model and model vocabulary database, wherein the feature vector sequence comprises feature vectors of a plurality of frames, wherein the acoustic model and model vocabulary database comprises a plurality of model vocabularies, wherein each of the model vocabularies comprises a plurality of states, and wherein the method for utterance verification comprises:
calculating an overall maximum reference score according to a log-likelihood score obtained from speech recognition, wherein the log-likelihood score obtained from speech recognition is calculated by taking a logarithm on a value of a probability of one of the feature vectors of the frames conditioned on one of the states of each model vocabulary, and wherein the overall maximum reference score is a summation of the maximum value of log-likelihood scores of the feature vector of each frame conditioned on each state of each of the model vocabularies;
calculating a second verification score according to an optimal path score output during the speech recognition and the overall maximum reference score; and
comparing the second verification score with a second predetermined threshold value, so as to reject or accept the recognized vocabulary.
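Claim 3 differs from claim 1 in the scope of the maximum: in each frame it is taken over every state of every model vocabulary rather than over the states of a single one. The Python sketch below illustrates that difference. The dictionary layout, the vocabulary labels, the numbers, and the frame-normalized form of the second verification score are assumptions for illustration; the claim does not specify how the scores are combined.

```python
from typing import Mapping, Sequence

# loglik_by_vocab[v][t][s] = log p(feature vector of frame t | state s of vocabulary v)
LogLikTable = Sequence[Sequence[float]]


def overall_max_reference_score(loglik_by_vocab: Mapping[str, LogLikTable]) -> float:
    """Sum, over frames, of the best log-likelihood across ALL states of ALL vocabularies."""
    tables = list(loglik_by_vocab.values())
    num_frames = len(tables[0])  # every table is assumed to cover the same frames
    total = 0.0
    for t in range(num_frames):
        total += max(max(table[t]) for table in tables)
    return total


def second_verification_score(optimal_path_score: float,
                              overall_reference: float,
                              num_frames: int) -> float:
    # ASSUMPTION: per-frame difference; not taken from the claim text.
    return (optimal_path_score - overall_reference) / num_frames


# Hypothetical scores: 2 vocabularies, 2 frames, 2 states each.
loglik_by_vocab = {
    "yes": [[-1.0, -3.0], [-2.5, -0.5]],
    "no":  [[-2.0, -0.5], [-1.0, -4.0]],
}

overall = overall_max_reference_score(loglik_by_vocab)   # -0.5 + -0.5 = -1.0
score = second_verification_score(-3.0, overall, 2)      # -1.0
print(overall, score, score >= -1.5)  # threshold -1.5 is a made-up value
```

In this example the per-vocabulary maximum reference scores are both -1.5, while the overall score is -1.0. The overall score can never be lower than any per-vocabulary score, since its per-frame maximum is taken over a superset of states.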
9. An apparatus for utterance verification adapted to verify a recognized vocabulary output by a speech recognition device, wherein the recognized vocabulary is obtained by performing speech recognition on a feature vector sequence according to an acoustic model and model vocabulary database, wherein the feature vector sequence comprises feature vectors of a plurality of frames, wherein the acoustic model and model vocabulary database comprises a plurality of model vocabularies, wherein each of the model vocabularies comprises a plurality of states, and wherein the apparatus for utterance verification comprises:
a reference score accumulator coupled to the speech recognition device and adapted to calculate a maximum reference score for each of the model vocabularies according to a log-likelihood score obtained from the speech recognition device by taking a logarithm on a value of a probability of one of the feature vectors of the frames conditioned on one of the states of each model vocabulary, wherein the maximum reference score is a summation of the maximum value of log-likelihood scores of the feature vector of each frame conditioned on each state of a certain model vocabulary;
a verification score generator coupled to the reference score accumulator and adapted to calculate a first verification score according to an optimal path score output from the speech recognition device and the maximum reference score; and
a decision device coupled to the verification score generator and adapted to compare the first verification score with a first predetermined threshold value, so as to reject or accept the recognized vocabulary.
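The coupling of the three components in claim 9 can be pictured as a small pipeline in which the accumulator is fed one frame at a time. The Python sketch below is one possible arrangement, not the apparatus of the patent: the class and method names are hypothetical, the frame-by-frame update is an assumed design, and the score formula and accept rule are assumptions because the claim does not define them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ReferenceScoreAccumulator:
    """Running sum of the best state log-likelihood seen in each frame."""
    total: float = 0.0
    frames: int = 0

    def add_frame(self, state_logliks: Sequence[float]) -> None:
        # state_logliks holds log p(frame | state) for the states of one model vocabulary.
        self.total += max(state_logliks)
        self.frames += 1


class VerificationScoreGenerator:
    def score(self, optimal_path_score: float,
              accumulator: ReferenceScoreAccumulator) -> float:
        # ASSUMPTION: frame-normalized difference of the two scores.
        return (optimal_path_score - accumulator.total) / accumulator.frames


@dataclass
class DecisionDevice:
    threshold: float

    def accept(self, verification_score: float) -> bool:
        # ASSUMPTION: accept when the score is at or above the threshold.
        return verification_score >= self.threshold


# Hypothetical wiring: recognizer output -> accumulator -> generator -> decision.
accumulator = ReferenceScoreAccumulator()
for frame_scores in ([-1.0, -3.0], [-2.5, -0.5]):
    accumulator.add_frame(frame_scores)

score = VerificationScoreGenerator().score(-2.5, accumulator)
print(accumulator.total, score, DecisionDevice(threshold=-1.0).accept(score))
```

Updating the sum per frame means the accumulator needs only the current frame's state scores, so under this design it could run alongside decoding instead of after it.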
11. An apparatus for utterance verification adapted to verify a recognized vocabulary output by a speech recognition device, wherein the recognized vocabulary is obtained by performing speech recognition on a feature vector sequence according to an acoustic model and model vocabulary database, wherein the feature vector sequence comprises feature vectors of a plurality of frames, wherein the acoustic model and model vocabulary database comprises a plurality of model vocabularies, wherein each of the model vocabularies comprises a plurality of states, and wherein the apparatus for utterance verification comprises:
a reference score accumulator coupled to the speech recognition device and adapted to calculate an overall maximum reference score according to a log-likelihood score obtained from the speech recognition device by taking a logarithm on a value of a probability of one of the feature vectors of the frames conditioned on one of the states of each model vocabulary, wherein the overall maximum reference score is a summation of the maximum value of log-likelihood scores of the feature vector of each frame conditioned on each state of each of the model vocabularies;
a decision device coupled to the reference score accumulator and adapted to calculate a second verification score according to an optimal path score output by the speech recognition device and the overall maximum reference score;
a verification score generator coupled to the reference score accumulator and adapted to compare the second verification score with a second predetermined threshold value, so as to reject or accept the recognized vocabulary.
Specification