Method and apparatus for computing likelihood and hypothesizing keyword appearance in speech
Abstract
In a method for determining likelihood of appearance of keywords in a spoken utterance as part of a keyword spotting system of a speech recognizer, a new scoring technique is provided wherein a confidence score is computed as a probability of observing the keyword in a sequence of words given the observations. The corresponding confidence scores are the probability of the keyword appearing in any word sequence given the observations. In a specific embodiment, the technique involves hypothesizing a keyword whenever it appears in any of the "N-best" word lists with a confidence score that is computed by summing the likelihoods for all hypotheses that contain the keyword, normalized by dividing by the sum of all hypothesis likelihoods in the "N-best" list.
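The N-best embodiment described in the abstract can be illustrated with a minimal Python sketch. It assumes each hypothesis arrives as a word list paired with a single combined log-likelihood; the function name `nbest_keyword_confidence`, the input layout, and the toy numbers are hypothetical and do not come from the patent. The max-subtraction step is a common numerical-stability device, not something the abstract prescribes.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical input layout: (words, combined log-likelihood) per hypothesis.
NBestEntry = Tuple[List[str], float]


def nbest_keyword_confidence(nbest: Sequence[NBestEntry], keyword: str) -> float:
    """Sum the likelihoods of hypotheses containing `keyword`,
    divided by the sum of the likelihoods of all hypotheses."""
    if not nbest:
        return 0.0
    # Subtract the largest log-likelihood before exponentiating so that
    # very small likelihoods do not underflow; the ratio is unchanged.
    max_ll = max(ll for _, ll in nbest)
    total = 0.0
    with_keyword = 0.0
    for words, ll in nbest:
        weight = math.exp(ll - max_ll)
        total += weight
        if keyword in words:
            with_keyword += weight
    return with_keyword / total


# Toy N-best list (illustrative values only).
nbest = [
    (["call", "the", "bank"], math.log(0.5)),
    (["call", "a", "bank"], math.log(0.3)),
    (["tall", "tank"], math.log(0.2)),
]
score = nbest_keyword_confidence(nbest, "bank")
```

Because the score is a ratio of sums over the same list, it always lies between 0 and 1, which makes a single threshold comparable across utterances.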
139 Citations
14 Claims
1. In a computerized speech recognition system, an improvement in a method for determining confidence of an occurrence of a keyword in a spoken utterance forming word sequences, the method including the steps of obtaining a time series of observation feature vectors representing the spoken utterance, said time series being formed from a representation of acoustic speech input, and determining possible word sequences and corresponding likelihood scores for each of said possible word sequences for said observations the improvement comprising:
- computing for an application a confidence score for said keyword from probabilities that said keyword is in a sequence of words given said observation feature vectors, wherein said confidence score is computed as a summation over the word sequences containing the keyword of the product of the likelihood of the word sequence and the likelihood of the observations given the word sequence;
- comparing said confidence score to a threshold; and
- declaring detection of said keyword in said spoken utterance if said confidence score exceeds said threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
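The three steps recited in claim 1 (sum of products, threshold comparison, detection) can be sketched as follows. This is a linear-domain illustration under stated assumptions: the `Hypothesis` class, its field names, and the probability values are hypothetical, and a practical recognizer would more likely work with log-probabilities.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    words: List[str]
    lm_prob: float        # P(W): likelihood of the word sequence (assumed given)
    acoustic_prob: float  # P(Obs | W): likelihood of the observations given W


def keyword_score(hyps: List[Hypothesis], keyword: str) -> float:
    """Sum P(W) * P(Obs | W) over the word sequences containing the keyword."""
    return sum(h.lm_prob * h.acoustic_prob for h in hyps if keyword in h.words)


def detect_keyword(hyps: List[Hypothesis], keyword: str, threshold: float) -> bool:
    """Declare a detection when the score exceeds the threshold."""
    return keyword_score(hyps, keyword) > threshold


# Toy hypotheses (illustrative values only).
hyps = [
    Hypothesis(["transfer", "funds"], lm_prob=0.5, acoustic_prob=0.5),
    Hypothesis(["transfer", "fund"], lm_prob=0.25, acoustic_prob=0.5),
    Hypothesis(["transit", "funds"], lm_prob=0.25, acoustic_prob=0.25),
]
funds_detected = detect_keyword(hyps, "funds", threshold=0.3)
fund_detected = detect_keyword(hyps, "fund", threshold=0.3)
```

Here "funds" appears in two hypotheses whose products sum to 0.3125, so it clears the 0.3 threshold, while "fund" scores 0.125 and does not.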
12. A computerized keyword spotting system comprising:
- a word sequence search engine coupled to receive observations extracted from a speech signal, the search engine configured to produce at least one possible word sequence explaining said observations and a likelihood score corresponding to each of said possible word sequences; and
- a confidence score computer coupled to said word sequence search engine configured to generate a confidence score from probabilities that said keyword is in said sequences of words given said observations, wherein said confidence score computer produces said confidence score according to the expression:
##EQU5##
where:
P(Obs|W) is the acoustic HMM probability;
P(W) is the language model probability of W; and
(KW ∈ W) is the list of word sequences that contain the keyword.
Dependent claims: 13, 14
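The expression marked ##EQU5## in claim 12 is not reproduced in this text. One reading that is consistent with the summation recited in claim 1 and with the normalization described in the abstract is sketched below. It is an assumption offered for orientation, not the wording of the claim, and the left-hand symbol P(KW | Obs) is introduced here for illustration only.

```latex
P(KW \mid Obs) \;=\;
  \frac{\displaystyle\sum_{W \,:\, KW \in W} P(W)\, P(Obs \mid W)}
       {\displaystyle\sum_{W} P(W)\, P(Obs \mid W)}
```

Under this reading the numerator runs over the word sequences that contain the keyword and the denominator over all word sequences considered, such as the entries of an N-best list.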
Specification