Method and apparatus for computing likelihood and hypothesizing keyword appearance in speech

US 5,842,163 A
Filed: 06/07/1996
Issued: 11/24/1998
Est. Priority Date: 06/21/1995
Status: Expired due to Term

Abstract

In a method for determining likelihood of appearance of keywords in a spoken utterance as part of a keyword spotting system of a speech recognizer, a new scoring technique is provided wherein a confidence score is computed as a probability of observing the keyword in a sequence of words given the observations. The corresponding confidence scores are the probability of the keyword appearing in any word sequence given the observations. In a specific embodiment, the technique involves hypothesizing a keyword whenever it appears in any of the "N-Best" word lists with a confidence score that is computed by summing the likelihoods for all hypotheses that contain the keyword, normalized by dividing by the sum of all hypothesis likelihoods in the "N-best" list.
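The N-best scoring described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Hypothesis` tuple layout, the function names, the example word lists, and the 0.6 threshold are all assumptions made for the example. Each hypothesis score is assumed to be the log of the combined likelihood P(Obs|W)·P(W).

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical representation: (words, log_likelihood), where
# log_likelihood = log P(Obs|W) + log P(W) for that word sequence.
Hypothesis = Tuple[Sequence[str], float]


def log_sum_exp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    if not values:
        return float("-inf")
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def keyword_confidence(nbest: List[Hypothesis], keyword: str) -> float:
    """Sum the likelihoods of hypotheses containing the keyword and
    divide by the sum of the likelihoods of all hypotheses in the list."""
    with_kw = [score for words, score in nbest if keyword in words]
    if not with_kw:
        return 0.0  # keyword never hypothesized in the N-best list
    all_scores = [score for _, score in nbest]
    return math.exp(log_sum_exp(with_kw) - log_sum_exp(all_scores))


# Illustrative N-best list with made-up likelihoods.
nbest = [
    (["call", "boston", "now"], math.log(0.5)),
    (["call", "austin", "now"], math.log(0.3)),
    (["tall", "boston", "cow"], math.log(0.2)),
]
confidence = keyword_confidence(nbest, "boston")
detected = confidence > 0.6  # threshold chosen arbitrarily for illustration
```

Working in the log domain is a design choice of this sketch; recognizer likelihoods are typically very small, so summing raw probabilities risks underflow.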

139 Citations

14 Claims

1. In a computerized speech recognition system, an improvement in a method for determining confidence of an occurrence of a keyword in a spoken utterance forming word sequences, the method including the steps of obtaining a time series of observation feature vectors representing the spoken utterance, said time series being formed from a representation of acoustic speech input, and determining possible word sequences and corresponding likelihood scores for each of said possible word sequences for said observations the improvement comprising:
- computing for an application a confidence score for said keyword from probabilities that said keyword is in a sequence of words given said observation feature vectors, wherein said confidence score is computed as a summation over the word sequences containing the keyword of the product of the likelihood of the word sequence and the likelihood of the observations given the word sequence;
  
  comparing said confidence score to a threshold; and
  
  declaring detection of said keyword in said spoken utterance if said confidence score exceeds said threshold.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
  - 2. In the method used in the speech recognition system according to claim 1 further comprising using all said corresponding likelihood scores for normalizing the confidence score of said keyword.
  - 3. In the method used in the speech recognition system according to claim 1 further comprising hypothesizing said keyword upon appearance of said keyword in any of said possible word sequences.
  - 4. In the method used in the speech recognition system according to claim 1, further comprising adding an ε multiplied by the acoustic likelihood per frame of said keyword, to provide a mechanism for breaking ties in confidence score.
  - 5. In the method used in the speech recognition according to claim 1, wherein said confidence score is computed according to the expression:
    - ##EQU4## where;
      
      P(Obs|W) is the acoustic HMM probability;
      
      P(W) is the language model probability;
      
      W: (KW ∈ NB_i) is the list of all N-Best word sequences that contain the keyword.
  - 6. In the method used in the speech recognition system according to claim 1 further comprising using all said corresponding likelihood scores for normalizing the confidence score of said keyword.
  - 7. In the method used in the speech recognition system according to claim 6 further comprising using said corresponding likelihood scores from sequences not containing said keyword for normalizing the confidence of occurrence of said keyword.
  - 8. In the method used in the speech recognition system according to claim 1 wherein each hypothesis contains timing information to allow identification of multiple occurrences of said keyword in said sequences, further comprising treating each occurrence of the same keyword in a single recognition hypotheses as a separate instance of a keyword hypothesis.
  - 9. In the method used in the speech recognition system according to claim 1 wherein each hypothesis contains timing information to allow identification of multiple occurrences of said keyword in said sequences, further comprising treating each occurrence of the same keyword in a single recognition hypotheses as a separate instance of a keyword hypothesis.
  - 10. In the method used in the speech recognition system according to claim 1 wherein each hypothesis contains timing information to allow identification of multiple occurrences of said keyword in said sequences, further comprising treating occurrences of said keyword in multiple recognition hypotheses, where said recognition hypotheses overlap in time as indicated by said timing information, as the same keyword hypothesis.
  - 11. In the method used in the speech recognition system according to claim 10, further comprising computing time of occurrence of said keyword using time alignments from said one word sequence having highest acoustic likelihood per frame score of said keyword.
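The timing-related claims (8 to 11) can be illustrated with a short sketch. It assumes hypothetical `Occurrence` and `KeywordHypothesis` types and a simple greedy merge over start-sorted occurrences; the claims do not specify a merging algorithm, so this is only one plausible reading. Occurrences that overlap in time across hypotheses are treated as the same keyword hypothesis, non-overlapping occurrences stay separate, and the reported time span is taken from the occurrence with the highest acoustic likelihood per frame.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Occurrence:
    """One appearance of the keyword in one recognition hypothesis (hypothetical type)."""
    start: float               # seconds
    end: float                 # seconds
    hyp_likelihood: float      # P(Obs|W) * P(W) of the containing hypothesis
    acoustic_per_frame: float  # acoustic likelihood per frame of the keyword


@dataclass
class KeywordHypothesis:
    start: float
    end: float
    likelihood_sum: float


def merge_occurrences(occurrences: List[Occurrence]) -> List[KeywordHypothesis]:
    """Greedily group occurrences whose time spans overlap."""
    groups = []  # each group tracks its overall extent, summed likelihood, best member
    for occ in sorted(occurrences, key=lambda o: o.start):
        if groups and occ.start < groups[-1]["span_end"]:
            group = groups[-1]
            group["span_end"] = max(group["span_end"], occ.end)
            group["likelihood_sum"] += occ.hyp_likelihood
            if occ.acoustic_per_frame > group["best"].acoustic_per_frame:
                group["best"] = occ
        else:
            groups.append(
                {"span_end": occ.end, "likelihood_sum": occ.hyp_likelihood, "best": occ}
            )
    # Report timing from the member with the best per-frame acoustic score.
    return [
        KeywordHypothesis(g["best"].start, g["best"].end, g["likelihood_sum"])
        for g in groups
    ]


# Made-up data: the first and third occurrences come from the same hypothesis.
occurrences = [
    Occurrence(1.00, 1.40, 0.5, -4.0),
    Occurrence(1.05, 1.45, 0.2, -3.5),
    Occurrence(3.00, 3.40, 0.5, -4.2),
]
merged = merge_occurrences(occurrences)
```

Because merging is chained through the group's overall extent, two occurrences that do not overlap each other directly can still end up in one group; a stricter pairwise overlap test would be an equally valid choice.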

12. A computerized keyword spotting system comprising:
- a word sequence search engine coupled to receive observations extracted from a speech signal, the search engine configured to produce at least one possible word sequence explaining said observations and a likelihood score corresponding to each of said possible word sequences; and
  
  a confidence score computer coupled to said word sequence search engine configured to generate a confidence score from probabilities that said keyword is in said sequences of words given said observations, wherein said confidence score computer produces said confidence score according to the expression;
  
  ##EQU5## where;
  
  P(Obs|W) is the acoustic HMM probability;
  
  P(W) is the language model probability;
  
  W: (KW ∈ W) is the list of word sequences that contain the keyword.
- Dependent claims (13, 14):
  - 13. The keyword spotting system according to claim 12, wherein NormalizingFactor is the expression:
    - ##EQU6##
  - 14. The keyword spotting system according to claim 12, wherein NormalizingFactor is a numerical constant.
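
The equation images referenced by the ##EQU…## placeholders are not recoverable from this text. As a hedged reading only, one form consistent with the wording of claim 1, the abstract, and claims 13 and 14 would be the following, where the symbol Conf(KW) is a name introduced here for illustration:

```latex
\[
\mathrm{Conf}(KW) \;=\;
\frac{\displaystyle\sum_{W \,:\, KW \in W} P(\mathrm{Obs}\mid W)\, P(W)}
     {\mathrm{NormalizingFactor}},
\qquad
\mathrm{NormalizingFactor} \;=\; \sum_{W} P(\mathrm{Obs}\mid W)\, P(W)
\]
```

Under this reading, the second expression corresponds to normalizing by the sum over all hypothesized word sequences, as the abstract describes for the N-best list, while claim 14 would instead replace NormalizingFactor with a numerical constant.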

Current Assignee: SRI International, Inc.
Original Assignee: SRI International, Inc.
Inventors: Weintraub, Mitchel
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US08/660,144
Time in Patent Office: 900 Days
Field of Search: 395/2.4, 395/2.45, 395/2.48, 395/2.49, 395/2.65, 395/2.66, 395/2.64, 704/231, 704/236, 704/239, 704/240, 704/256, 704/255, 704/257
US Class Current: 704/240
CPC Class Codes:
G10L 15/10 using distance or distortio...
G10L 2015/088 Word spotting
