Keyword/non-keyword classification in isolated word speech recognition
Abstract
A two-pass classification system and method that post-processes HMM scores with additional confidence scores to derive a value that may be applied to a threshold on which a keyword versus non-keyword determination may be based. The first stage comprises Generalized Probabilistic Descent (GPD) analysis, which uses feature vectors of the spoken words and the HMM segmentation information (developed by the HMM detector during processing) as inputs to develop a first set of confidence scores through a linear combination (a weighted sum) of the feature vectors of the speech. The second stage comprises a linear discrimination method that combines the HMM scores and the confidence scores from the GPD stage with a weighted sum to derive a second confidence score. The output of the second stage may then be compared to a predetermined threshold to determine whether the spoken word or words include a keyword.
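The two-pass decision described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the array layouts, and the use of a single stacked weight vector in the second stage are all assumptions made for illustration, and the weights are taken as already trained.

```python
import numpy as np

def first_stage_confidences(disc_vector, first_weights):
    """First pass: one confidence score per keyword, each a weighted sum.

    disc_vector   -- 1-D array built from speech features and segmentation
                     information (layout is an assumption of this sketch)
    first_weights -- (n_keywords, len(disc_vector)) array of trained weights
    """
    return first_weights @ disc_vector

def second_stage_confidence(hmm_scores, confidences, second_weights):
    """Second pass: weighted sum of HMM scores and first-pass confidences."""
    combined = np.concatenate([hmm_scores, confidences])
    return float(second_weights @ combined)

def is_keyword(hmm_scores, disc_vector, first_weights, second_weights, threshold):
    """Return (decision, score); the decision is True when score exceeds threshold."""
    confidences = first_stage_confidences(disc_vector, first_weights)
    score = second_stage_confidence(hmm_scores, confidences, second_weights)
    return score > threshold, score

# Hypothetical toy values for two keywords.
hmm_scores = np.array([1.0, 2.0])
disc_vector = np.array([1.0, 0.0, 2.0])
first_weights = np.array([[1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0]])
second_weights = np.array([0.5, 0.5, 1.0, 1.0])
decision, score = is_keyword(hmm_scores, disc_vector,
                             first_weights, second_weights, threshold=4.0)
```

Both passes reduce to dot products at decision time, so the post-processing adds little cost on top of the HMM detector itself.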
7 Claims
1. A method to establish whether a speech signal comprising digitized speech represents a keyword, said method comprising the steps of:
- transforming said digitized speech signal into feature vectors;
- processing said feature vectors in a Hidden Markov Model (HMM) keyword detector, said (HMM) keyword detector having output signals representing speech segmentation information and signals representing scores of a set of keywords compared to said digitized speech signal;
- forming a discriminating vector by deriving mean vectors from said feature vectors and concatenating said mean vectors with said segmentation information;
- non-linearly processing said discriminating vector to derive a first set of weighting factors, and linearly combining said feature vectors and said discriminating vector using said first set of weighting factors to develop a first set of confidence scores;
- processing said first set of confidence scores and said signals representing keyword scores from said HMM keyword detector with a second weighting factor to develop a second confidence score; and
- comparing said second confidence score to a threshold to determine whether a keyword has been detected.
Dependent claims: 2, 3
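The step of forming a discriminating vector in claim 1 might look something like the following Python sketch. The claim does not say how the segmentation information is encoded, so this example assumes it is given as half-open frame ranges, one per segment, and that segment durations are what get concatenated with the mean vectors. The function name and argument names are hypothetical.

```python
import numpy as np

def discriminating_vector(features, boundaries):
    """Concatenate per-segment mean feature vectors with segment durations.

    features   -- (n_frames, n_dims) array of feature vectors
    boundaries -- list of (start, end) half-open frame ranges, one per
                  segment (assumed encoding of the segmentation information)
    """
    # Mean feature vector over the frames of each segment.
    means = [features[start:end].mean(axis=0) for start, end in boundaries]
    # Segment durations in frames stand in for the segmentation information.
    durations = np.array([end - start for start, end in boundaries], dtype=float)
    return np.concatenate(means + [durations])

# Hypothetical toy input: four frames of two-dimensional features, two segments.
features = np.arange(8, dtype=float).reshape(4, 2)
vec = discriminating_vector(features, [(0, 1), (1, 4)])
```

With S segments and D feature dimensions, the resulting vector has S * D + S entries under this assumed encoding.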
4. A keyword detection apparatus that determines whether a digitized speech signal includes one of a preselected plurality of keywords, said apparatus comprising:
- means for receiving input signals representing digitized speech and developing a plurality of signals representing feature vectors of said digitized speech;
- means responsive to said input signals and said signals representing feature vectors of said digitized speech for developing segmentation information regarding said speech signals and a plurality of HMM keyword scores by comparing said speech signals to each of said preselected plurality of keywords;
- means for receiving said feature vectors and said segmentation information and combining them to determine a first set of confidence scores;
- means for receiving said HMM keyword scores and said first confidence scores and combining them to determine a second confidence score; and
- means for comparing said second confidence score against a threshold value for determining whether the keyword having the highest score is present in said input signals.
Dependent claims: 5, 6
7. A method to establish whether a speech signal comprising digitized speech represents a keyword, said method comprising the steps of:
- processing said speech signal into a plurality of feature vectors;
- processing said speech signal by a Hidden Markov Model (HMM) keyword detector, said HMM keyword detector developing signals representing speech segmentation information and signals representing keyword scores of a set of keywords compared to said speech signal;
- forming a discriminating vector by deriving mean vectors from said feature vectors and concatenating said mean vectors with said segmentation information;
- non-linearly processing said discriminating vectors to develop a first plurality of weighting factors using general probabilistic descent training and linearly combining said feature vectors and said discriminating vector using said first plurality of weighting factors to develop a first set of confidence scores;
- processing said first confidence scores and said signals representing keyword scores from said HMM keyword detector with second weighting factors to develop a second confidence score, said second weighting factors being derived using Fisher's linear discrimination training; and
- comparing said second confidence score to a threshold to determine whether a keyword has been detected.
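Claim 7 derives the second weighting factors with Fisher's linear discrimination training. As a rough illustration, the textbook two-class Fisher discriminant picks the weight vector w = Sw^-1 (m1 - m0), where m1 and m0 are the class means and Sw is the within-class scatter matrix. The Python sketch below applies that standard formula; treating each training sample as a vector of HMM keyword scores and first-stage confidence scores is an assumption, as are the function name and the toy data.

```python
import numpy as np

def fisher_weights(keyword_samples, nonkeyword_samples):
    """Two-class Fisher linear discriminant: w = Sw^-1 (m1 - m0).

    Each row of the inputs is one training sample, assumed here to hold
    HMM keyword scores and first-stage confidence scores.
    """
    m1 = keyword_samples.mean(axis=0)
    m0 = nonkeyword_samples.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    d1 = keyword_samples - m1
    d0 = nonkeyword_samples - m0
    s_within = d1.T @ d1 + d0.T @ d0
    # Solve Sw w = (m1 - m0) rather than forming the inverse explicitly.
    return np.linalg.solve(s_within, m1 - m0)

# Hypothetical toy training data with two features per sample.
keyword_samples = np.array([[2.0, 1.0], [3.0, 2.0], [2.0, 2.0], [3.0, 1.0]])
nonkeyword_samples = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
w = fisher_weights(keyword_samples, nonkeyword_samples)
```

Projecting a sample onto w gives a scalar that separates the two classes as well as a linear projection can under Fisher's criterion, and that scalar can then be compared to a threshold. The sketch assumes the scatter matrix is non-singular.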
Specification