Method for word spotting in continuous speech
Abstract
A digitized speech data channel is analyzed for the presence of words or phrases from a desired list. If the word is not present in the speech time series from the channel, there is a high probability of matching the input data to phonemic noise markers rather than to word models. This is accomplished through the use of a competitive algorithm wherein Hidden Markov Models (HMMs) of the desired words or phrases compete with an alternative HMM of a set of phonemes. The set of phonemes can be chosen to reduce the computer resources required for the channel analysis.
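The competition described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes discrete observation symbols, toy hand-picked probabilities, and the forward algorithm as the scoring method (the abstract does not say which scoring procedure is used). The names `forward_log_likelihood`, `WORD`, `FILLER`, and `spot_word` are hypothetical.

```python
import math

NEG_INF = float("-inf")

def to_log(rows):
    """Convert a nested list of probabilities to log-probabilities (0 -> -inf)."""
    if isinstance(rows[0], list):
        return [to_log(r) for r in rows]
    return [math.log(p) if p > 0 else NEG_INF for p in rows]

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, log_start, log_trans, log_emit):
    """Return log P(obs | HMM) using the forward algorithm in log space."""
    n = len(log_start)
    alpha = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + log_trans[p][s] for p in range(n)])
            + log_emit[s][o]
            for s in range(n)
        ]
    return log_sum_exp(alpha)

# Toy word model (assumed): two states, left to right, over symbols 0, 1, 2.
WORD = (
    to_log([1.0, 0.0]),
    to_log([[0.6, 0.4], [0.0, 1.0]]),
    to_log([[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]]),
)

# Toy alternative model (assumed): a loop over three phoneme-like states.
THIRD = 1.0 / 3.0
FILLER = (
    to_log([THIRD] * 3),
    to_log([[THIRD] * 3 for _ in range(3)]),
    to_log([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]),
)

def spot_word(obs):
    """Report the word only if its model outscores the phoneme alternative."""
    return forward_log_likelihood(obs, *WORD) > forward_log_likelihood(obs, *FILLER)

print(spot_word([0, 0, 1, 1]))  # segment shaped like the toy word
print(spot_word([2, 2, 2, 2]))  # segment the word model explains poorly
```

In this toy setup the word model wins only when the segment follows its expected symbol order; otherwise the phoneme loop absorbs the input, which is the role the abstract gives to the phonemic noise markers.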
8 Claims
1. In a speech recognition system, a subsystem for spotting pre-specified words, comprising:
a sub-word model analyzer using context independent phoneme models and having an input connected to a source of continuous speech;
a full-word model analyzer using triphone based models which may coincide with the context independent phoneme models, the full-word model analyzer having an input connected to said source of continuous speech;
said sub-word model analyzer having a first threshold output signal indicative of the presence of one or more said phoneme models;
said full-word model analyzer having a second threshold indicative of the probability of the presence of a pre-specified word, represented by a triphone based model; and
a pre-specified word detecting means having an input coupled to said sub-word model analyzer, for receiving said first threshold output, and coupled to said full-word model analyzer, for receiving said second threshold, and in response thereto, identifying a pre-specified word to be spotted, provided the probability of the presence of the pre-specified word, represented as a triphone model is greater than the probability of the presence of the phoneme model.
Dependent claims: 2, 3, 4

5. In a speech recognition system, a method for spotting pre-specified words, comprising the steps of:
analyzing with a sub-word model analyzer using context independent phoneme models, a source of continuous speech;
analyzing with a full-word model analyzer using triphone based models which may coincide with the context independent phoneme models, said source of continuous speech;
providing from said sub-word model analyzer a first threshold output signal indicative of the presence of one or more said phoneme models;
providing from said full-word model analyzer a second threshold output signal indicative of the probability of the presence of a pre-specified word, represented by a triphone based model;
detecting a pre-specified word with a pre-specified word detecting means, having an input coupled to said sub-word model analyzer, for receiving said first threshold signal, and coupled to said full-word model analyzer, for receiving said second threshold signal; and
identifying a pre-specified word in said source of continuous speech, provided the probability of the presence of the pre-specified word is greater than the probability of the presence of the phoneme model.
Dependent claims: 6, 7, 8

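The flow of the method steps in claim 5 can be sketched as a sliding-window loop in which two scorers examine the same speech segment and a detector compares their outputs. This is only an illustration under stated assumptions: frames are small integer symbols, the two scorers are toy stand-ins for the sub-word and full-word analyzers (no triphone or phoneme models are actually built), and the names `spot_words`, `Detection`, `toy_word_scorer`, and `toy_phoneme_scorer` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A scorer maps a window of frames to a log-probability-like score.
Scorer = Callable[[Sequence[int]], float]

@dataclass
class Detection:
    start: int            # index of the first frame in the window
    end: int              # index one past the last frame in the window
    word_score: float
    phoneme_score: float

def spot_words(frames: Sequence[int], word_scorer: Scorer,
               phoneme_scorer: Scorer, window: int, hop: int = 1) -> List[Detection]:
    """Score each window with both analyzers; keep windows where the word wins."""
    hits: List[Detection] = []
    for start in range(0, len(frames) - window + 1, hop):
        segment = frames[start:start + window]
        word_score = word_scorer(segment)
        phoneme_score = phoneme_scorer(segment)
        # Decision rule from the claim: identify the word only when its
        # probability exceeds that of the phoneme model.
        if word_score > phoneme_score:
            hits.append(Detection(start, start + window, word_score, phoneme_score))
    return hits

TEMPLATE = [0, 0, 1, 1]  # assumed symbol pattern of the toy keyword

def toy_word_scorer(segment: Sequence[int]) -> float:
    """Toy full-word score: high per-frame probability on template matches."""
    return sum(math.log(0.9 if s == t else 0.05) for s, t in zip(segment, TEMPLATE))

def toy_phoneme_scorer(segment: Sequence[int]) -> float:
    """Toy sub-word score: every frame is equally likely among three symbols."""
    return len(segment) * math.log(1.0 / 3.0)

stream = [2, 2, 0, 0, 1, 1, 2, 2]
hits = spot_words(stream, toy_word_scorer, toy_phoneme_scorer, window=len(TEMPLATE))
print([(h.start, h.end) for h in hits])  # → [(2, 6)]
```

Passing the scorers in as callables keeps the comparison step separate from how each analyzer computes its score, which mirrors the separation between the two analyzers and the detecting means in the claim.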
Specification