Linear discriminant based sound class similarities with unit value normalization
First Claim
1. A speech recognition processor for processing an input speech utterance in a speech recognition system, comprising:
a spectral measure module receptive of the input speech utterance for computing spectral measures of the input speech utterance for predetermined time frames;
a time spectral pattern stage for concatenating a plurality of successive spectral measures for generating a spectral pattern vector;
a linear discriminant module for computing an initial raw similarity value for each of a plurality of sound classes by computing the dot product of a linear discriminant vector with the time spectral pattern vector;
a normalization module which accesses normalized values computed based upon training utterances, said normalization module finding corresponding normalized values for each said initial raw similarity value to provide a normalized similarity value and concatenating normalized similarity values to form a similarity vector; and
a word matcher module for comparing said similarity vector with pre-stored reference vectors.
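The first three elements of the claim (frame-level spectral measures, concatenation into a time spectral pattern vector, and a dot product per sound class) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `span` parameter, the sliding one-frame step, and the toy numbers are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Sequence


def time_spectral_patterns(frames: Sequence[Sequence[float]], span: int) -> List[List[float]]:
    """Concatenate `span` successive frame-level spectral measures into one vector.

    Assumption: patterns are taken at every frame position (step of one frame).
    """
    if span < 1:
        raise ValueError("span must be at least 1")
    patterns: List[List[float]] = []
    for start in range(len(frames) - span + 1):
        vec: List[float] = []
        for frame in frames[start:start + span]:
            vec.extend(frame)
        patterns.append(vec)
    return patterns


def raw_similarities(pattern: Sequence[float],
                     discriminants: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Dot product of the pattern vector with each sound class's discriminant vector."""
    scores: Dict[str, float] = {}
    for sound_class, w in discriminants.items():
        if len(w) != len(pattern):
            raise ValueError(f"dimension mismatch for sound class {sound_class!r}")
        scores[sound_class] = sum(wi * xi for wi, xi in zip(w, pattern))
    return scores


# Toy data: three frames of two spectral measures each (hypothetical values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
patterns = time_spectral_patterns(frames, span=2)

# Hypothetical discriminant vectors for two sound classes.
discriminants = {"aa": [1.0, 0.0, 0.0, 1.0], "s": [0.0, 1.0, 1.0, 0.0]}
scores = [raw_similarities(p, discriminants) for p in patterns]
print(scores)
```

Each entry of `scores` holds one initial raw similarity value per sound class for one pattern position; normalization and word matching would follow as separate stages.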
Abstract
A common requirement in automatic speech recognition is to recognize a set of words for any speaker without training the system for each new speaker. A speech recognition system is provided that uses linear discriminant based phonetic similarities with inter-phonetic unit value normalization. Linear discriminant analysis is applied to training data containing both in-class and out-class sample utterances to generate a linear discriminant vector for each of the phonetic units. The dot product of each linear discriminant vector with the time spectral pattern vectors generated from the input speech is computed. The resultant raw similarity vectors are then normalized using normalization look-up tables to provide similarity vectors, which are used by a word matcher for word recognition.
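The abstract does not state which formulation of linear discriminant analysis is used. As one plausible reading, the classical two-class Fisher discriminant is shown below for a single phonetic unit c. The symbols (the in-class and out-class training sets I_c and O_c, their means, the within-class scatter S_W, and the pattern vector x) are assumptions introduced here for illustration, not notation taken from the patent.

```latex
\begin{aligned}
\mu_{\mathrm{in}} &= \frac{1}{|I_c|} \sum_{x \in I_c} x,
\qquad
\mu_{\mathrm{out}} = \frac{1}{|O_c|} \sum_{x \in O_c} x, \\
S_W &= \sum_{x \in I_c} (x - \mu_{\mathrm{in}})(x - \mu_{\mathrm{in}})^{\top}
     + \sum_{x \in O_c} (x - \mu_{\mathrm{out}})(x - \mu_{\mathrm{out}})^{\top}, \\
w_c &= S_W^{-1} \, (\mu_{\mathrm{in}} - \mu_{\mathrm{out}}), \\
s_c(x) &= w_c^{\top} x .
\end{aligned}
```

Under this reading, s_c(x) is the raw similarity of the time spectral pattern vector x to phonetic unit c, and it is this value that the normalization look-up table would then map to a normalized similarity.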
11 Claims
1. A speech recognition processor for processing an input speech utterance in a speech recognition system, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 6, 7, 8
5. A method for processing an input speech utterance for speech recognition, comprising:
representing the input speech utterance as a spectral measure for predetermined time frames;
generating a time-spectral pattern vector by concatenating together a plurality of spectral measures;
computing the dot product of said time-spectral pattern vector with a linear discriminant vector to produce an initial similarity value;
normalizing said initial similarity value by applying a normalization function generated based upon training utterances to create a normalized similarity value, and concatenating normalized similarity values from multiple discriminant vectors associated with multiple sound classes to form a normalized similarity vector; and
performing a word match with a list of word candidates based upon said normalized similarity vector.
Dependent claims: 9, 10, 11
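The last two steps of the method (normalization and word matching) might look something like the sketch below. Several things here are assumptions rather than details from the claims: the look-up table is modelled as sorted raw-score breakpoints with one normalized value per bin, a single similarity vector is matched instead of a sequence over time, and the match criterion is the smallest Euclidean distance to a stored reference vector. All names and numbers are hypothetical.

```python
from bisect import bisect_right
from math import dist
from typing import Dict, List, Sequence, Tuple

# Hypothetical table format: (sorted raw-score breakpoints, normalized value per bin),
# where len(values) == len(breakpoints) + 1.
LookupTable = Tuple[List[float], List[float]]


def normalize(raw: float, table: LookupTable) -> float:
    """Map a raw similarity value to the normalized value of the bin it falls in."""
    breakpoints, values = table
    return values[bisect_right(breakpoints, raw)]


def similarity_vector(raw_scores: Dict[str, float],
                      tables: Dict[str, LookupTable],
                      classes: Sequence[str]) -> List[float]:
    """Normalize each sound class's raw score and concatenate in a fixed class order."""
    return [normalize(raw_scores[c], tables[c]) for c in classes]


def match_word(vector: Sequence[float], references: Dict[str, Sequence[float]]) -> str:
    """Return the candidate word whose reference vector is closest (assumed criterion)."""
    return min(references, key=lambda word: dist(vector, references[word]))


classes = ["aa", "s"]
tables = {
    "aa": ([0.0, 1.0], [0.0, 0.5, 1.0]),
    "s": ([0.5, 1.5], [0.0, 0.5, 1.0]),
}
raw = {"aa": 2.0, "s": 0.0}          # toy raw similarity values
vec = similarity_vector(raw, tables, classes)

references = {"odd": [1.0, 0.0], "sea": [0.0, 1.0]}  # toy reference vectors
best = match_word(vec, references)
print(vec, best)
```

A piecewise-constant table keeps the run-time cost of normalization to a single binary search per sound class, which is one reason a look-up table is a natural fit for this step.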
Specification