Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer
Abstract
In speech recognition and speech coding, the values of at least two features of an utterance are measured during a series of time intervals to produce a series of feature vector signals. A plurality of single-dimension prototype vector signals having only one parameter value are stored. At least two single-dimension prototype vector signals have parameter values representing first feature values, and at least two other single-dimension prototype vector signals have parameter values representing second feature values. A plurality of compound-dimension prototype vector signals have unique identification values and comprise one first-dimension and one second-dimension prototype vector signal. At least two compound-dimension prototype vector signals comprise the same first-dimension prototype vector signal. The feature values of each feature vector signal are compared to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores. The identification values of the compound-dimension prototype vector signals having the best prototype match scores for the feature vector signals are output as a sequence of coded representations of an utterance to be recognized. A match score, comprising an estimate of the closeness of a match between a speech unit and the sequence of coded representations of the utterance, is generated for each of a plurality of speech units. At least one speech subunit, of one or more best candidate speech units having the best match scores, is displayed.
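The coding scheme summarized in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the prototype values, the identification values 100 to 102, the names `FIRST_DIM`, `SECOND_DIM`, `CompoundPrototype`, and `encode`, and the use of negative squared Euclidean distance as the "closeness" measure, which the abstract does not specify.

```python
from dataclasses import dataclass

# Hypothetical single-dimension prototypes, one parameter value each.
FIRST_DIM = [0.0, 1.0]     # parameter values for the first feature
SECOND_DIM = [10.0, 20.0]  # parameter values for the second feature


@dataclass(frozen=True)
class CompoundPrototype:
    ident: int   # unique identification value
    first: int   # index of a first-dimension prototype
    second: int  # index of a second-dimension prototype


# Two compound prototypes (100 and 101) share first-dimension prototype 0.
COMPOUND = [
    CompoundPrototype(ident=100, first=0, second=0),
    CompoundPrototype(ident=101, first=0, second=1),
    CompoundPrototype(ident=102, first=1, second=1),
]


def match_score(feature, proto):
    """Assumed closeness measure: negative squared distance (higher is better)."""
    d1 = feature[0] - FIRST_DIM[proto.first]
    d2 = feature[1] - SECOND_DIM[proto.second]
    return -(d1 * d1 + d2 * d2)


def encode(feature_vectors):
    """Return the identification value of the best-matching compound prototype per vector."""
    return [max(COMPOUND, key=lambda p: match_score(fv, p)).ident
            for fv in feature_vectors]


labels = encode([(0.1, 11.0), (0.2, 19.0), (0.9, 21.0)])
print(labels)  # [100, 101, 102]
```

Storing indices rather than copies of the parameter values is one way to express that several compound prototypes "comprise the same" single-dimension prototype.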
28 Claims
1. A speech coding apparatus comprising:
means for measuring the values of at least first and second different features of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
means for storing a plurality of single-dimension prototype vector signals, each single-dimension prototype vector signal having at least one parameter value, at least two single-dimension prototype vector signals being first-dimension prototype vector signals having parameter values representing first feature values, at least two other single-dimension prototype vector signals being second-dimension prototype vector signals having parameter values representing second feature values;
means for storing a plurality of compound-dimension prototype vector signals, each compound-dimension prototype vector signal having a unique identification value, each compound-dimension prototype vector signal comprising one first-dimension prototype vector signal and one second-dimension prototype vector signal, at least two compound-dimension prototype vector signals comprising the same first-dimension prototype vector signal;
means for comparing the closeness of the feature values of a feature vector signal to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores for the feature vector signal and each compound-dimension prototype vector signal; and
means for outputting at least the identification value of the compound-dimension prototype vector signal having the best prototype match score as a coded representation signal of the feature vector signal.
10. A method of speech coding comprising the steps of:
measuring the values of at least first and second different features of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
storing a plurality of single-dimension prototype vector signals, each single-dimension prototype vector signal having at least one parameter value, at least two single-dimension prototype vector signals being first-dimension prototype vector signals having parameter values representing first feature values, at least two other single-dimension prototype vector signals being second-dimension prototype vector signals having parameter values representing second feature values;
storing a plurality of compound-dimension prototype vector signals, each compound-dimension prototype vector signal having a unique identification value, each compound-dimension prototype vector signal comprising one first-dimension prototype vector signal and one second-dimension prototype vector signal, at least two compound-dimension prototype vector signals comprising the same first-dimension prototype vector signal;
comparing the closeness of the feature values of a feature vector signal to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores for the feature vector signal and each compound-dimension prototype vector signal; and
outputting at least the identification value of the compound-dimension prototype vector signal having the best prototype match score as a coded representation signal of the feature vector signal.
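One possible consequence of compound prototypes sharing single-dimension prototypes is that the per-dimension comparison can be computed once and reused. The sketch below shows that idea under stated assumptions: squared difference as the per-dimension closeness term, a summed distance as the match score (lower is better), and the hypothetical function name `encode_with_shared_terms`. The claim itself does not say that scores are computed this way.

```python
def encode_with_shared_terms(feature, first_dim, second_dim, compound):
    """Pick the best compound prototype, reusing per-dimension terms.

    feature:    (first_feature_value, second_feature_value)
    first_dim:  parameter values of the first-dimension prototypes
    second_dim: parameter values of the second-dimension prototypes
    compound:   list of (ident, first_index, second_index) tuples
    """
    # One squared difference per single-dimension prototype, computed once.
    first_terms = [(feature[0] - p) ** 2 for p in first_dim]
    second_terms = [(feature[1] - p) ** 2 for p in second_dim]

    # Each compound score is then just the sum of two table lookups.
    best_ident, best_dist = None, float("inf")
    for ident, i, j in compound:
        dist = first_terms[i] + second_terms[j]
        if dist < best_dist:
            best_ident, best_dist = ident, dist
    return best_ident


# Four compound prototypes built from only 2 + 2 single-dimension prototypes.
compound = [(0, 0, 0), (1, 0, 1), (2, 1, 0), (3, 1, 1)]
print(encode_with_shared_terms((0.8, 12.0), [0.0, 1.0], [10.0, 20.0], compound))  # 2
```

With this layout the number of per-dimension comparisons grows with the number of single-dimension prototypes rather than with the number of compound prototypes.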
15. A speech recognition apparatus comprising:
means for measuring the values of at least first and second different features of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
means for storing a plurality of single-dimension prototype vector signals, each single-dimension prototype vector signal having at least one parameter value, at least two single-dimension prototype vector signals being first-dimension prototype vector signals having parameter values representing first feature values, at least two other single-dimension prototype vector signals being second-dimension prototype vector signals having parameter values representing second feature values;
means for storing a plurality of compound-dimension prototype vector signals, each compound-dimension prototype vector signal having a unique identification value, each compound-dimension prototype vector signal comprising one first-dimension prototype vector signal and one second-dimension prototype vector signal, at least two compound-dimension prototype vector signals comprising the same first-dimension prototype vector signal;
means for comparing the closeness of the feature values of each feature vector signal to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores for each feature vector signal and each compound-dimension prototype vector signal;
means for outputting the identification values of the compound-dimension prototype vector signals having the best prototype match scores for the feature vector signals as a sequence of coded representations of an utterance to be recognized;
means for generating a match score for each of a plurality of speech units, each match score comprising an estimate of the closeness of a match between the speech unit and the sequence of coded representations of the utterance, each speech unit comprising one or more speech subunits;
means for identifying one or more best candidate speech units having the best match scores; and
means for outputting at least one speech subunit of one or more of the best candidate speech units.
23. A speech recognition method comprising:
measuring the values of at least first and second different features of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
storing a plurality of single-dimension prototype vector signals, each single-dimension prototype vector signal having at least one parameter value, at least two single-dimension prototype vector signals being first-dimension prototype vector signals having parameter values representing first feature values, at least two other single-dimension prototype vector signals being second-dimension prototype vector signals having parameter values representing second feature values;
storing a plurality of compound-dimension prototype vector signals, each compound-dimension prototype vector signal having a unique identification value, each compound-dimension prototype vector signal comprising one first-dimension prototype vector signal and one second-dimension prototype vector signal, at least two compound-dimension prototype vector signals comprising the same first-dimension prototype vector signal;
comparing the closeness of the feature values of each feature vector signal to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores for each feature vector signal and each compound-dimension prototype vector signal;
outputting the identification values of the compound-dimension prototype vector signals having the best prototype match scores for the feature vector signals as a sequence of coded representations of an utterance to be recognized;
generating a match score for each of a plurality of speech units, each match score comprising an estimate of the closeness of a match between the speech unit and the sequence of coded representations of the utterance, each speech unit comprising one or more speech subunits;
identifying one or more best candidate speech units having the best match scores; and
outputting at least one speech subunit of one or more of the best candidate speech units.
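The recognition steps (scoring speech units against the sequence of coded representations, picking the best candidates, and outputting their subunits) might be sketched as follows. The claims do not say how a speech unit's match score is estimated, so this example substitutes a deliberately simple stand-in: each hypothetical speech unit carries an assumed table of label probabilities, and the score is the summed log probability of the observed labels, ignoring their order. The phrases, probabilities, label values, and the names `SPEECH_UNITS`, `unit_match_score`, and `recognize` are all invented for illustration.

```python
import math

# Hypothetical speech units. Each has subunits (assumed here to be words) and
# an assumed per-label probability table standing in for a real acoustic model.
SPEECH_UNITS = {
    "go left": {
        "subunits": ["go", "left"],
        "label_probs": {100: 0.6, 101: 0.3, 102: 0.1},
    },
    "go right": {
        "subunits": ["go", "right"],
        "label_probs": {100: 0.2, 101: 0.2, 102: 0.6},
    },
}


def unit_match_score(label_probs, labels, floor=1e-6):
    """Assumed match score: summed log probability of the labels (higher is better)."""
    return sum(math.log(label_probs.get(label, floor)) for label in labels)


def recognize(labels, n_best=1):
    """Return the subunits of the n_best speech units with the best match scores."""
    ranked = sorted(
        SPEECH_UNITS.values(),
        key=lambda unit: unit_match_score(unit["label_probs"], labels),
        reverse=True,
    )
    return [unit["subunits"] for unit in ranked[:n_best]]


print(recognize([100, 100, 101]))  # [['go', 'left']]
print(recognize([102, 102, 101]))  # [['go', 'right']]
```

A practical recognizer would use a model that accounts for the order and duration of the labels; the bag-of-labels score here only shows where the coded sequence enters the match-score computation.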
28. A speech coding apparatus comprising:
means for measuring the value of at least a first feature of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
means for storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value representing a first feature value and having a unique identification value;
means for comparing the closeness of the feature value of a feature vector signal to the parameter values of the prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal; and
means for outputting at least the identification value of the prototype vector signal having the best prototype match score as a coded representation signal of the feature vector signal;
characterized in that the comparison means comprises means for comparing the closeness of the feature value of an uncoded feature vector signal to the parameter value of the prototype vector signal having the best prototype match score for an immediately preceding feature vector signal prior to comparing the feature value of the uncoded feature vector signal to the parameter values of other prototype vector signals.
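The comparison order recited in claim 28 (try the previous interval's winning prototype first) can be sketched as below. The claim states only the ordering; the motivation shown here, that an early good score lets later comparisons be abandoned partway, is an assumption, as are the squared-distance measure and the names `nearest_prototype` and `encode_sequence`.

```python
def nearest_prototype(feature, prototypes, previous_best=None):
    """Return the ident of the closest prototype, checking previous_best first.

    prototypes: dict mapping ident -> tuple of parameter values.
    """
    def sq_dist(params, bound=float("inf")):
        total = 0.0
        for f, p in zip(feature, params):
            total += (f - p) ** 2
            if total >= bound:  # cannot beat the current best: abandon early
                return float("inf")
        return total

    best_ident, best_dist = None, float("inf")
    if previous_best is not None:
        # Compare against the winner for the immediately preceding vector first.
        best_ident = previous_best
        best_dist = sq_dist(prototypes[previous_best])
    for ident, params in prototypes.items():
        if ident == previous_best:
            continue
        dist = sq_dist(params, best_dist)
        if dist < best_dist:
            best_ident, best_dist = ident, dist
    return best_ident


def encode_sequence(features, prototypes):
    """Code each feature vector, seeding every search with the previous winner."""
    labels, prev = [], None
    for feature in features:
        prev = nearest_prototype(feature, prototypes, prev)
        labels.append(prev)
    return labels


protos = {1: (0.0, 0.0), 2: (5.0, 5.0), 3: (10.0, 0.0)}
print(encode_sequence([(0.2, 0.1), (0.4, -0.3), (4.8, 5.1), (9.0, 1.0)], protos))
# [1, 1, 2, 3]
```

Because successive feature vectors of speech often resemble each other, the previous winner is frequently a close match, which under this assumption tightens the bound before the remaining prototypes are examined.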
Specification