Matrix quantization with vector quantization error compensation for robust speech recognition

US 6,070,136 A
Filed: 10/27/1997
Issued: 05/30/2000
Est. Priority Date: 10/27/1997
Status: Expired due to Fees

Abstract

A speech recognition system utilizes both matrix and vector quantizers as front ends to a second stage speech classifier. Matrix quantization exploits input signal information in both the frequency and time domains, whereas the vector quantizer operates primarily on frequency domain information. However, in some circumstances, time domain information may be substantially limited, which may introduce error into the matrix quantization. Information derived from vector quantization may be utilized by a hybrid decision generator to error compensate information derived from matrix quantization. Additionally, fuzzy methods of quantization and robust distance measures may be introduced to also enhance speech recognition accuracy. Furthermore, other speech classification stages may be used, such as hidden Markov models, which introduce probabilistic processes to further enhance speech recognition accuracy. Multiple codebooks may also be combined to form single respective codebooks for matrix and vector quantization to lessen the demand on processing resources.

44 Claims

1. A speech recognition system comprising:
- a vector quantizer to receive first parameters of an input signal and generate a first quantization observation sequence;
  
  a first speech classifier to receive the first quantization observation sequence from the vector quantizer and generate first respective speech classification output data;
  
  a matrix quantizer to receive second parameters of the input signal and generate a second quantization observation sequence;
  
  a second speech classifier to receive the second quantization observation sequence from the matrix quantizer and generate second respective speech classification output data; and
  
  a hybrid decision generator to combine corresponding first and second respective speech classification data to generate third respective speech classification data and to recognize the input signal from the third respective speech classification data.
- - 2. The speech recognition system as in claim 1 wherein the first and second speech classifiers are a first and second set, respectively, of hidden Markov models.
  - 3. The speech recognition system as in claim 2 wherein:
    - the speech recognition system has u vocabulary words, and u is an integer;
      
      the first respective speech classification output data includes probabilities, $\Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, related to respective ones of the first set of n hidden Markov models, $\lambda_{Vn}$, and the first quantization observation sequence, $O_{Vn}$, to one of the u vocabulary words, and n is an integer;
      
      the second respective speech classification output data includes probabilities, $\Pr(O_{Mn} \mid \lambda_{Mn})$, n=1, 2, . . . , u, related to respective ones of the second set of n hidden Markov models, $\lambda_{Mn}$, and the second quantization observation sequence, $O_{Mn}$, to one of the u vocabulary words, and n is an integer;
      
      the third classification data is $D(n) = \alpha \Pr(O_{Mn} \mid \lambda_{Mn}) + \Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, and $\alpha$ is a weighting factor to allow $\Pr(O_{Vn} \mid \lambda_{Vn})$ to compensate for recognition errors in $\Pr(O_{Mn} \mid \lambda_{Mn})$; and
      
      the hybrid decision generator is further capable of recognizing the input signal as the ith vocabulary word when D(i) represents the highest probability that the input signal is the ith of the u vocabulary words.
  - 4. The speech recognition system as in claim 1 wherein the vector and matrix quantizers utilize respective single codebooks.
  - 5. The speech recognition system as in claim 1 wherein the input signal for reception by the vector quantizer and matrix quantizer is a spoken word.
  - 6. The speech recognition system as in claim 1 wherein the first parameters of the input signal for reception by the vector quantizer include P order line spectral pairs of the input signal, and the second parameters of the input signal for reception by the matrix quantizer include temporally related P order line spectral pairs, wherein P is an integer.
  - 7. The speech recognition system as in claim 5 wherein P equals twelve.
  - 8. The speech recognition system as in claim 6 wherein the vector and matrix quantizers respectively are capable of determining a distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and codewords.
  - 9. The speech recognition system as in claim 8 wherein a distance measure, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
    - ##EQU28## wherein $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and respective codewords, respectively, $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are constants, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal.
  - 10. The speech recognition system as in claim 9 wherein the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize quantization error.
  - 11. The speech recognition system as in claim 8 wherein noise frequencies are primarily located in the frequency range substantially coinciding with the frequency range represented by line spectral pairs i=1 to N₁.
  - 12. The speech recognition system as in claim 6 wherein the vector and matrix quantizers respectively are capable of determining a distance measure between an ith order line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order codeword line spectral pair frequencies and (ii) a weighting of the difference by an ith frequency weighting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and codewords.
  - 13. The speech recognition system as in claim 12 wherein noise frequencies are primarily located in the frequency range represented by line spectral pairs i=1 to N₁.
  - 14. The speech recognition system as in claim 1 wherein the first parameters of the input signal include the energy of the input signal and first and second derivatives of the input signal energy.
  - 15. The speech recognition system as in claim 1 wherein the vector and matrix quantizers utilize fuzzy quantization.
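
The hybrid decision of claim 3 can be illustrated with a short Python sketch. It is only an illustration under assumptions: the function name `hybrid_decision`, the list-based inputs, and the example numbers are hypothetical and do not come from the patent. The sketch takes one matrix-path probability and one vector-path probability per vocabulary word, forms D(n) = α·Pr_M(n) + Pr_V(n), and returns the index of the largest D(n).

```python
from typing import List, Sequence, Tuple


def hybrid_decision(
    pr_matrix: Sequence[float],
    pr_vector: Sequence[float],
    alpha: float,
) -> Tuple[int, List[float]]:
    """Combine per-word probabilities as D(n) = alpha * Pr_M(n) + Pr_V(n).

    pr_matrix[n] stands for Pr(O_Mn | lambda_Mn) and pr_vector[n] for
    Pr(O_Vn | lambda_Vn); both names are illustrative assumptions.
    Returns the index of the word with the largest D(n) and the full D list.
    """
    if len(pr_matrix) != len(pr_vector):
        raise ValueError("both classifiers must score the same vocabulary")
    if not pr_matrix:
        raise ValueError("vocabulary must not be empty")
    # Weighted sum: the vector-path term can offset matrix-path errors.
    d = [alpha * pm + pv for pm, pv in zip(pr_matrix, pr_vector)]
    # Pick the first index with the highest combined score.
    best = max(range(len(d)), key=d.__getitem__)
    return best, d


if __name__ == "__main__":
    word, scores = hybrid_decision([0.2, 0.5, 0.3], [0.6, 0.1, 0.3], alpha=1.0)
    print(word, scores)
```

A larger α gives the matrix-quantizer path more influence on the decision; how α is chosen is not stated in the claims reproduced here.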

16. A speech recognition system comprising:
- a vector quantizer to receive line spectral pair input data corresponding to an input speech signal and to generate a first quantization observation sequence;
  
  first hidden Markov models to receive the first quantization observation sequence from the vector quantizer and generate first respective speech recognition probabilities from each of the first hidden Markov models;
  
  a matrix quantizer to receive temporally associated line spectral pair input data corresponding to the input speech signal and to generate a second quantization observation sequence;
  
  second hidden Markov models to receive the second quantization observation sequence from the matrix quantizer and generate second respective speech recognition probabilities from each of the second hidden Markov models; and
  
  a hybrid decision generator to utilize the first and second respective speech recognition probabilities to combine corresponding first and second speech recognition probabilities and to recognize the input signal from the combined corresponding first and second speech recognition probabilities.
- - 17. The speech recognition system as in claim 16 wherein:
    - the speech recognition system has u vocabulary words, and u is an integer;
      
      the first respective speech recognition probabilities, $\Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, related to respective ones of the first of n hidden Markov models, $\lambda_{Vn}$, and the first quantization observation sequence, $O_{Vn}$, to one of the u vocabulary words, and n is an integer;
      
      the second respective speech recognition probabilities, $\Pr(O_{Mn} \mid \lambda_{Mn})$, n=1, 2, . . . , u, related to respective ones of the second of n hidden Markov models, $\lambda_{Mn}$, and the second quantization observation sequence, $O_{Mn}$, to one of the u vocabulary words, and n is an integer;
      
      the combined first and second respective recognition probabilities are respectively $D(n) = \alpha \Pr(O_{Mn} \mid \lambda_{Mn}) + \Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, and $\alpha$ is a weighting factor to allow $\Pr(O_{Vn} \mid \lambda_{Vn})$ to compensate for recognition errors in $\Pr(O_{Mn} \mid \lambda_{Mn})$; and
      
      the hybrid decision generator is further capable of recognizing the input signal as the ith vocabulary word when D(i) represents the highest probability that the input signal is the ith vocabulary word.
  - 18. The speech recognition system as in claim 16 wherein:
    - the line spectral pair input data are P order line spectral pairs of the input signal, wherein P is an integer; and
      
      the vector and matrix quantizers are each respectively capable of determining a respective distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and codewords.
  - 19. The speech recognition system as in claim 18 wherein the distance measure, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
    - ##EQU29## wherein $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and respective codewords, respectively, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize quantization error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal.
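
Claim 16 has each hidden Markov model turn a quantization observation sequence into a recognition probability. One standard way to compute Pr(O | λ) for a discrete HMM is the forward algorithm, sketched below in plain Python. The claims do not say which evaluation procedure is used, so treating it as the forward algorithm is an assumption, and the names `forward_probability`, `pi`, `A` and `B` are hypothetical.

```python
from typing import List, Sequence


def forward_probability(
    obs: Sequence[int],
    pi: Sequence[float],
    A: Sequence[Sequence[float]],
    B: Sequence[Sequence[float]],
) -> float:
    """Return Pr(O | lambda) for a discrete HMM using the forward algorithm.

    obs : codeword indices produced by a quantizer (the observation sequence)
    pi  : initial state probabilities, pi[s]
    A   : state transition probabilities, A[r][s] = P(next state s | state r)
    B   : emission probabilities, B[s][k] = P(symbol k | state s)
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n_states = len(pi)
    # Initialisation: start in state s and emit the first symbol.
    alpha: List[float] = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    # Induction: propagate through the transitions, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[r] * A[r][s] for r in range(n_states)) * B[s][symbol]
            for s in range(n_states)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)
```

In a recogniser along the lines of claim 16, this function would be evaluated once per vocabulary word model and per quantizer path. Long sequences would normally need scaling or log-domain arithmetic to avoid underflow, which this sketch omits.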

20. An apparatus comprising:
- a first speech classifier to operate on first parameters of an input signal and provide first output data relating the input signal to reference data, wherein the input signal parameters include frequency and time domain parameters, wherein the first speech classifier further includes a first set of hidden Markov models;
  
  a second speech classifier to operate on second parameters of the input signal and to provide second output data relating the input signal to the reference data, wherein the second parameters of the input signal include the frequency domain parameters, the second speech classifier further includes a second set of hidden Markov models; and
  
  a hybrid decision generator to combine the first output data and the second output data so that the second output data compensates for errors in the first output data and to generate third output data to classify the input signal.
- - 21. The apparatus as in claim 20 wherein the first speech classifier includes a fuzzy matrix quantizer, and the second speech classifier includes a fuzzy vector quantizer.
  - 22. The apparatus as in claim 20 wherein the second speech classifier is capable of operating on frequency domain parameters of the input signal.
  - 23. The apparatus as in claim 20 wherein the frequency domain parameters are P order line spectral pair frequencies, wherein P is an integer.
  - 24. The apparatus as in claim 20 wherein the first and second parameters of the input signal further include input signal energy related parameters.
  - 25. The apparatus as in claim 20 wherein:
    - the first and second parameters of the input signal each respectively include P order line spectral pairs of the input signal, wherein P is an integer; and
      
      the first and second speech classifiers are each respectively capable of determining a respective distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and codewords.
  - 26. The apparatus as in claim 25 wherein the distance measure, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
    - ##EQU30## wherein $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and respective codewords, respectively, $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are constants, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal.
  - 27. The apparatus as in claim 26 wherein the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize classification error.
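
Claim 21 refers to fuzzy matrix and vector quantizers, but the text here does not give the membership formula. As a hedged illustration only, the sketch below uses the common fuzzy c-means style membership, u_k = 1 / Σ_j (d_k / d_j)^(1/(F-1)), where d_k is the distance from the input to codeword k and F > 1 is a fuzziness parameter. The formula choice, the function name `fuzzy_memberships`, and the parameter `fuzziness` are all assumptions and should not be read as the patent's method.

```python
from typing import List, Sequence


def fuzzy_memberships(distances: Sequence[float], fuzziness: float = 2.0) -> List[float]:
    """Turn codeword distances into soft memberships that sum to one.

    Assumes the fuzzy c-means membership rule (not taken from the patent):
    a codeword that is closer to the input receives a larger membership.
    """
    if fuzziness <= 1.0:
        raise ValueError("fuzziness must be greater than 1")
    if not distances:
        raise ValueError("at least one codeword distance is required")
    # An exact match gets all the membership (shared among exact matches).
    zero_count = sum(1 for d in distances if d == 0.0)
    if zero_count:
        share = 1.0 / zero_count
        return [share if d == 0.0 else 0.0 for d in distances]
    exponent = 1.0 / (fuzziness - 1.0)
    # (1/d_k)^e normalised over all codewords equals 1 / sum_j (d_k/d_j)^e.
    inverse = [(1.0 / d) ** exponent for d in distances]
    total = sum(inverse)
    return [value / total for value in inverse]
```

Unlike a hard quantizer that emits a single codeword index per frame, a fuzzy quantizer of this kind passes a membership value for every codeword to the next classification stage.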

28. A method comprising:
- processing first parameters of an input signal using a first speech classifier, wherein the parameters include frequency and time domain parameters;
  
  providing first output data relating the input signal to reference data, wherein the first output data is provided from the first speech classifier to a second speech classifier;
  
  processing the first output data using the second speech classifier;
  
  providing second output data from the second speech classifier;
  
  processing second parameters of the input signal using a third speech classifier, wherein the parameters include frequency domain parameters;
  
  providing third output data relating the input signal to the reference data, wherein the third output data is provided from the third speech classifier to a fourth speech classifier;
  
  processing the third output data using the fourth speech classifier;
  
  providing fourth output data from the fourth speech classifier;
  
  combining the third output data and fourth output data to compensate for speech classification errors in the third output data; and
  
  classifying the input signal as recognized speech.
- - 29. The method as in claim 28 wherein processing frequency and time domain parameters of the input signal comprises:
    - matrix quantizing the frequency and time domain parameters of the input signal; and
      
      processing frequency domain parameters of the input signal comprises:
      
      vector quantizing the frequency domain parameters of the input signal.
  - 30. The method as in claim 28 wherein combining third output data and fourth output data comprises:
    - weighting the fourth output data; and
      
      adding the weighted fourth output data to the third output data.
  - 31. The method as in claim 28 wherein:
    - the reference data represents u vocabulary words, and u is an integer;
      
      the first output data includes a first observation sequence, O_Vn, relating the input signal to the reference data;
      
      the second speech classifier includes a first set of n hidden Markov models;
      
      the second output data includes probabilities, $\Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, related to respective ones of the first set of n hidden Markov models, $\lambda_{Vn}$, and the first observation sequence, $O_{Vn}$;
      
      the third output data includes a second observation sequence, $O_{Mn}$, relating the input signal to the reference data;
      
      the fourth speech classifier includes a second set of n hidden Markov models;
      
      the fourth output data includes probabilities, $\Pr(O_{Mn} \mid \lambda_{Mn})$, n=1, 2, . . . , u, related to respective ones of the second set of n hidden Markov models, $\lambda_{Mn}$, and the second observation sequence, $O_{Mn}$;
      
      combining the third output data and fourth output data comprises:
      
      combining the probabilities $\Pr(O_{Vn} \mid \lambda_{Vn})$ and $\Pr(O_{Mn} \mid \lambda_{Mn})$ into a combination, D(n), wherein $D(n) = \alpha \Pr(O_{Mn} \mid \lambda_{Mn}) + \Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, and $\alpha$ is a weighting factor to allow $\Pr(O_{Vn} \mid \lambda_{Vn})$ to compensate for speech classification errors in $\Pr(O_{Mn} \mid \lambda_{Mn})$; and
      
      classifying the input signal as recognized speech comprises:
      
      classifying the input signal as the ith of the u vocabulary words when D(i) represents the highest probability that the input signal is the ith vocabulary word.
  - 32. The method as in claim 28 wherein:
    - the first and second parameters of the input signal each respectively include P order line spectral pairs of the input signal, wherein P is an integer;
      
      processing first parameters of the input signal comprises:
      
      determining a first distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of first codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the first codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and the first codewords; and
      
      processing second parameters of the input signal comprises:
      
      determining a second distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of second codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the second codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and the second codewords.
  - 33. The method as in claim 32 wherein the first distance measures, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
    - ##EQU31## wherein $d(f, \hat{f})$, $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and the respective first codewords, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize respective processing error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal; and
      
      the second distance measures, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
      
      ##EQU32## wherein $d(f, \hat{f})$, $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and the respective second codewords, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize respective processing error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal.

34. A method of recognizing speech comprising:
- receiving an input signal;
  
  determining parameters of the input signal;
  
  vector quantizing the parameters of the input signal to obtain first quantization output data;
  
  classifying the first quantization output data;
  
  matrix quantizing the parameters of the input signal to obtain second quantization output data;
  
  classifying the second quantization output data; and
  
  generating an identification of the input signal as recognized speech based upon the classification of the first and second quantization output data.
- - 35. The method as in claim 34 wherein generating the identification of the input signal further comprises:
    - weighting the classification of the first quantization output data; and
      
      adding the weighted classification of the first quantization output data and the classification of the second quantization output data.
  - 36. The method as in claim 34 wherein determining parameters of the input signal comprises:
    - determining P order line spectral pairs for each of TO frames of the input signal.
  - 37. The method as in claim 34 wherein vector quantizing further comprises:
    - vector quantizing the parameters of the input signal using a first single codebook; and
      
      wherein matrix quantizing further comprises:
      
      matrix quantizing the parameters of the input signal using a second single codebook.
  - 38. The method as in claim 34 wherein vector quantizing further comprises:
    - fuzzy vector quantizing the parameters of the input signal, wherein the first quantization output data is fuzzy data; and
      
      wherein matrix quantizing further comprises:
      
      fuzzy matrix quantizing the parameters of the input signal, wherein the second quantization output data is fuzzy data.
  - 39. The method as in claim 34 wherein:
    - the identification of the input signal is one of u vocabulary words, and u is an integer;
      
      the first quantization output data is a first observation sequence, O_Vn, relating the input signal to the u vocabulary words;
      
      classifying the first quantization output data comprises:
      
      determining probabilities, $\Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, related to respective ones of a first set of n hidden Markov models, $\lambda_{Vn}$, and the first observation sequence, $O_{Vn}$;
      
      the second quantization output data is a second observation sequence, $O_{Mn}$, relating the input signal to the u vocabulary words;
      
      classifying the first quantization output data comprises:
      
      determining probabilities, $\Pr(O_{Mn} \mid \lambda_{Mn})$, n=1, 2, . . . , u, related to respective ones of a second set of n hidden Markov models, $\lambda_{Mn}$, and the second observation sequence, $O_{Mn}$; and
      
      generating an identification of the input signal further comprises:
      
      combining the probabilities $\Pr(O_{Vn} \mid \lambda_{Vn})$ and $\Pr(O_{Mn} \mid \lambda_{Mn})$ into a combination, D(n), wherein $D(n) = \alpha \Pr(O_{Mn} \mid \lambda_{Mn}) + \Pr(O_{Vn} \mid \lambda_{Vn})$, n=1, 2, . . . , u, and $\alpha$ is a weighting factor to allow $\Pr(O_{Vn} \mid \lambda_{Vn})$ to compensate for speech classification errors in $\Pr(O_{Mn} \mid \lambda_{Mn})$, and the identification of the input signal is the ith of the u vocabulary words when D(i) represents the highest probability that the input signal is the ith vocabulary word.
  - 40. The method as in claim 34 wherein:
    - the parameters of the input signal include P order line spectral pairs of the input signal, wherein P is an integer; and
      
      vector quantizing the parameters of the input signal comprises:
      
      determining a first distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of first codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the first codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and the first codewords; and
      
      matrix quantizing the parameters of the input signal comprises:
      
      determining a second distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of second codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the second codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and the second codewords.
  - 41. The method as in claim 40 wherein the first distance measures, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
    - ##EQU33## wherein $d(f, \hat{f})$, $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and the respective first codewords, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize respective processing error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal; and
      
      the second distance measures, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
      
      ##EQU34## wherein $d(f, \hat{f})$, $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and the respective second codewords, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize respective processing error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal.
  - 42. The apparatus as in claim 41 wherein the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize quantization error.
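
The vector quantizing step of claim 34 maps each frame of input parameters to a codeword, giving a sequence of indices that the classifier then consumes. The sketch below shows a plain nearest-codeword search in Python. It deliberately substitutes a squared Euclidean distance for the line spectral pair distance measure of claims 40 and 41, whose defining equations are not reproduced in this text. The function names and the toy codebook are hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as a stand-in distance measure."""
    if len(a) != len(b):
        raise ValueError("frame and codeword must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vector_quantize(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> List[int]:
    """Return, for each frame, the index of its nearest codeword.

    The resulting index list plays the role of a quantization observation
    sequence that a later stage (for example an HMM) can score.
    """
    if not codebook:
        raise ValueError("codebook must contain at least one codeword")
    observation_sequence = []
    for frame in frames:
        distances = [squared_distance(frame, codeword) for codeword in codebook]
        # min over indices keeps the first codeword on ties.
        observation_sequence.append(min(range(len(codebook)), key=distances.__getitem__))
    return observation_sequence
```

A matrix quantizer would follow the same pattern, with each "frame" replaced by a block of several consecutive frames and each codeword by a matrix of matching shape.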

43. A method of recognizing speech comprising the steps of:
- receiving an input signal;
  
  determining P order line spectral pairs for TO frames of the input signal, wherein P and TO are integers;
  
  vector quantizing the P order line spectral pairs for each of the TO frames;
  
  classifying the input signal using the vector quantization of the P order line spectral pairs;
  
  matrix quantizing the P order line spectral pairs for T matrices of frames of the input signal, wherein T is defined as int(TO/N), and N is the number for input signal frames represented in each of the T matrices;
  
  classifying the input signal using the matrix quantization of the P order line spectral pairs;
  
  combining the classifications of the input signal to generate a combination of the classifications; and
  
  recognizing the input signal as particular speech from the combination of the classifications.
- - 44. The method as in claim 43 wherein:
    - vector quantizing the P order line spectral pairs comprises:
      
      determining a first distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of first codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the first codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and the first codewords;
      
      matrix quantizing the P order line spectral pairs comprises:
      
      determining a second distance measure between an ith line spectral pair frequency of the input signal and respective ith order line spectral pair frequencies of a plurality of second codewords, wherein the distance measure, for i=1 to N₁, is proportional to (i) a difference between the ith input signal line spectral pair frequencies and the ith order line spectral pair frequencies of the second codewords and (ii) a shift of the difference by an ith frequency shifting factor, wherein N₁ is greater than or equal to one and less than or equal to P, and P is the highest order line spectral pair frequency of the input signal and the second codewords;
      
      the first distance measures, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
      
      ##EQU35## wherein $d(f, \hat{f})$, $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and the respective first codewords, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize respective processing error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal; and
      
      the second distance measures, $d(f, \hat{f})$, between the input signal parameters, $f$, and the reference data parameters, $\hat{f}$, is defined by:
      
      ##EQU36## wherein $d(f, \hat{f})$, $f_i$ and $\hat{f}_i$ are the ith line spectral pair frequency parameters in the input signal and the respective second codewords, the constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are set to substantially minimize respective processing error, and $e_i$ is the error power spectrum of the input signal and a predicted input signal at the ith line spectral pair frequency of the input signal.
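
Claim 43 groups the TO frames of line spectral pairs into T = int(TO/N) matrices of N frames each before matrix quantization. The Python sketch below shows only that grouping step. The function name `frames_to_matrices` is hypothetical, and dropping any leftover frames at the end is an assumption, since the claim defines T but does not say what happens to a remainder.

```python
from typing import List, Sequence


def frames_to_matrices(
    frames: Sequence[Sequence[float]],
    frames_per_matrix: int,
) -> List[List[Sequence[float]]]:
    """Group TO frames into T = int(TO / N) consecutive matrices of N frames.

    Each frame is a vector of P line spectral pair values, so each returned
    matrix has N rows (time) and P columns (frequency parameters).
    Leftover frames that do not fill a whole matrix are dropped (assumption).
    """
    if frames_per_matrix < 1:
        raise ValueError("frames_per_matrix (N) must be at least 1")
    total_frames = len(frames)                        # TO in the claim
    matrix_count = total_frames // frames_per_matrix  # T = int(TO / N)
    return [
        list(frames[k * frames_per_matrix:(k + 1) * frames_per_matrix])
        for k in range(matrix_count)
    ]
```

With N = 1 every matrix holds a single frame, so the matrix path sees the same per-frame data as the vector path; larger N lets each matrix capture how the spectral parameters evolve over time.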

Current Assignee: Legerity Incorporated (Microchip Technology Incorporated)
Original Assignee: Advanced Micro Devices, Inc.
Inventors: Cong, Lin; Asghar, Safdar M.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Abebe, Daniel Demelash
Application Number: US08/957,902
Time in Patent Office: 946 Days
Field of Search: 704/222, 704/256, 704/243, 704/251
US Class Current: 704/222
CPC Class Codes:
G10L 15/142 Hidden Markov Models [HMMs]
G10L 15/32 Multiple recognisers used i...