Method and apparatus for speech reconstruction in a distributed speech recognition system
Abstract
In a distributed speech recognition system (20) comprising a first communication device (22) which receives a speech input (34), encodes data representative of the speech input (36, 38), and transmits the encoded data (42) and a second remotely-located communication device (26) which receives the encoded data (44) and compares the encoded data with a known data set, the device (26) including a processor (92) with a program which controls the processor (92) to operate according to a method of reconstructing the speech input including the step (44) of receiving encoded data including encoded spectral data and encoded energy data. The method further includes the step (46, 48) of decoding the encoded spectral data and encoded energy data to determine the spectral data and energy data. The method also includes the step (50, 52) of combining the spectral data and energy data to reconstruct the speech input.
22 Claims
1. In a distributed speech recognition system comprising a first communication device which receives a speech input, encodes data representative of the speech input, and transmits the encoded data and a second remotely-located communication device which receives the encoded data and compares the encoded data with a known data set, a method of reconstructing the speech input at the second communication device comprising the steps of:
receiving encoded data including encoded spectral data and encoded energy data;
decoding the encoded spectral data and encoded energy data to determine the spectral data and energy data; and
combining the spectral data and energy data to reconstruct the speech input.
8. In a distributed speech recognition system comprising a first communication device which receives a speech input, encodes data representative of the speech input, and transmits the encoded data and a second remotely-located communication device which receives the encoded data and compares the encoded data with a known data set, a method of reconstructing the speech input at the second communication device comprising the steps of:
receiving encoded data including encoded spectral data, the spectral data encoded as a series of mel-frequency cepstral coefficients, and encoded energy data;
performing an inverse discrete cosine transform on the mel-frequency cepstral coefficients at harmonic mel-frequencies corresponding to a pitch period of the speech input to determine log-spectral magnitudes of the speech input at the mel-harmonic frequencies; and
exponentiating the log-spectral magnitudes to determine the spectral magnitudes of the speech input;
decoding the encoded energy data to determine the energy data; and
combining the spectral magnitudes and the energy data to reconstruct the speech input.
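The inverse transform and exponentiation steps recited in claim 8 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes an unnormalised DCT-II over `num_bands` log mel-band values, the common mel formula 2595·log10(1 + f/700), and a linear mapping of mel frequency onto the transform axis. The names `hz_to_mel`, `dct2`, `idct_at`, and `harmonic_magnitudes` are hypothetical, and the pitch is passed as a fundamental frequency in Hz (the reciprocal of the pitch period).

```python
import math


def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; the claims do not fix one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def dct2(log_bands, num_coeffs):
    # Unnormalised DCT-II: turns log mel-band values into cepstral coefficients.
    n = len(log_bands)
    return [
        sum(v * math.cos(math.pi * k * (j + 0.5) / n) for j, v in enumerate(log_bands))
        for k in range(num_coeffs)
    ]


def idct_at(cepstrum, num_bands, u):
    # Inverse of dct2 evaluated at a continuous position u in [0, 1].
    # At u = (j + 0.5) / num_bands it returns the j-th log band value
    # (exactly, when all num_bands coefficients are present).
    total = cepstrum[0] / num_bands
    for k in range(1, len(cepstrum)):
        total += (2.0 / num_bands) * cepstrum[k] * math.cos(math.pi * k * u)
    return total


def harmonic_magnitudes(cepstrum, num_bands, pitch_hz, max_hz):
    # Evaluate the log spectrum at each pitch harmonic on the mel axis,
    # then exponentiate to obtain spectral magnitudes.
    mel_max = hz_to_mel(max_hz)
    magnitudes = []
    h = 1
    while h * pitch_hz <= max_hz:
        u = hz_to_mel(h * pitch_hz) / mel_max  # assumed linear mel-to-axis mapping
        magnitudes.append(math.exp(idct_at(cepstrum, num_bands, u)))
        h += 1
    return magnitudes
```

Because the cepstral series is normally truncated (fewer coefficients than bands), the magnitudes recovered this way describe a smoothed spectral envelope sampled at the harmonics rather than the exact original spectrum.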
17. In a distributed speech recognition system comprising a first communication device which receives a speech input, encodes data about the speech input, and transmits the encoded data and a second remotely-located communication device which receives the encoded data and compares the encoded data with a known data set, the second remotely-located communication device comprising:
a processor including a program which controls the processor (i) to receive the encoded data including encoded spectral data, the spectral data encoded as a series of mel-frequency cepstral coefficients, and encoded energy data, (ii) to perform an inverse discrete cosine transform on the mel-frequency cepstral coefficients at harmonic mel-frequencies corresponding to a pitch period of the speech input to determine log-spectral magnitudes of the speech input at the harmonic frequencies, (iii) to exponentiate the log-spectral magnitudes to determine the spectral magnitudes of the speech input, and (iv) to decode the encoded energy data to determine the energy data; and
a speech synthesizer which combines the spectral magnitudes and the energy data to reconstruct the speech input.
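The claims state only that the synthesizer combines the spectral magnitudes and the energy data. One plausible way to do so, sketched below purely as an illustration, is to sum zero-phase harmonic cosines and then scale the frame so that its energy matches the decoded value. The function name `synthesize_frame`, the 8 kHz sample rate, the 200-sample frame, and the reading of the energy data as the natural log of the frame's sum of squares are all assumptions, not details taken from the claims.

```python
import math


def synthesize_frame(magnitudes, pitch_hz, log_energy, sample_rate=8000, frame_len=200):
    # Sum of harmonic cosines with zero phase (assumed; phases are not
    # addressed by the claims quoted above).
    frame = []
    for n in range(frame_len):
        t = n / sample_rate
        frame.append(
            sum(a * math.cos(2.0 * math.pi * (h + 1) * pitch_hz * t)
                for h, a in enumerate(magnitudes))
        )

    # Rescale so the frame's sum of squares equals exp(log_energy).
    raw_energy = sum(s * s for s in frame)
    if raw_energy == 0.0:
        return frame  # nothing to scale (e.g. no harmonics supplied)
    gain = math.sqrt(math.exp(log_energy) / raw_energy)
    return [gain * s for s in frame]
```

In this sketch the spectral magnitudes set only the relative strength of the harmonics, while the decoded energy fixes the overall level of the frame.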
Specification