Method and apparatus for quantizing pitch, amplitude, phase and linear spectrum of voiced speech
Abstract
A method and apparatus for predictively quantizing voiced speech includes a parameter generator and a quantizer. The parameter generator is configured to extract parameters from frames of predictive speech such as voiced speech, and to transform the extracted information to a frequency-domain representation. The quantizer is configured to subtract a weighted sum of the parameters for previous frames from the parameter for the current frame. The quantizer is configured to quantize the difference value. A prototype extractor may be added to first extract a pitch period prototype to be processed by the parameter generator.
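The quantizer behaviour summarized above (subtract a weighted sum of previous-frame parameters, then quantize the difference) can be sketched for a single scalar parameter such as the pitch lag. This is a minimal illustration, not the patented implementation: the uniform step size, the closed-loop use of decoded values as the prediction history, and all function names are assumptions introduced here.

```python
def weighted_prediction(history, weights):
    """Weighted sum of the parameters from previous frames."""
    return sum(w * p for w, p in zip(weights, history))


def encode(param, history, weights, step):
    """Quantize the prediction residual to an integer index (assumed uniform quantizer)."""
    residual = param - weighted_prediction(history, weights)
    return round(residual / step)


def decode(index, history, weights, step):
    """Rebuild the parameter from the index and the same prediction history."""
    return weighted_prediction(history, weights) + index * step


def run_codec(params, weights, step, initial_history):
    """Closed-loop run: encoder and decoder both predict from decoded values (assumption)."""
    history = list(initial_history)  # most recent frame first
    indices, decoded = [], []
    for param in params:
        index = encode(param, history, weights, step)
        value = decode(index, history, weights, step)
        indices.append(index)
        decoded.append(value)
        # Shift the newly decoded value into the history, dropping the oldest entry.
        history = [value] + history[:-1]
    return indices, decoded
```

For instance, with a single previous frame weighted 1.0, a step of 0.5, and an initial history of 48.0, the lags 50.0, 51.0, 53.0 encode to the indices 4, 2, 4 and decode back to the same lags.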
24 Claims
1. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
a predictively quantized pitch lag value;
a quantized target error vector of amplitude components;
predictively quantized phase values; and
a quantized target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized pitch lag value $\delta L_m$, based on a formula;
$$\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N},$$
wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

2. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
a predictively quantized pitch lag value;
a quantized target error vector of amplitude components;
predictively quantized phase values; and
a quantized target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula;
$$\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N},$$
wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

3. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
a predictively quantized pitch lag value;
a quantized target error vector of amplitude components;
predictively quantized phase values; and
a quantized target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized phase values are based on a formula;
$$\phi_m = \phi'_{m-1},$$
wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

4. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
a predictively quantized pitch lag value;
a quantized target error vector of amplitude components;
predictively quantized phase values; and
a quantized target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (TMn) that is described by a formula;

5. A method for forming a set of quantized speech frame parameters, comprising:
quantizing a pitch lag value;
quantizing a target error vector of amplitude components;
quantizing phase values; and
quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula;
$$\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N},$$
wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

6. A method for forming a set of quantized speech frame parameters, comprising:
quantizing a pitch lag value;
quantizing a target error vector of amplitude components;
quantizing phase values; and
quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula;
$$\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N},$$
wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

7. A method for forming a set of quantized speech frame parameters, comprising:
quantizing a pitch lag value;
quantizing a target error vector of amplitude components;
quantizing phase values; and
quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized phase values are based on a formula;
$$\phi_m = \phi'_{m-1},$$
wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

8. A method for forming a set of quantized speech frame parameters, comprising:
quantizing a pitch lag value;
quantizing a target error vector of amplitude components;
quantizing phase values; and
quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (TMn) that is described by a formula;

9. A method for forming a set of quantized speech frame parameters, comprising:
quantizing a pitch lag value;
quantizing a target error vector of amplitude components;
quantizing phase values; and
quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
further comprising extracting the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

10. A method for forming a set of quantized speech frame parameters, comprising:
quantizing a pitch lag value;
quantizing a target error vector of amplitude components;
quantizing phase values; and
quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
further comprising transmitting the set of quantized speech frame parameters across a wireless communication channel.

11. An apparatus comprising:
means for quantizing a pitch lag value;
means for quantizing a target error vector of amplitude components;
means for quantizing phase values;
means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
means for transmitting a packet of the quantized error vectors across a wireless communication channel.

12. An apparatus comprising:
means for quantizing a pitch lag value;
means for quantizing a target error vector of amplitude components;
means for quantizing phase values; and
means for quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula;
$$\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N},$$
wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

13. An apparatus comprising:
means for quantizing a pitch lag value;
means for quantizing a target error vector of amplitude components;
means for quantizing phase values; and
means for quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula;
$$\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N},$$
wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

14. An apparatus comprising:
means for quantizing a pitch lag value;
means for quantizing a target error vector of amplitude components;
means for quantizing phase values; and
means for quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized phase values are based on a formula;
$$\phi_m = \phi'_{m-1},$$
wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

15. An apparatus comprising:
means for quantizing a pitch lag value;
means for quantizing a target error vector of amplitude components;
means for quantizing phase values; and
means for quantizing a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (TMn) that is described by a formula;

16. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
a predictively quantized pitch lag value;
a quantized target error vector of amplitude components;
predictively quantized phase values; and
a quantized target error vector of linear spectral information components,
wherein the pitch lag value, amplitude components, phase values, and the linear spectral information components have been extracted from a voiced speech frame,
the processor being further operable to execute a set of instructions stored in a storage medium to extract the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

17. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
a predictively quantized pitch lag value;
a quantized target error vector of amplitude components;
predictively quantized phase values; and
a quantized target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
the processor being further operable to execute a set of instructions stored in a storage medium to transmit the set of quantized speech frame parameters across a wireless communication channel.

18. An apparatus comprising:
means for quantizing a pitch lag value;
means for quantizing a target error vector of amplitude components;
means for quantizing phase values;
means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
means for extracting the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

19. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
quantize a pitch lag value;
quantize a target error vector of amplitude components;
quantize phase values; and
quantize a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula;
$$\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N},$$
wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

20. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
quantize a pitch lag value;
quantize a target error vector of amplitude components;
quantize phase values; and
quantize a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula;
$$\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N},$$
wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

21. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
quantize a pitch lag value;
quantize a target error vector of amplitude components;
quantize phase values; and
quantize a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized phase values are based on a formula;
$$\phi_m = \phi'_{m-1},$$
wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

22. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
quantize a pitch lag value;
quantize a target error vector of amplitude components;
quantize phase values; and
quantize a target error vector of linear spectral information components,
wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame,
wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (Tb) that is described by a formula;

23. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
quantize a pitch lag value;
quantize a target error vector of amplitude components;
quantize phase values;
quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
extract the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

24. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
quantize a pitch lag value;
quantize a target error vector of amplitude components;
quantize phase values;
quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
transmit the set of quantized speech frame parameters across a wireless communication channel.

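The amplitude target error vector recited in claims 2, 6, 13 and 20 can be illustrated with a short sketch. The claims do not fix the dimensions of the product between each transposed weight vector and its amplitude vector; the code below assumes, purely for illustration, that each weight vector is applied element-wise to the amplitude vector of the matching previous frame. The function name and argument layout are hypothetical.

```python
def amplitude_target_error(current, previous, weights):
    """Compute delta A_m = A_m - sum_i (alpha_mi applied to A_mi).

    current  : amplitude vector A_m for the current frame
    previous : list of amplitude vectors A_m1 ... A_mN for earlier frames
    weights  : list of weight vectors alpha_m1 ... alpha_mN, one per earlier frame
    Assumption: weights act element-wise, which is only one reading of the formula.
    """
    error = list(current)
    for amp_vec, weight_vec in zip(previous, weights):
        for k, (a, w) in enumerate(zip(amp_vec, weight_vec)):
            error[k] -= w * a  # subtract this frame's weighted contribution
    return error
```

With a current vector of [4.0, 6.0], previous vectors [2.0, 4.0] and [4.0, 8.0], and weights of 0.5 and 0.25 respectively on every component, the target error vector is [2.0, 2.0]. That residual, rather than the raw amplitudes, is what would then be passed to the quantizer.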
Specification