Method and apparatus for quantizing pitch, amplitude, phase and linear spectrum of voiced speech

US 7,426,466 B2
Filed: 07/22/2004
Issued: 09/16/2008
Est. Priority Date: 04/24/2000
Status: Expired due to Term

Abstract

A method and apparatus for predictively quantizing voiced speech includes a parameter generator and a quantizer. The parameter generator is configured to extract parameters from frames of predictive speech such as voiced speech, and to transform the extracted information to a frequency-domain representation. The quantizer is configured to subtract a weighted sum of the parameters for previous frames from the parameter for the current frame. The quantizer is configured to quantize the difference value. A prototype extractor may be added to first extract a pitch period prototype to be processed by the parameter generator.
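
As a rough illustration of the quantizer described in the abstract, the Python sketch below subtracts a weighted sum of previous-frame values from the current value and quantizes the difference. The uniform step size, the weights, the zero-initialized history, and all function names are assumptions made for illustration. Feeding previously reconstructed values back into the predictor, so that a decoder could mirror the encoder, is also an assumption and is not stated in the abstract.

```python
from collections import deque


def quantize_uniform(x, step):
    """Round x to the nearest multiple of step (hypothetical quantizer)."""
    return step * round(x / step)


def encode_decode(values, weights, step):
    """Predictively quantize one parameter track.

    values  : one parameter value per frame (e.g. a pitch lag)
    weights : assumed prediction weights, most recent frame first
    step    : assumed uniform quantizer step size
    Returns (quantized differences, reconstructed values).
    """
    # History of reconstructed values, most recent first; starts at zero.
    history = deque([0.0] * len(weights), maxlen=len(weights))
    diffs, recon = [], []
    for v in values:
        # Weighted sum of the previous frames' (reconstructed) values.
        prediction = sum(w * h for w, h in zip(weights, history))
        # Quantize only the difference between current value and prediction.
        q_diff = quantize_uniform(v - prediction, step)
        diffs.append(q_diff)
        rebuilt = prediction + q_diff
        recon.append(rebuilt)
        history.appendleft(rebuilt)  # oldest entry falls off the right
    return diffs, recon


diffs, recon = encode_decode([40.0, 41.0, 43.0, 42.0], weights=[1.0], step=0.5)
```

With a slowly varying track such as the one above, the differences after the first frame are small compared with the values themselves, which is the motivation for quantizing the difference rather than the raw parameter.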

24 Claims

1. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
- a predictively quantized pitch lag value;
  
  a quantized target error vector of amplitude components;
  
  predictively quantized phase values; and
  
  a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value $\delta L_m$, based on a formula:
  
  $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N}$,
  
  wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

2. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
- a predictively quantized pitch lag value;
  
  a quantized target error vector of amplitude components;
  
  predictively quantized phase values; and
  
  a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula:
  
  $\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N}$,
  
  wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

3. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
- a predictively quantized pitch lag value;
  
  a quantized target error vector of amplitude components;
  
  predictively quantized phase values; and
  
  a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula:
  
  $\phi_m = \phi'_{m-1}$,
  
  wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

4. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
- a predictively quantized pitch lag value;
  
  a quantized target error vector of amplitude components;
  
  predictively quantized phase values; and
  
  a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (T_Mⁿ) that is described by a formula;

5. A method for forming a set of quantized speech frame parameters, comprising:
- quantizing a pitch lag value;
  
  quantizing a target error vector of amplitude components;
  
  quantizing phase values; and
  
  quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula:
  
  $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N}$,
  
  wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

6. A method for forming a set of quantized speech frame parameters, comprising:
- quantizing a pitch lag value;
  
  quantizing a target error vector of amplitude components;
  
  quantizing phase values; and
  
  quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula:
  
  $\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N}$,
  
  wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

7. A method for forming a set of quantized speech frame parameters, comprising:
- quantizing a pitch lag value;
  
  quantizing a target error vector of amplitude components;
  
  quantizing phase values; and
  
  quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula:
  
  $\phi_m = \phi'_{m-1}$,
  
  wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

8. A method for forming a set of quantized speech frame parameters, comprising:
- quantizing a pitch lag value;
  
  quantizing a target error vector of amplitude components;
  
  quantizing phase values; and
  
  quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (T_Mⁿ) that is described by a formula;

9. A method for forming a set of quantized speech frame parameters, comprising:
- quantizing a pitch lag value;
  
  quantizing a target error vector of amplitude components;
  
  quantizing phase values; and
  
  quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, further comprising extracting the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

10. A method for forming a set of quantized speech frame parameters, comprising:
- quantizing a pitch lag value;
  
  quantizing a target error vector of amplitude components;
  
  quantizing phase values; and
  
  quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, further comprising transmitting the set of quantized speech frame parameters across a wireless communication channel.

11. An apparatus comprising:
- means for quantizing a pitch lag value;
  
  means for quantizing a target error vector of amplitude components;
  
  means for quantizing phase values;
  
  means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
  
  means for transmitting a packet of the quantized error vectors across a wireless communication channel.

12. An apparatus comprising:
- means for quantizing a pitch lag value;
  
  means for quantizing a target error vector of amplitude components;
  
  means for quantizing phase values; and
  
  means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on formula:
  
  $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N}$,
  
  wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

13. An apparatus comprising:
- means for quantizing a pitch lag value;
  
  means for quantizing a target error vector of amplitude components;
  
  means for quantizing phase values; and
  
  means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula:
  
  $\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N}$,
  
  wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

14. An apparatus comprising:
- means for quantizing a pitch lag value;
  
  means for quantizing a target error vector of amplitude components;
  
  means for quantizing phase values; and
  
  means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula:
  
  $\phi_m = \phi'_{m-1}$,
  
  wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

15. An apparatus comprising:
- means for quantizing a pitch lag value;
  
  means for quantizing a target error vector of amplitude components;
  
  means for quantizing phase values; and
  
  means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (T_Mⁿ) that is described by a formula;

16. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
- a predictively quantized pitch lag value;
  
  a quantized target error vector of amplitude components;
  
  predictively quantized phase values; and
  
  a quantized target error vector of linear spectral information components, wherein the pitch lag value, amplitude components, phase values, and the linear spectral information components have been extracted from a voiced speech frame, the processor being further operable to execute a set of instructions stored in a storage medium to extract the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

17. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising:
- a predictively quantized pitch lag value;
  
  a quantized target error vector of amplitude components;
  
  predictively quantized phase values; and
  
  a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, the processor being further operable to execute a set of instructions stored in a storage medium to transmit the set of quantized speech frame parameters across a wireless communication channel.

18. An apparatus comprising:
- means for quantizing a pitch lag value;
  
  means for quantizing a target error vector of amplitude components;
  
  means for quantizing phase values;
  
  means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
  
  means for extracting the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

19. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
- quantize a pitch lag value;
  
  quantize a target error vector of amplitude components;
  
  quantize phase values; and
  
  quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula:
  
  $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \cdots - \eta_{m_N} L_{m_N}$,
  
  wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

20. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
- quantize a pitch lag value;
  
  quantize a target error vector of amplitude components;
  
  quantize phase values; and
  
  quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula:
  
  $\delta A_m = A_m - \alpha_{m_1}^T A_{m_1} - \alpha_{m_2}^T A_{m_2} - \cdots - \alpha_{m_N}^T A_{m_N}$,
  
  wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^T, \alpha_{m_2}^T, \ldots, \alpha_{m_N}^T$ are the transposes of corresponding weight vectors.

21. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
- quantize a pitch lag value;
  
  quantize a target error vector of amplitude components;
  
  quantize phase values; and
  
  quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula:
  
  $\phi_m = \phi'_{m-1}$,
  
  wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

22. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
- quantize a pitch lag value;
  
  quantize a target error vector of amplitude components;
  
  quantize phase values; and
  
  quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components (Tb) that is described by a formula;

23. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
- quantize a pitch lag value;
  
  quantize a target error vector of amplitude components;
  
  quantize phase values;
  
  quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
  
  extract the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

24. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to:
- quantize a pitch lag value;
  
  quantize a target error vector of amplitude components;
  
  quantize phase values;
  
  quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and
  
  transmit the set of quantized speech frame parameters across a wireless communication channel.
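
The target-error formulas recited in claims 1, 5, 12, and 19 (pitch lag) and in claims 2, 6, 13, and 20 (amplitude) can be sketched in Python as follows. This is only an illustrative reading, not the patented implementation: the function names and the example numbers are hypothetical, and the amplitude case assumes that each weight vector is applied element by element to the corresponding past amplitude vector, which is one possible interpretation of the transposed-weight notation in the claims.

```python
def pitch_lag_target(lag, past_lags, weights):
    """Scalar target: lag minus the weighted sum of past-frame pitch lags."""
    if len(past_lags) != len(weights):
        raise ValueError("need exactly one weight per past frame")
    return lag - sum(w * p for w, p in zip(weights, past_lags))


def amplitude_target(amps, past_amps, weight_vectors):
    """Vector target: amps minus weighted past amplitude vectors.

    Assumption: weights are applied element-wise, so every weight vector
    must have the same length as the amplitude vectors.
    """
    if len(past_amps) != len(weight_vectors):
        raise ValueError("need exactly one weight vector per past frame")
    target = list(amps)
    for alpha, past in zip(weight_vectors, past_amps):
        if not (len(alpha) == len(past) == len(target)):
            raise ValueError("vector lengths must match")
        for k in range(len(target)):
            target[k] -= alpha[k] * past[k]
    return target


# Hypothetical values for two past frames (N = 2).
d_lag = pitch_lag_target(52.0, past_lags=[50.0, 48.0], weights=[0.5, 0.25])
d_amp = amplitude_target(
    [1.0, 2.0],
    past_amps=[[0.5, 1.0], [2.0, 4.0]],
    weight_vectors=[[0.5, 0.5], [0.25, 0.25]],
)
```

Here `d_lag` and `d_amp` are the unquantized target errors; the claims then quantize these values, a step this sketch leaves out.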

Current Assignee
Qualcomm, Inc.
Original Assignee
Qualcomm, Inc.
Inventors
DeJaco, Andrew P.; Ananthapadmanabhan, Arasanipalai K.; Huang, Pengjun; Manjunath, Sharath; Choy, Eddie-Lun Tik
Primary Examiner(s)
Hudspeth; Donald R.
Assistant Examiner(s)
Kovacek; David

Application Number

US10/897,746
Publication Number

US 20040260542A1
Time in Patent Office

1,517 Days
Field of Search

704/200-231, 704/E19.001-E19.049, 704/E21.001-E21.02
US Class Current

704/230
CPC Class Codes

G10L 19/0204   using subband decomposition

G10L 19/032   Quantisation or dequantisat...

G10L 19/04   using predictive techniques

G10L 19/08   Determination or coding of ...

G10L 19/097   using prototype waveform de...

G10L 19/26   Pre-filtering or post-filte...

G10L 25/12   the extracted parameters be...
