Encoding and decoding speech signals variably based on signal classification

US 6,735,567 B2
Filed: 04/08/2003
Issued: 05/11/2004
Est. Priority Date: 09/22/1999
Status: Expired due to Term

Abstract

A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
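
The selective activation described in the abstract can be pictured as a two-level dispatch: first on the rate selection, then (for full and half rate only) on the type classification. The following is a minimal sketch of that idea, not the patented implementation; the names `Rate` and `select_codec` and the returned labels are hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Optional


class Rate(Enum):
    """The four rate selections named in the abstract."""
    FULL = "full"
    HALF = "half"
    QUARTER = "quarter"
    EIGHTH = "eighth"


def select_codec(rate: Rate, frame_type: Optional[int] = None) -> str:
    """Return a label for the codec path that would handle a frame.

    Hypothetical helper: the labels are illustrative, not from the patent.
    """
    if rate in (Rate.FULL, Rate.HALF):
        # Full- and half-rate codecs are further split by type classification
        # (type zero or type one in the claims).
        if frame_type not in (0, 1):
            raise ValueError("full and half rate need a type classification of 0 or 1")
        return f"{rate.value}-rate/type-{frame_type}"
    # Quarter- and eighth-rate codecs are activated by rate selection alone.
    return f"{rate.value}-rate"
```

For example, `select_codec(Rate.FULL, 0)` yields the label `"full-rate/type-0"`, while `select_codec(Rate.EIGHTH)` ignores the type classification.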

159 Citations

71 Claims

1. A variable rate speech compression system for processing a frame of a speech signal to form an encoded speech signal, the speech compression system comprising:
- means for generating a first portion of the encoded speech signal as a function of a type classification and a rate selection of the frame;
  
  means for generating a second portion of the encoded speech signal as a function of the type classification and the rate selection;
  
  means for receiving the encoded speech signal and reconstructing linear prediction coefficients for the frame as a function of the rate selection;
  
  means for receiving the encoded speech signal and reconstructing short term excitation as a function of the rate selection and the type classification of the frame; and
  
  means for assembling the short-term excitation and the linear prediction coefficients to generate synthesized speech;
  
  where the means for receiving the encoded speech signal and reconstructing the excitation is operable to reconstruct the short term excitation on a subframe basis when the type classification of the frame is type zero.
- Dependent Claims (2, 3, 4, 5, 6, 7)
- - 2. The variable rate speech compression system of claim 1, where the means for generating a first portion of the encoded speech signal is operable to encode parameters of the speech signal representative of the frame.
  - 3. The variable rate speech compression system of claim 1, where the means for generating a second portion of the encoded speech signal is operable to encode parameters of the speech signal representative of each of a plurality of subframes of the frame.
  - 4. The variable rate speech compression system of claim 1, where the means for generating the first portion of the encoded speech signal comprises means for determining the rate selection and the type classification of the frame.
  - 5. The variable rate speech compression system of claim 1, where the means for generating the first portion of the encoded speech signal comprises means for pitch pre-processing the speech signal prior to generating the first portion.
  - 6. The variable rate speech compression system of claim 1, further comprising means for filtering and compensating the synthesized speech as a function of the rate selection.
  - 7. The variable rate speech compression system of claim 1, where the means for receiving the encoded speech signal and reconstructing the excitation is operable to reconstruct the short term excitation on a subframe basis and on a frame basis when the type classification of the frame is type one.
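
The final decoding step in claim 1, assembling the short-term excitation and the linear prediction coefficients into synthesized speech, is conventionally done with an all-pole synthesis filter. The sketch below assumes the predictor sign convention s[n] = e[n] + sum_k a_k * s[n-k] and shows the filter being run subframe by subframe with its memory carried across subframes. The function names `synthesize` and `decode_frame` are hypothetical, and the codebook lookups that would produce the excitation are not modeled.

```python
from typing import List, Sequence


def synthesize(excitation: Sequence[float], lpc: Sequence[float],
               history: Sequence[float] = ()) -> List[float]:
    """All-pole synthesis: s[n] = e[n] + sum_k lpc[k-1] * s[n-k].

    `history` holds previously synthesized samples (most recent last) so the
    filter state carries over from one subframe to the next.
    """
    order = len(lpc)
    # Keep exactly `order` past samples, zero-padded when history is short.
    mem = ([0.0] * order + list(history))[-order:] if order else []
    out = []
    for e in excitation:
        # lpc[0] multiplies the most recent output sample, hence reversed().
        s = e + sum(a * m for a, m in zip(lpc, reversed(mem)))
        out.append(s)
        if order:
            mem = mem[1:] + [s]
    return out


def decode_frame(subframe_excitations: Sequence[Sequence[float]],
                 lpc: Sequence[float]) -> List[float]:
    """Synthesize a frame on a subframe basis, one excitation block at a time."""
    speech: List[float] = []
    for exc in subframe_excitations:
        speech.extend(synthesize(exc, lpc, history=speech))
    return speech
```

With a first-order filter `lpc=[0.5]`, an impulse excitation split across two subframes decays smoothly across the subframe boundary, which is the point of carrying the filter memory.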

8. A speech compression system for processing a speech signal, the speech compression system comprising:
- a decoding system operable to receive a selected bit rate and decode the speech signal to generate synthesized speech, the decoding system comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients as a function of the selected bit rate;
  
  an excitation reconstruction module operable to reconstruct short-term excitation as a function of the selected bit rate and a type classification of the speech signal;
  
  a synthesis filter module operable to assemble the short-term excitation and the linear prediction coefficients to generate synthesized speech; and
  
  a post-processing module operable to filter and compensate the synthesized speech as a function of the selected bit rate;
  
  where the post-processing module comprises a long-term filter module operable to perform a fine-tuning search for a pitch period of the synthesized speech.
- Dependent Claims (9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
- - 9. The speech compression system of claim 8, where the post-processing module comprises a short-term filter module operable to adapt filtering parameters as a function of the selected bit rate and long-term spectral characteristics of the speech signal.
  - 10. The speech compression system of claim 8, where the fine-tuning search is performed as a function of the selected bit rate.
  - 11. The speech compression system of claim 8, where the fine-tuning search comprises pitch correlation and gain controlled harmonic filtering, where the gain controlled harmonic filtering is dependent on the selected bit rate.
  - 12. The speech compression system of claim 8, where the linear prediction coefficient reconstruction module further comprises an interpolation module when the selected bit rate is a full rate and the type classification is type zero.
  - 13. The speech compression system of claim 8, where the linear prediction coefficient reconstruction module further comprises a predictor switch module when the selected bit rate is a half rate.
  - 14. The speech compression system of claim 8, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis when the type classification is type zero.
  - 15. The speech compression system of claim 8, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 16. The speech compression system of claim 8, where the excitation reconstruction module comprises an adaptive codebook, a fixed codebook, a 2D/VQ gain codebook, a 3D/4D open loop VQ codebook and a 3D/4D VQ gain codebook.
  - 17. The speech compression system of claim 8, where the selected bit rate is 8.5 kilobits per second.
  - 18. The speech compression system of claim 8, where the selected bit rate is 4.0 kilobits per second.

19. A system for processing a speech signal to generate synthesized speech, the speech compression system comprising:
- a first decoder operable to decode a first frame of the speech signal as a function of a rate selected during encoding of the first frame, the first decoder comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients of the speech signal; and
  
  a plurality of excitation reconstruction modules operable to reconstruct short-term excitation of the speech signal as a function of a type classification selected during encoding of the first frame; and
  
  a second decoder operable to decode a second frame of the speech signal as a function of the rate selected during encoding of the second frame, the second decoder comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients of the encoded speech signal; and
  
  an excitation reconstruction module operable to reconstruct short term excitation of the speech signal absent the type classification;
  
  where the second decoder is one of a quarter-rate decoder operable at a rate of 2 kilobits per second and an eighth-rate decoder operable at a rate of 0.8 kilobits per second.
- Dependent Claims (20, 21, 22)
- - 20. The system of claim 19, where the linear prediction coefficient reconstruction module is operable to reconstruct linear prediction coefficients as a function of the type classification when the rate selected is 8.5 kilobits per second.
  - 21. The system of claim 19, further comprising a post-processing module operable to filter and compensate the synthesized speech as a function of the rate selected.
  - 22. The system of claim 19, where the first decoder is one of a full-rate decoder operable at a rate of 8.5 kilobits per second and a half-rate decoder operable at a rate of 4 kilobits per second.
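
The four rates named in claims 19 and 22 translate directly into a per-frame bit budget once a frame duration is fixed. The arithmetic below assumes a 20 ms frame, which is an assumption for illustration and is not stated in this section; `bits_per_frame` is a hypothetical helper.

```python
# Bit rates in kilobits per second, as recited in claims 19 and 22.
RATES_KBPS = {"full": 8.5, "half": 4.0, "quarter": 2.0, "eighth": 0.8}

# Assumed frame duration in milliseconds (not given in the claims above).
FRAME_MS = 20


def bits_per_frame(rate_kbps: float, frame_ms: int = FRAME_MS) -> int:
    """kbit/s times ms gives bits: (1000 bit/s) * (0.001 s) = 1 bit."""
    return round(rate_kbps * frame_ms)


budget = {name: bits_per_frame(kbps) for name, kbps in RATES_KBPS.items()}
```

Under that assumption, `budget` works out to 170, 80, 40 and 16 bits per frame for the full, half, quarter and eighth rates respectively.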

23. A method of decoding a frame of a speech signal previously encoded with a variable rate encoding system, the method comprising:
- a) reconstructing short-term excitation as a function of a bit rate and a type classification selected when the frame was encoded;
  
  b) reconstructing linear prediction coefficients as a function of the bit rate;
  
  c) generating synthesized speech as a function of the short-term excitation and the linear prediction coefficients; and
  
  d) filtering and compensating the synthesized speech as a function of the bit rate;
  
  where d) comprises performing a fine-tuning search for a pitch period of the synthesized speech as a function of the bit rate.
- Dependent Claims (24, 25, 26, 27, 28, 29, 30)
- - 24. The method of claim 23, where d) comprises adapting filtering parameters as a function of the bit rate and long-term spectral characteristics of the speech signal.
  - 25. The method of claim 23, where d) comprises performing a fine-tuning search as a function of pitch correlation and gain controlled harmonic filtering, where at least one of the fine-tuning search and the gain controlled harmonic filtering is dependent on the bit rate.
  - 26. The method of claim 23, where b) comprises selecting one of a plurality of interpolation paths when the bit rate is a full rate and the type classification is type zero.
  - 27. The method of claim 23, where b) comprises selecting one of at least two sets of predictor coefficients when the rate is a half rate.
  - 28. The method of claim 23, where a) comprises reconstructing the short-term excitation on a subframe basis when the type classification is type zero.
  - 29. The method of claim 23, where a) comprises reconstructing short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 30. The method of claim 23, where b) comprises reconstructing the linear prediction coefficients as a function of the type classification when the rate is a full rate.
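
Claims 23 and 25 recite a fine-tuning search for the pitch period based on pitch correlation but do not spell out the search itself. One generic way to do such a search, shown here purely as a sketch, is to evaluate the normalized correlation at a few lags around a nominal pitch lag and keep the best one. The function `refine_pitch` and its `search_range` parameter are hypothetical; a rate-dependent search could, for instance, vary `search_range` with the bit rate.

```python
import math
from typing import Sequence


def refine_pitch(signal: Sequence[float], nominal_lag: int,
                 search_range: int = 2) -> int:
    """Return the lag near `nominal_lag` with the highest normalized correlation."""
    best_lag, best_score = nominal_lag, -math.inf
    first = max(1, nominal_lag - search_range)
    for lag in range(first, nominal_lag + search_range + 1):
        if lag >= len(signal):
            break  # not enough samples to compare at this lag
        cur = signal[lag:]                   # samples s[n]
        past = signal[:len(signal) - lag]    # samples s[n - lag]
        num = sum(c * p for c, p in zip(cur, past))
        den = math.sqrt(sum(c * c for c in cur) * sum(p * p for p in past))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For a pulse train with a true period of 5 samples, starting from a nominal lag of 6 with the default range, the search settles on 5.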

31. A variable rate speech compression system for processing a frame of a speech signal to form an encoded speech signal, the speech compression system comprising:
- means for generating a first portion of the encoded speech signal as a function of a type classification and a rate selection of the frame;
  
  means for generating a second portion of the encoded speech signal as a function of the type classification and the rate selection;
  
  means for receiving the encoded speech signal and reconstructing linear prediction coefficients for the frame as a function of the rate selection;
  
  means for receiving the encoded speech signal and reconstructing short term excitation as a function of the rate selection and the type classification of the frame; and
  
  means for assembling the short-term excitation and the linear prediction coefficients to generate synthesized speech;
  
  where the means for receiving the encoded speech signal and reconstructing the excitation is operable to reconstruct the short term excitation on a subframe basis and on a frame basis when the type classification of the frame is type one.
- Dependent Claims (32, 33, 34, 35, 36, 37)
- - 32. The variable rate speech compression system of claim 31, where the means for generating a first portion of the encoded speech signal is operable to encode parameters of the speech signal representative of the frame.
  - 33. The variable rate speech compression system of claim 31, where the means for generating a second portion of the encoded speech signal is operable to encode parameters of the speech signal representative of each of a plurality of subframes of the frame.
  - 34. The variable rate speech compression system of claim 31, where the means for generating the first portion of the encoded speech signal comprises means for determining the rate selection and the type classification of the frame.
  - 35. The variable rate speech compression system of claim 31, where the means for generating the first portion of the encoded speech signal comprises means for pitch pre-processing the speech signal prior to generating the first portion.
  - 36. The variable rate speech compression system of claim 31, further comprising means for filtering and compensating the synthesized speech as a function of the rate selection.
  - 37. The variable rate speech compression system of claim 31, where the means for receiving the encoded speech signal and reconstructing the excitation is operable to reconstruct the short term excitation on a subframe basis when the type classification of the frame is type zero.

38. A speech compression system for processing a speech signal, the speech compression system comprising:
- a decoding system operable to receive a selected bit rate and decode the speech signal to generate synthesized speech, the decoding system comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients as a function of the selected bit rate;
  
  an excitation reconstruction module operable to reconstruct short-term excitation as a function of the selected bit rate and a type classification of the speech signal;
  
  a synthesis filter module operable to assemble the short-term excitation and the linear prediction coefficients to generate synthesized speech; and
  
  a post-processing module operable to filter and compensate the synthesized speech as a function of the selected bit rate;
  
  where the linear prediction coefficient reconstruction module further comprises an interpolation module when the selected bit rate is a full rate and the type classification is type zero.
- Dependent Claims (39, 40, 41, 42, 43, 60)
- - 39. The speech compression system of claim 38, where the post-processing module comprises a short-term filter module operable to adapt filtering parameters as a function of the selected bit rate and long-term spectral characteristics of the speech signal.
  - 40. The speech compression system of claim 38, where the linear prediction coefficient reconstruction module further comprises a predictor switch module when the selected bit rate is a half rate.
  - 41. The speech compression system of claim 38, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis when the type classification is type zero.
  - 42. The speech compression system of claim 38, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 43. The speech compression system of claim 38, where the excitation reconstruction module comprises an adaptive codebook, a fixed codebook, a 2D/VQ gain codebook, a 3D/4D open loop VQ codebook and a 3D/4D VQ gain codebook.
  - 60. The method of claim 38, where a) comprises reconstructing the short-term excitation on a subframe basis when the type classification is type zero.

44. A speech compression system for processing a speech signal, the speech compression system comprising:
- a decoding system operable to receive a selected bit rate and decode the speech signal to generate synthesized speech, the decoding system comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients as a function of the selected bit rate;
  
  an excitation reconstruction module operable to reconstruct short-term excitation as a function of the selected bit rate and a type classification of the speech signal;
  
  a synthesis filter module operable to assemble the short-term excitation and the linear prediction coefficients to generate synthesized speech; and
  
  a post-processing module operable to filter and compensate the synthesized speech as a function of the selected bit rate;
  
  where the linear prediction coefficient reconstruction module further comprises a predictor switch module when the selected bit rate is a half rate.
- Dependent Claims (45, 46, 47, 48)
- - 45. The speech compression system of claim 44, where the post-processing module comprises a short-term filter module operable to adapt filtering parameters as a function of the selected bit rate and long-term spectral characteristics of the speech signal.
  - 46. The speech compression system of claim 44, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis when the type classification is type zero.
  - 47. The speech compression system of claim 44, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 48. The speech compression system of claim 44, where the excitation reconstruction module comprises an adaptive codebook, a fixed codebook, a 2D/VQ gain codebook, a 3D/4D open loop VQ codebook and a 3D/4D VQ gain codebook.

49. A speech compression system for processing a speech signal, the speech compression system comprising:
- a decoding system operable to receive a selected bit rate and decode the speech signal to generate synthesized speech, the decoding system comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients as a function of the selected bit rate;
  
  an excitation reconstruction module operable to reconstruct short-term excitation as a function of the selected bit rate and a type classification of the speech signal;
  
  a synthesis filter module operable to assemble the short-term excitation and the linear prediction coefficients to generate synthesized speech; and
  
  a post-processing module operable to filter and compensate the synthesized speech as a function of the selected bit rate;
  
  where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis when the type classification is type zero.
- Dependent Claims (50, 51, 52)
- - 50. The speech compression system of claim 49, where the post-processing module comprises a short-term filter module operable to adapt filtering parameters as a function of the selected bit rate and long-term spectral characteristics of the speech signal.
  - 51. The speech compression system of claim 49, where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 52. The speech compression system of claim 49, where the excitation reconstruction module comprises an adaptive codebook, a fixed codebook, a 2D/VQ gain codebook, a 3D/4D open loop VQ codebook and a 3D/4D VQ gain codebook.

53. A speech compression system for processing a speech signal, the speech compression system comprising:
- a decoding system operable to receive a selected bit rate and decode the speech signal to generate synthesized speech, the decoding system comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients as a function of the selected bit rate;
  
  an excitation reconstruction module operable to reconstruct short-term excitation as a function of the selected bit rate and a type classification of the speech signal;
  
  a synthesis filter module operable to assemble the short-term excitation and the linear prediction coefficients to generate synthesized speech; and
  
  a post-processing module operable to filter and compensate the synthesized speech as a function of the selected bit rate;
  
  where the excitation reconstruction module is operable to reconstruct the short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
- Dependent Claims (54, 55)
- - 54. The speech compression system of claim 53, where the post-processing module comprises a short-term filter module operable to adapt filtering parameters as a function of the selected bit rate and long-term spectral characteristics of the speech signal.
  - 55. The speech compression system of claim 53, where the excitation reconstruction module comprises an adaptive codebook, a fixed codebook, a 2D/VQ gain codebook, a 3D/4D open loop VQ codebook and a 3D/4D VQ gain codebook.

56. A speech compression system for processing a speech signal, the speech compression system comprising:
- a decoding system operable to receive a selected bit rate and decode the speech signal to generate synthesized speech, the decoding system comprising;
  
  a linear prediction coefficient reconstruction module operable to reconstruct linear prediction coefficients as a function of the selected bit rate;
  
  an excitation reconstruction module operable to reconstruct short-term excitation as a function of the selected bit rate and a type classification of the speech signal;
  
  a synthesis filter module operable to assemble the short-term excitation and the linear prediction coefficients to generate synthesized speech; and
  
  a post-processing module operable to filter and compensate the synthesized speech as a function of the selected bit rate;
  
  where the excitation reconstruction module comprises an adaptive codebook, a fixed codebook, a 2D/VQ gain codebook, a 3D/4D open loop VQ codebook and a 3D/4D VQ gain codebook.
- Dependent Claims (57)
- - 57. The speech compression system of claim 56, where the post-processing module comprises a short-term filter module operable to adapt filtering parameters as a function of the selected bit rate and long-term spectral characteristics of the speech signal.

58. A method of decoding a frame of a speech signal previously encoded with a variable rate encoding system, the method comprising:
- a) reconstructing short-term excitation as a function of a bit rate and a type classification selected when the frame was encoded;
  
  b) reconstructing linear prediction coefficients as a function of the bit rate;
  
  c) generating synthesized speech as a function of the short-term excitation and the linear prediction coefficients; and
  
  d) filtering and compensating the synthesized speech as a function of the bit rate;
  
  where d) comprises performing a fine-tuning search as a function of pitch correlation and gain controlled harmonic filtering, where at least one of the fine-tuning search and the gain controlled harmonic filtering is dependent on the bit rate.
- Dependent Claims (59, 61, 62)
- - 59. The method of claim 58, where d) comprises adapting filtering parameters as a function of the bit rate and long-term spectral characteristics of the speech signal.
  - 61. The method of claim 58, where a) comprises reconstructing short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 62. The method of claim 58, where b) comprises reconstructing the linear prediction coefficients as a function of the type classification when the rate is a full rate.

63. A method of decoding a frame of a speech signal previously encoded with a variable rate encoding system, the method comprising:
- a) reconstructing short-term excitation as a function of a bit rate and a type classification selected when the frame was encoded;
  
  b) reconstructing linear prediction coefficients as a function of the bit rate;
  
  c) generating synthesized speech as a function of the short-term excitation and the linear prediction coefficients; and
  
  d) filtering and compensating the synthesized speech as a function of the bit rate;
  
  where a) comprises reconstructing the short-term excitation on a subframe basis when the type classification is type zero.
- Dependent Claims (64, 65, 66)
- - 64. The method of claim 63, where d) comprises adapting filtering parameters as a function of the bit rate and long-term spectral characteristics of the speech signal.
  - 65. The method of claim 63, where a) comprises reconstructing short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
  - 66. The method of claim 63, where b) comprises reconstructing the linear prediction coefficients as a function of the type classification when the rate is a full rate.

67. A method of decoding a frame of a speech signal previously encoded with a variable rate encoding system, the method comprising:
- a) reconstructing short-term excitation as a function of a bit rate and a type classification selected when the frame was encoded;
  
  b) reconstructing linear prediction coefficients as a function of the bit rate;
  
  c) generating synthesized speech as a function of the short-term excitation and the linear prediction coefficients; and
  
  d) filtering and compensating the synthesized speech as a function of the bit rate;
  
  where a) comprises reconstructing short-term excitation on a subframe basis and on a frame basis when the type classification is type one.
- Dependent Claims (68, 69)
- - 68. The method of claim 67, where d) comprises adapting filtering parameters as a function of the bit rate and long-term spectral characteristics of the speech signal.
  - 69. The method of claim 67, where b) comprises reconstructing the linear prediction coefficients as a function of the type classification when the rate is a full rate.

70. A method of decoding a frame of a speech signal previously encoded with a variable rate encoding system, the method comprising:
- a) reconstructing short-term excitation as a function of a bit rate and a type classification selected when the frame was encoded;
  
  b) reconstructing linear prediction coefficients as a function of the bit rate;
  
  c) generating synthesized speech as a function of the short-term excitation and the linear prediction coefficients; and
  
  d) filtering and compensating the synthesized speech as a function of the bit rate;
  
  where b) comprises reconstructing the linear prediction coefficients as a function of the type classification when the rate is a full rate.
- Dependent Claims (71)
- - 71. The method of claim 70, where d) comprises adapting filtering parameters as a function of the bit rate and long-term spectral characteristics of the speech signal.

Current Assignee: Wi-LAN Inc.
Original Assignee: Mindspeed Technologies Inc. (MACOM Technology Solutions Holdings, Inc.)
Inventors: Thyssen, Jes; Shlomot, Eyal; Gao, Yang; Benyassine, Adil; Su, Huan-yu
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Nolan, Daniel A
Application Number: US10/409,430
Publication Number: US 20030200092A1
Time in Patent Office: 399 Days
Field of Search: 704/258, 704/201, 704/208, 704/223, 704/230, 704/265, 704/219, 704/222
US Class Current: 704/258
CPC Class Codes:
G10L 19/00   Speech or audio signals ana...
G10L 19/167   Audio streaming, i.e. forma...
G10L 19/24   Variable rate codecs, e.g. ...
H03G 3/00   Gain control in amplifiers ...
