Improving sub-band coding of speech at low bit rates by adding residual speech energy signals to sub-bands

US 4,956,871 A
Filed: 09/30/1988
Issued: 09/11/1990
Est. Priority Date: 09/30/1988
Status: Expired due to Term

Abstract

A sub-band speech coding arrangement divides the speech spectrum into sub-bands and allocates bits to encode the time frame interval samples of each sub-band responsive to the speech energies of the sub-bands. The sub-band samples are quantized according to the sub-band energy bit allocation, and the time frame quantized samples and speech energy signals are coded. A signal representative of the residual difference between each time frame interval speech sample of the sub-band and the corresponding quantized speech sample of the sub-band is generated. The quality of the sub-band coded signal is improved by selecting the sub-bands with the largest residual differences, producing a vector signal from the sequence of residual difference signals of each selected sub-band, and matching the sub-band vector signal to one of a set of stored Gaussian codebook entries to generate a reduced bit code for the selected vector signal. The coded time frame interval quantized signals, speech energy signals and reduced bit codes for the selected residual differences are combined to form a multiplexed stream for the speech pattern of the time frame interval.
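
The energy-driven bit allocation described in the abstract can be sketched as follows. This is a minimal illustration, not the patented procedure: the greedy rule, the noise model `energy * 2**(-2 * bits)` (roughly 6 dB per bit), and the names `frame_energy`, `allocate_bits` and `max_bits` are all assumptions introduced here.

```python
def frame_energy(samples):
    """Mean-square energy of one sub-band's frame of K+1 samples."""
    return sum(s * s for s in samples) / len(samples)


def allocate_bits(energies, total_bits, max_bits=5):
    """Greedy allocation (an assumption, not taken from the patent text).

    Each step gives one more bit to the sub-band whose modelled
    quantization noise, energy * 2**(-2 * bits), is currently largest.
    """
    bits = [0] * len(energies)
    for _ in range(total_bits):
        candidates = [i for i in range(len(energies)) if bits[i] < max_bits]
        if not candidates:
            break
        noisiest = max(candidates,
                       key=lambda i: energies[i] * 2.0 ** (-2 * bits[i]))
        bits[noisiest] += 1
    return bits


# Four sub-bands, K+1 = 4 samples per frame (illustrative values).
frames = [
    [4.0, -4.0, 4.0, -4.0],
    [2.0, 2.0, -2.0, -2.0],
    [1.0, -1.0, 1.0, -1.0],
    [0.1, 0.0, -0.1, 0.0],
]
energies = [frame_energy(f) for f in frames]
bits = allocate_bits(energies, total_bits=6)
print(bits)  # [3, 2, 1, 0]
```

Because the allocation depends only on the sub-band energies, a decoder that receives the same energy signals can repeat the computation and arrive at the same allocation.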

22 Claims

1. An arrangement for coding speech comprising:
- means for receiving a speech pattern;
  
  means connected to the speech pattern receiving means for sampling the received speech pattern at a predetermined rate corresponding to the speech pattern bandwidth;
  
  means having an input connected to the speech pattern sampling means and a plurality of outputs for dividing the spectrum of the speech pattern samples into a plurality of sub-band portions;
  
  means connected to each spectrum dividing means output for reducing the sampling rate of the sub-band speech samples therefrom to a lower sampling rate than the predetermined rate;
  
  means connected to each reducing means output for grouping the sub-band speech samples from the spectrum dividing means into successive time frame intervals of K+1 speech samples;
  
  means connected to the grouping means responsive to the K+1 sub-band speech samples of the present time frame interval for generating a signal representative of the energy of the sub-band speech of the time frame interval in each sub-band;
  
  means responsive to the speech energy signals of a plurality of the sub-band portions for generating a set of signals allocating a predetermined number of bits to each sub-band portion;
  
  means for coding each sub-band speech portion of the present time frame interval including means jointly responsive to the speech energy signal and the bit allocation signal of the sub-band portion for quantizing the sequence of K+1 speech samples of the sub-band;
  
  means for generating a sequence of signals representative of the residual differences between the sub-band speech samples of the time frame interval and the corresponding quantized sub-band speech samples;
  
  means connected to the speech energy signal generating means, the bit allocation generating means and the residual difference signal generating means responsive to the energy representative signals and the bit allocation signals of a plurality of the sub-bands for selecting at least one of the plurality of sub-bands for encoding the time frame interval residual difference signal sequence;
  
  means connected to the sub-band selecting means for generating a coded signal representative of the sequence of residual difference signals of the at least one selected sub-band of the time frame interval; and
  
  means for multiplexing the quantized speech samples, the energy signals of the sub-band portions and coded signal representing the residual difference of the at least one selected sub-band portion to form a coded signal representative of the speech pattern of the time frame interval.
- - 2. An arrangement for coding speech according to claim 1 wherein said means for generating the coded signal representative of the residual difference signals of the at least one selected sub-band of the present time frame interval comprises:
    - means for storing a plurality of fixed codes each having K+1 elements;
      
      means responsive to the sequence of residual difference signals of the selected sub-band for forming a vector signal having K+1 elements, each element corresponding to one of the residual difference signals of the selected sub-band portion;
      
      means for identifying the stored fixed code most closely matching the selected sub-band vector signal; and
      
      means for applying a coded signal corresponding to the identified fixed code to the multiplexing means.
  - 3. An arrangement for coding speech according to claim 2 wherein each component of the stored fixed code is a Gaussian code having zero mean and unity variance.
  - 4. An arrangement for coding speech according to claim 2 wherein the means for selecting at least one sub-band for encoding the residual difference signals comprises:
    - means responsive to the bit allocation signal and the speech energy signal of each sub-band for generating a signal representative of an estimate of the residual difference of the sub-band; and
      
      means responsive to the residual difference estimate signals of the time frame interval for selecting at least the sub-band having the maximum residual difference estimate signal.
  - 5. An arrangement for coding speech according to claim 4 wherein the means for identifying the stored fixed code closely matching the selected sub-band vector signal comprises:
    - means for comparing the K+1 element residual difference vector signal of the at least one selected sub-band to each stored K+1 element fixed code representing individual residual difference estimate signals of the selected sub-band; and
      
      means responsive to the output of the comparing means for selecting the stored fixed code with the minimum difference between the element fixed code and the residual difference vector signal of the selected sub-band; and
      
      means responsive to the fixed code selecting means for generating an index code identifying the selected fixed code.
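
The sub-band selection of claim 4 and the fixed-code search of claims 2, 3 and 5 can be sketched together. This is a rough illustration under stated assumptions: the residual estimate `energy * 2**(-2 * bits)`, the squared-difference match criterion, the normalization of the residual by the square root of the estimate, and every function name below are hypothetical choices, not details confirmed by the claims.

```python
import math
import random


def residual_estimate(energy, bits):
    """Assumed model: each allocated bit cuts the noise energy by 4."""
    return energy * 2.0 ** (-2 * bits)


def select_subband(energies, bits):
    """Return (index of the largest residual estimate, all estimates)."""
    estimates = [residual_estimate(e, b) for e, b in zip(energies, bits)]
    band = max(range(len(estimates)), key=estimates.__getitem__)
    return band, estimates


def make_codebook(size, length, seed=0):
    """Fixed codes with zero-mean, unit-variance Gaussian elements."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)]
            for _ in range(size)]


def search_codebook(vector, codebook):
    """Index code of the stored entry with minimum squared difference."""
    def distance(entry):
        return sum((v - c) ** 2 for v, c in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: distance(codebook[i]))


energies = [16.0, 4.0, 1.0]
bits = [3, 1, 1]
band, estimates = select_subband(energies, bits)  # band 1 is selected

codebook = make_codebook(size=64, length=4)
residual = [0.9, -1.1, 0.4, 0.2]        # K+1 = 4 residual differences
gain = math.sqrt(estimates[band])       # assumed normalization
normalized = [r / gain for r in residual]
index = search_codebook(normalized, codebook)
```

Only `index` needs to be sent for the selected sub-band: with 64 stored codes it costs 6 bits for the whole frame of residuals, regardless of K.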

6. A method of coding speech comprising:
- receiving a speech pattern;
  
  sampling the received speech pattern at a predetermined rate corresponding to the speech pattern bandwidth;
  
  dividing the spectrum of the speech pattern samples into a plurality of sub-band portions;
  
  reducing the sampling rate of the sub-band speech samples to a sampling rate lower than the predetermined rate;
  
  grouping the sub-band speech samples from the reducing means into successive time frame intervals of K+1 speech samples;
  
  generating a signal representative of the energy of the sub-band speech of the time frame interval in each sub-band responsive to the K+1 sub-band speech samples of the present time frame interval;
  
  generating a set of signals allocating a predetermined number of bits to each sub-band portion responsive to the speech energy signals of a plurality of the sub-band portions;
  
  coding each sub-band speech portion of the present time frame interval including quantizing the sequence of K+1 speech samples of the sub-band jointly responsive to the speech energy signal and the bit allocation signal of the sub-band portion;
  
  generating a sequence of signals representative of the residual differences between the sub-band speech samples of the time frame interval and the corresponding quantized sub-band speech samples;
  
  selecting at least one sub-band for encoding the time frame interval residual difference signals thereof responsive to the energy representative signals and the bit allocation signals of a plurality of the sub-bands;
  
  generating a coded signal representative of the sequence of residual difference signals of the at least one selected sub-band of the time frame interval; and
  
  multiplexing the quantized speech samples, the energy signals of the sub-band portions and coded signal representing the residual difference of the at least one selected sub-band portions to form a coded signal representative of the speech pattern of the time frame interval.
- - 7. A method of coding speech according to claim 6 wherein the step of generating the coded signal representative of the residual difference signals of the at least one selected sub-band of the present time frame interval comprises:
    - storing a plurality of fixed codes each having K+1 elements;
      
      forming a vector signal having K+1 elements, each element corresponding to one of the residual difference signals of the selected sub-band portion responsive to the residual difference signals of the selected sub-band;
      
      identifying the stored fixed code most closely matching the selected sub-band vector signal; and
      
      using the identified fixed code as the residual difference coded signal.
  - 8. A method of coding speech according to claim 7 wherein each component of the stored fixed code is a Gaussian code having zero mean and unity variance.
  - 9. A method of coding speech according to claim 8 wherein the step of selecting at least one sub-band for encoding the residual difference signals comprises:
    - generating a signal representative of an estimate of the residual difference of the sub-band responsive to the bit allocation signal and the speech energy signal of each sub-band; and
      
      selecting at least the sub-band having the maximum residual difference estimate signal responsive to the residual difference estimate signals of the time frame interval.
  - 10. A method of coding speech according to claim 9 wherein the step of identifying the stored fixed code most closely matching the selected sub-band vector signal comprises:
    - comparing the K+1 element residual difference vector signal of the at least one selected sub-band to each stored K+1 element fixed code representing individual residual difference estimate signals of the selected sub-band; and
      
      selecting the stored fixed code with the minimum difference between the element fixed code and the residual difference vector signal of the selected sub-band responsive to the output of the comparing step; and
      
      generating an index code identifying the selected fixed code responsive to the code selecting step.
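
The quantizing and residual-difference steps of claim 6 can be illustrated with a uniform quantizer whose range follows the frame energy and whose resolution follows the bit allocation. The quantizer type, the loading factor of three times the frame RMS, and the names `step_size`, `quantize_frame` and `residual_differences` are assumptions made for this sketch; the claims do not specify them.

```python
import math

LOADING = 3.0  # assumed: quantizer range spans +/- 3 times the frame RMS


def step_size(energy, bits):
    """Quantizer step implied by the frame energy and bit allocation."""
    return 2.0 * LOADING * math.sqrt(energy) / (2 ** bits)


def quantize_frame(samples, energy, bits):
    """Return (integer codes, reconstructed values) for one frame."""
    if bits == 0 or energy == 0.0:
        return [0] * len(samples), [0.0] * len(samples)
    levels = 2 ** bits
    peak = LOADING * math.sqrt(energy)
    step = step_size(energy, bits)
    codes, recon = [], []
    for s in samples:
        code = int(math.floor((s + peak) / step))
        code = min(max(code, 0), levels - 1)       # clip to valid range
        codes.append(code)
        recon.append(-peak + (code + 0.5) * step)  # cell midpoint
    return codes, recon


def residual_differences(samples, recon):
    """Sample-by-sample difference between input and quantized value."""
    return [s - q for s, q in zip(samples, recon)]


frame = [1.0, -0.5, 0.25, -1.5]                  # K+1 = 4 samples
energy = sum(s * s for s in frame) / len(frame)
codes, recon = quantize_frame(frame, energy, bits=3)
res = residual_differences(frame, recon)
step = step_size(energy, 3)
```

For samples inside the quantizer range, each residual is bounded by half a step, so a sub-band that received few bits has a large step and a correspondingly large residual. That is the motivation for spending the extra residual code on the sub-band with the largest estimate.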

11. An arrangement for decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of a sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern comprising:
- means responsive to the speech energy signals of the plurality of sub-band portions of the time frame interval for generating a set of signals allocating a predetermined number of bits to each sub-band portion of the time frame interval;
  
  means jointly responsive to the time frame interval speech energy signals and the bit allocation signals of a sub-band portion for converting the sub-band coded quantized signals into a sequence of restored replicas of sub-band quantized samples for the sub-band portion;
  
  means responsive to the speech energy signals and the bit allocation signals of a plurality of sub-bands for determining at least one selected sub-band portion corresponding to one of the set of coded residual difference representative signals;
  
  means responsive to the coded residual difference representative signals of the selected sub-band for generating the sequence of K+1 residual difference signals of the selected sub-band;
  
  means for combining the sequence of K+1 residual difference signals of the selected sub-band and the sequence of K+1 quantized samples of the selected sub-band to form a sequence of K+1 signals representative of the speech pattern sub-band sample signals;
  
  means for increasing the sample rate of the speech pattern sub-band sample signals to twice a bandwidth of the spectrum of an originating pretransmission speech pattern;
  
  means for restricting the spectrum of the increased sample rate sub-band quantized samples to its sub-band portion; and
  
  means for combining the spectrum restricted sub-band sample signals of the plurality of sub-bands to form a replica of the speech pattern of the time frame interval.
- - 12. An arrangement for decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of the sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern according to claim 11 wherein the means for determining at least one selected sub-band portion corresponding to one of the set of coded residual difference representative signals comprises:
    - means responsive to the bit allocation and speech energy signals of the time frame interval for forming a set of signals representing an estimate of the time frame interval residual differences of each sub-band; and
      
      means for selecting at least one of the sub-bands having the largest residual difference estimate signal for the time frame interval.
  - 13. An arrangement for decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of the sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern according to claim 12 wherein the means for generating the sequence of K+1 residual difference signals of the selected sub-band comprises:
    - means for storing a plurality of fixed codes each having K+1 elements;
      
      means responsive to the coded signal corresponding to the selected sub-band for selecting one of the plurality of K+1 element fixed codes;
      
      means for scaling the selected fixed code with the residual difference estimate signal of the selected sub-band to form the sequence of K+1 residual difference signals for the selected sub-band.
  - 14. An arrangement for decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of the sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern according to claim 13 wherein each fixed code element is a Gaussian code having zero mean and unit variance.
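
On the decoding side (claims 11 to 14), the selected sub-band is re-derived from the received energies and bit allocation, the indexed fixed code is scaled by the residual estimate, and the result is added to the quantized samples. The sketch below assumes the same hypothetical noise model `energy * 2**(-2 * bits)`, takes the scale factor as the square root of that estimate, and assumes encoder and decoder build identical codebooks from a shared seed; none of these details is stated in the claims.

```python
import math
import random


def make_codebook(size, length, seed):
    """Zero-mean, unit-variance Gaussian codes; same seed at both ends."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)]
            for _ in range(size)]


def selected_band(energies, bits):
    """Re-derive the selected sub-band from received side information."""
    estimates = [e * 2.0 ** (-2 * b) for e, b in zip(energies, bits)]
    band = max(range(len(estimates)), key=estimates.__getitem__)
    return band, estimates[band]


def restore_band(quantized, index, codebook, estimate):
    """Add the scaled fixed code to the K+1 quantized samples."""
    gain = math.sqrt(estimate)  # assumed scaling rule
    return [q + gain * c for q, c in zip(quantized, codebook[index])]


energies = [9.0, 4.0, 0.5]      # decoded speech energy signals
bits = [2, 0, 1]                # allocation recomputed at the decoder
band, estimate = selected_band(energies, bits)

codebook = make_codebook(size=8, length=4, seed=1988)
quantized = [0.0, 0.0, 0.0, 0.0]   # the selected band received no bits
received_index = 3                 # index code taken from the bit stream
restored = restore_band(quantized, received_index, codebook, estimate)
```

In this example the selected sub-band received zero bits, so without the residual code it would decode to silence; the scaled fixed code restores a signal whose level follows the estimate for that sub-band.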

15. A method of decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of a sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern comprising:
- generating a set of signals allocating a predetermined number of bits to each sub-band portion of the time frame interval responsive to the speech energy signals of the plurality of sub-band portions of the time frame interval;
  
  converting the sub-band coded quantized signals into a sequence of restored replicas of sub-band quantized samples for the sub-band portion jointly responsive to the time frame interval speech energy and the bit allocation signals of a sub-band portion;
  
  determining at least one selected sub-band portion corresponding to one of the set of coded residual difference representative signals responsive to the speech energy signals and the bit allocation signals of a plurality of sub-bands;
  
  generating the sequence of K+1 residual difference signals of the selected sub-band responsive to the coded residual difference representative signals of the selected sub-band;
  
  combining the sequence of K+1 residual difference signals of the selected sub-band and the sequence of K+1 quantized samples of the selected sub-band to form a sequence of K+1 signals representative of the speech pattern sub-band sample signals;
  
  increasing the sample rate of the speech pattern sub-band sample signals to twice a bandwidth of the spectrum of an originating pretransmission speech pattern;
  
  restricting the spectrum of the increased sample rate sub-band quantized samples to its sub-band portion; and
  
  combining the spectrum restricted sub-band sample signals of the plurality of sub-bands to form a replica of the speech pattern of the time frame interval.
- - 16. A method of decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of the sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern according to claim 15 wherein the step of determining at least one selected sub-band portion corresponding to one of the set of coded residual difference representative signals comprises:
    - forming a set of signals representing an estimate of the time frame interval residual differences of each sub-band responsive to the bit allocation and speech energy signals of the time frame interval; and
      
      selecting at least one of the sub-bands having the largest residual difference estimate signal for the time frame interval.
  - 17. A method of decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of the sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern according to claim 16 wherein the step of generating the sequence of K+1 residual difference signals of the selected sub-band comprises:
    - storing a plurality of fixed codes each having K+1 elements;
      
      selecting one of the plurality of K+1 element fixed codes responsive to the coded signal corresponding to the selected sub-band;
      
      scaling the selected fixed code with the residual difference estimate signal of the selected sub-band to form the sequence of K+1 residual difference signals for the selected sub-band.
  - 18. A method of decoding a signal representative of a speech pattern of a time frame interval wherein the signal includes a set of coded signals each representative of the sequence of K+1 quantized samples of one of a plurality of sub-band portions of the spectrum of the speech pattern, a set of signals each representative of the speech energy of one of the plurality of sub-band portions, and a set of coded signals each representative of residual difference signals of a selected sub-band portion of the spectrum of the speech pattern according to claim 17 wherein each fixed code element is a Gaussian code having zero mean and unit variance.
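
The last three steps of claim 15 (increasing the sample rate, restricting each signal to its sub-band, and combining the sub-bands) can be illustrated with the simplest possible filter bank. The two-band Haar filters used here are a stand-in chosen for brevity and are an assumption; the claims do not name the number of bands or the filters, and `analyze`, `upsample`, `fir` and `synthesize` are hypothetical helper names.

```python
import math

R = 1.0 / math.sqrt(2.0)
G_LOW = [R, R]     # synthesis filter restricting output to the low band
G_HIGH = [R, -R]   # synthesis filter restricting output to the high band


def analyze(x):
    """Encoder side: split x (even length) into two half-rate sub-bands."""
    half = len(x) // 2
    low = [(x[2 * n] + x[2 * n + 1]) * R for n in range(half)]
    high = [(x[2 * n] - x[2 * n + 1]) * R for n in range(half)]
    return low, high


def upsample(band, factor=2):
    """Increase the sample rate by inserting zeros between samples."""
    out = []
    for s in band:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out


def fir(x, taps):
    """Causal FIR filter producing an output the same length as x."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]


def synthesize(low, high):
    """Upsample each sub-band, filter it to its band, and sum the bands."""
    lo = fir(upsample(low), G_LOW)
    hi = fir(upsample(high), G_HIGH)
    return [a + b for a, b in zip(lo, hi)]


speech = [1.0, 2.0, 3.0, 4.0, -1.0, 0.5]
low, high = analyze(speech)
replica = synthesize(low, high)
```

With unquantized sub-band samples this bank reproduces the input to within floating-point rounding; in the coder, `low` and `high` would instead be the restored sub-band sequences, so the replica differs from the input by whatever quantization error remains.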

19. In a sub-band coder for processing a speech pattern having a prescribed bandwidth including means for separating the spectrum of the speech pattern into a plurality of sub-band portions, means for sampling each sub-band portion at a predetermined rate, means for partitioning the sequence of samples of each sub-band portion into successive frame intervals, means for forming a signal representative of the speech energy of each sub-band portion of the time frame interval and means for supplying a signal allocating a prescribed number of bits to each sub-band of the time frame interval, means for coding the sequence of samples of each sub-band portion of the time frame interval into a sequence of quantized digital signals in accordance with the bit allocation and speech energy signals of the sub-band portion, means for forming a sequence of signals each representative of the residual difference between each sample and the quantized digital signal corresponding thereto, and means for combining the quantized digital signals of the plurality of sub-band portions and the speech energy representative signals into a coded signal representative of the time frame portion of the speech pattern, wherein the improvement comprises:
- means responsive to the speech energy representative and bit allocation signals of a plurality of said sub-band portions for selecting at least one sub-band to encode the sequence of residual difference signals;
  
  means for forming a coded signal representative of the sequence of residual difference signals of the at least one selected sub-band portion; and
  
  means for adding the coded residual difference representative signal to the coded signal representative of the time frame portion of the speech pattern.
- - 20. In a sub-band coder according to claim 19 wherein the sub-band selecting means comprises:
    - means responsive to the time frame interval speech energy and the bit allocation signals of each sub-band for generating a signal representative of an estimate of the speech energy of the residual difference signals of the sub-band; and
      
      means responsive to the residual difference estimate signals of the plurality of sub-bands for determining the at least one sub-band having the largest residual difference estimate signal.
  - 21. In a sub-band coder according to claim 20 wherein means for forming a coded signal representative of the sequence of residual difference signals of the at least one selected sub-band portion comprises:
    - means responsive to the residual difference signals of the selected sub-band for forming a vector signal having K+1 components, each component corresponding to one of the residual difference signals of the selected sub-band portion;
      
      means for storing a plurality of fixed codes, each code having K+1 elements;
      
      means for determining the fixed code having the closest similarity with the residual difference vector signal of the selected sub-band; and
      
      means for identifying the determined fixed code.
  - 22. In a sub-band coder according to claim 21 wherein each fixed code element is a Gaussian code having zero mean and unit variance.
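
The step of adding the coded residual signal to the coded frame (claim 19) amounts to appending one more field to the multiplexed bit stream. The layout below is purely illustrative: the field order, the 5-bit energy codes, the 6-bit codebook index, and the names `frame_fields`, `pack` and `unpack` are assumptions, not a format given in the patent.

```python
ENERGY_BITS = 5   # assumed width of each coded energy signal
INDEX_BITS = 6    # assumed width of the codebook index (64 fixed codes)


def frame_fields(energy_codes, sample_codes, bits, residual_index):
    """List the (value, width) fields of one frame in transmission order."""
    fields = [(e, ENERGY_BITS) for e in energy_codes]
    for codes, width in zip(sample_codes, bits):
        fields.extend((c, width) for c in codes)
    fields.append((residual_index, INDEX_BITS))
    return fields


def pack(fields):
    """Concatenate fields into a string of '0'/'1' characters."""
    return "".join(format(v, "0{}b".format(w)) for v, w in fields if w > 0)


def unpack(stream, widths):
    """Split a bit string back into integers; zero-width fields give 0."""
    values, pos = [], 0
    for w in widths:
        values.append(int(stream[pos:pos + w], 2) if w > 0 else 0)
        pos += w
    return values


energy_codes = [12, 7]                 # two sub-bands
bits = [3, 1]                          # bit allocation per sub-band
sample_codes = [[5, 2, 7, 0], [1, 0, 0, 1]]   # K+1 = 4 codes per band
fields = frame_fields(energy_codes, sample_codes, bits, residual_index=37)
stream = pack(fields)                  # 10 + 12 + 4 + 6 = 32 bits
decoded = unpack(stream, [w for _, w in fields])
```

A receiver would first read the fixed-width energy codes, recompute the bit allocation from them, and only then know the widths of the remaining fields. That is why neither the allocation nor the identity of the selected sub-band needs its own field in this layout.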


Current Assignee: American Telephone & Telegraph Company (AT&T, Inc.), Bell Telephone Laboratories, Inc. (Nokia Corporation)
Original Assignee: AT&T, Inc.
Inventors: Swaminathan, Kumar
Primary Examiner(s): NOT, DEFINED
Assistant Examiner(s): NOT, DEFINED
Application Number: US07/252,250
Time in Patent Office: 711 Days
Field of Search: 381/29-40, 381/41, 364/513.5, 375/122, 375/25-26, 375/34
US Class Current: 704/229
CPC Class Codes: H04B 1/667 using a division in frequen...
