Speech compressor using trellis encoding and linear prediction

US 5,659,659 A
Filed: 06/18/1996
Issued: 08/19/1997
Est. Priority Date: 07/26/1993
Status: Expired due to Term

Abstract

A speech compressor utilizing Trellis Encoding and Linear Prediction (TELP). A TELP speech compressor provides improved signal generation and search technique for a code-excited linear prediction (CELP) speech encoder. TELP is a frame oriented coding that breaks the quantized speech signals into frames of prescribed length N and each frame into subframes of prescribed length L, which are processed as dependent units utilizing an analysis-by-synthesis approach. The approach is based on constructing the best mean square linear predicting filter and searching the best exciting sequence for the filter in order to produce synthesized speech. A trellis encoder is used instead of a stochastic code book. The Q-ary analysis of a given subframe and previous excitations is proposed for a fast vector search in an adaptive code book. It simplifies the implementation of digital speech compression.
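
The abstract's "best mean square linear predicting filter" is conventionally obtained from the frame autocorrelation with the Levinson-Durbin recursion. The sketch below illustrates that standard technique only; the patent text shown here does not say which solver is used, and the function names `autocorrelation` and `levinson_durbin` are hypothetical.

```python
def autocorrelation(frame, order):
    """Return autocorrelation lags r[0..order] for one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for a least-squares linear predictor.

    The predictor is x_hat[n] = sum_k a[k] * x[n - k], k = 1..order.
    Returns (a[1..order], residual_energy).
    Assumption: this solver is a textbook choice, not taken from the patent.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. silence): stop early
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err  # reflection coefficient for stage m
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:], err


# A decaying exponential obeys x[n] = 0.9 * x[n-1], so the first
# coefficient should come out close to 0.9 and the second close to 0.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = levinson_durbin(autocorrelation(frame, 2), 2)
```

In a frame-oriented coder of this kind, such an analysis would run once per frame of length N, with the resulting parameters then interpolated for each subframe of length L.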

56 Citations

17 Claims

1. A trellis excited linear predictive coder for processing digital speech signals partitioned into frames of a first predetermined length, where each frame is partitioned into subframes of a second predetermined length and each subframe is partitioned into a third predetermined number of subblocks, each of said subblocks of a fourth predetermined length, said coder comprising:
- a linear predictive analyzer responsive to a speech signal, said linear predictive analyzer for generating frame linear prediction parameters, said frame linear prediction parameters characterizing the short-time speech signal spectrum for successive frames;
  
  interpolation means for interpolating said frame linear prediction parameters to produce subframe linear prediction parameters for successive subframes of a frame;
  
  ringing removal and perceptual weighting means for ringing removal and perceptual weighting said speech signals to produce predistorted speech vectors for successive subframes;
  
  a long term prediction analyzer means coupled to said ringing removal and perceptual weighting means to receive said predistorted speech vectors for each of the successive subframes, said long term prediction analyzer means for generating long term prediction parameters and a scaled pitch component for the successive subframes;
  
  pitch removal means for removing scaled pitch components from said predistorted speech vectors to produce decoder input vectors for the successive subframes;
  
  trellis decoder means coupled to said pitch removal means to receive said decoder input vectors, said decoder input vectors partitioned into a succession of speech subblocks, each of said speech subblocks being processed at a corresponding trellis level, said trellis decoder means for generating trellis gain and trellis path indexes for the successive subframes;
  
  a trellis encoder storage for storing a predetermined trellis structure and list of trellis edge subblocks; and
  
  a trellis encoder means coupled to said trellis decoder means to receive said trellis path indexes, said trellis encoder means for generating trellis code words for the successive subframes according to said predetermined trellis structure and the list of trellis edge subblocks stored in said trellis encoder storage.
- - 2. A trellis excited linear predictive coder as recited in claim 1, wherein said trellis decoder means is further comprised of:
    - edge response generator means for generating decoder synthesis filter responses for said trellis edge subblocks at successive trellis levels;
      
      edge energy generating means coupled to said edge response generator means to receive said decoder synthesis filter responses, said edge energy generation means for generating the energy values for edges for the successive trellis levels;
      
      edge correlation generation means coupled to said edge response generator means to receive said decoder synthesis filter responses and said trellis edge subblocks, said edge correlation generation means for generating correlation values for edges of successive trellis levels;
      
      edge energy accumulator means coupled to said edge energy generating means to receive said energy values for edges, said edge energy accumulator means for accumulating energy values for edges for the successive trellis levels;
      
      edge correlation accumulator means coupled to said edge correlation generation means to receive said correlation values for edges, said edge correlation accumulator means for accumulating the correlation values for edges for the successive trellis levels;
      
      arithmetic trellis unit means coupled to said edge energy accumulator means and edge correlation accumulator means to receive said accumulated energy values and said accumulated correlation values, said arithmetic trellis unit means for generating survived transition indexes for trellis states in the successive trellis levels and for generating the trellis gain values for the successive subframes; and
      
      path memory means coupled to said arithmetic trellis unit to receive said survived transition indexes, said path memory means for generating the path indexes for the successive subframes.
  - 3. A trellis excited linear predictive coder as recited in claim 2, wherein said edge response generator means is further comprised of:
    - decoder synthesis filter means coupled to said trellis encoder storage for receiving said trellis edges subblocks, said decoder synthesis filter means for generating edge response vectors for the successive subframes;
      
      edge response memory means for storing said edge response vectors for the successive subframes;
      
      path response memory means for storing the path response vectors for each trellis state wherein each of said path response vectors is generated from a previously stored vector from the path response memory and a vector from the edge response memory; and
      
      addition means coupled to said edge response memory and said path response memory to receive said path response vectors and said edge response vectors, said addition means for generating decoder synthesis filter responses for the successive trellis levels.
  - 4. A trellis excited linear predictive coder as recited in claim 1, wherein said long term prediction analyzer means is further comprised of:
    - adaptive code book (ACB) storage means for storing a plurality of ACB entries;
      
      ACB index generation means for generating a list of ACB indexes for each of the successive subframes;
      
      ACB means coupled to said ACB index generation means to receive said ACB indexes, said ACB means for generating ACB excitation vectors for said ACB indexes, said ACB excitation vectors produced from an entry of said ACB storage, said ACB storage means updated by the excitation vectors for the successive subframes;
      
      a first perceptual synthesis filtering (PSF) means coupled to said ACB means to receive said ACB excitation vectors, said first PSF means for producing filtered vectors for the successive subframes;
      
      ACB subframe energy calculation means coupled to said first PSF means to receive said filtered vectors, said ACB subframe energy calculation means for calculating energy values for said filtered vectors;
      
      ACB subframe correlation calculation means coupled to said first PSF means and said ringing removal and perceptual weighting means to receive said filtered vectors and said predistorted speech vectors, said ACB subframe correlation calculation means for calculating correlation values for said filtered vectors;
      
      ACB arithmetic unit means coupled to said ACB subframe energy calculation means said ACB subframe correlation calculation means and said ACB index generation means to receive energy values, correlation values for said filtered vectors and a list of ACB indexes, said ACB arithmetic unit means for computing ACB indexes and ACB gain values for the successive subframes; and
      
      ACB output buffer means for outputting ACB excitation vectors related to said ACB indexes for the successive subframes.
  - 5. A trellis excited linear-predictive coder as recited in claim 4, wherein said ACB index generator means is further comprised of:
    - a second perceptual synthesis filter (PSF) means coupled to said ACB means to receive said ACB contents, said second PSF means for producing a filtered ACB sequence for each of the successive subframes;
      
      first quantizing means coupled to said second PSF means to receive a first filtered ACB sequence, said quantizing means for producing a quantized filtered ACB sequence for each of the successive subframes;
      
      Q-ary adaptive code book (QACB) means coupled to said first quantizing means, said QACB means for generating QACB vectors for said ACB indexes wherein said QACB vectors are generated from said quantized filtered ACB sequence for each of the successive frames;
      
      weighting means coupled to said QACB means to receive QACB vectors, said weighting means for generating weighted QACB vectors for the successive subframes;
      
      second quantizing means coupled to said ringing removal and perceptual weighting means to receive said predistorted speech vectors, said second quantizing means for computing quantized predistorted speech vectors for the successive subframes;
      
      quantized energy calculation means coupled to said weighting means to receive said weighted QACB vectors, said quantized energy calculation means for computing quantized energy values for QACB vectors for each of the successive subframes;
      
      quantized correlation calculation means coupled to said weighting means and said second quantizing means to receive said weighted QACB vectors and said quantized predistorted speech vectors, said quantized correlation calculation means for computing quantized correlation values for QACB vectors for each of the successive subframes;
      
      QACB arithmetic unit means coupled to said quantized energy calculation means and said quantized correlation calculation means to receive said quantized correlation values and quantized energy values for QACB vectors, said QACB arithmetic unit means for computing said lists of ACB indexes for the successive subframes; and
      
      index memory means for generation of said lists of ACB indexes for the successive subframes.
  - 6. A trellis excited linear predictive coder as recited in claim 4 further comprising:
    - ACB arithmetic unit means for evaluating an ACB efficiency parameter for the successive subframes; and
      
      a long term prediction analyzer and trellis decoder adjustment means coupled to said ACB arithmetic unit means to receive said ACB efficiency parameter, said long term prediction analyzer and trellis decoder adjustment means for analyzing and adjusting said speech coder performance.
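
Claims 1 and 4 describe an adaptive code book (ACB) search driven by energy and correlation values, followed by removal of the scaled pitch component. A minimal sketch of that flow is given below. It assumes the usual least-squares criterion (maximize correlation squared over energy, gain equal to correlation over energy), which the claims do not spell out, and it assumes the candidate vectors have already passed through the perceptual synthesis filter. The names `acb_search` and `remove_pitch` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def acb_search(target, filtered_candidates):
    """Return (index, gain) of the candidate that best matches the target.

    Assumption: match = corr^2 / energy and gain = corr / energy,
    the standard least-squares choice, not quoted from the patent.
    """
    best_index, best_gain, best_match = None, 0.0, 0.0
    for index, y in enumerate(filtered_candidates):
        energy = dot(y, y)
        if energy <= 0.0:
            continue  # skip all-zero candidates
        corr = dot(target, y)
        match = corr * corr / energy
        if match > best_match:
            best_index, best_gain, best_match = index, corr / energy, match
    return best_index, best_gain


def remove_pitch(target, y, gain):
    """Subtract the scaled pitch component to get the trellis decoder input."""
    return [x - gain * v for x, v in zip(target, y)]


target = [1.0, 2.0, 3.0, 4.0]
candidates = [[1.0, 0.0, 0.0, 0.0], [0.5, 1.0, 1.5, 2.0], [1.0, 1.0, 1.0, 1.0]]
index, gain = acb_search(target, candidates)
decoder_input = remove_pitch(target, candidates[index], gain)
```

The Q-ary preselection of claim 5 would, under this reading, shorten the candidate list before this full-precision search runs.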

7. A trellis excited linear predictive coding method for processing digital speech signals, said digital speech signals partitioned into frames of a first predetermined length, each frame partitioned into subframes of a second predetermined length, each subframe partitioned into a third predetermined number of subblocks of a fourth length, said method comprising the steps of:
- (a) performing a linear predictive analysis of an input digital speech signal to create frame linear prediction parameters characterizing the short-time speech signal spectrum for successive frames;
  
  (b) interpolating said frame linear prediction parameters to create subframe linear prediction parameters for successive subframes;
  
  (c) generating predistorted speech vectors for each of the successive subframes of said input digital speech signal;
  
  (d) performing long term prediction analysis of said predistorted speech vector for determination of long term prediction parameters and for generating a scaled pitch component for each of the successive subframes;
  
  (e) removing the scaled pitch component from said predistorted speech vector to produce decoder input vector u for each of the successive subframes;
  
  (f) trellis decoding said decoder input vector, said decoder input vector partitioned into a succession of speech subblocks u=(u₁, u₂, . . . , u_t, . . . , u_l), where the speech subblock u_t, 1 ≤ t ≤ l, is processed at the trellis level t, for generating trellis gain g_T and trellis path index I_T for each of the successive subframes;
  
  (g) said g_T and I_T identifying an excitation vector which is being used as an excitation for the decoder synthesis filter (DSF) and which produces a synthesized vector approximating in a predefined sense decoder input vector u; and
  
  (h) trellis encoding said trellis path index for generating a trellis code word for each of the successive subframes according to a predetermined trellis structure and a list of trellis edge subblocks stored in a trellis code book.
- - 8. A trellis decoding method for decoding coded speech signals encoded using the method recited in claim 7, said decoding method comprising the steps of:
    - (a) initializing at the level 0, the values used for trellis decoding, including the DSF memory and values of accumulated correlation AC_0,s and accumulated energy AE_0,s for each trellis state s, 1 ≤ s ≤ M;
      
      (b) performing a trellis search for given input vector u=(u₁, u₂, . . . , u_t, . . . , u_l) at successive levels 1, 2, . . . , l, wherein said trellis search at the level t comprises the steps of:
      
      (b1) search for each trellis state i, 1 ≤ i ≤ M, the survived edge j for said state i, terminating at said state i, where said survived edge is being taken from a set Edges(t,i), comprising the steps of;
      
      (b2) generating the DSF response b_j for each edge j from the set Edges (t,i), where said DSF response b_j is being generated by using the contents of the filter memory for the initial state s' of said edge j;
      
      (b3) computing the energy value for the edge j;
      
      (b4) computing the correlation value for the edge j;
      
      (b5) computing the survived edge at the state s as an edge j from the set Edges (t,i) for the level t which provides a maximum for a match function based on an accumulated correlation and an accumulated energy for the initial state s' of the edge j;
      
      (c) storing the transition index Î_t of the survived edge i in the path memory;
      
      (d) modifying the accumulated correlation and accumulated energy values for each trellis state s, 1 ≤ s ≤ M;
      
      (e) modifying the contents of the DSF memory for the state s, by using the excitation from the edge j survived at a said state s;
      
      (f) determining a survived state s of level l and, by addressing the paths memory, selecting the survived path which is formed by the sequence of survived edges terminating at the survived state s;
      
      (g) computing a trellis path index, I_T identifying said survived path; and
      
      (h) computing a trellis gain g_T based on said accumulated correlation and said accumulated energy for a survived state s of level l.
  - 9. A trellis decoding method as recited in claim 8, wherein determining the survived state of level l comprises calculating for each state s of the trellis level a match function and selecting the state s, which provides the maximum value for said match function as the survived state of level l.
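
The level-by-level search of claims 8 and 9 can be sketched as a survivor search that carries an accumulated correlation (AC) and accumulated energy (AE) per state. The sketch below makes several simplifying assumptions that are not in the claims: the decoder synthesis filter is treated as the identity (so there is no per-state filter memory), the match function is taken to be AC·|AC|/AE, every level reuses the same hypothetical edge list, and each survivor stores its whole path instead of using a separate path memory with traceback.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def match(ac, ae):
    """Assumed match function: sign-aware correlation squared over energy."""
    return ac * abs(ac) / ae if ae > 0.0 else 0.0


def trellis_search(u_blocks, edges, num_states):
    """Return (path, gain) for the survived path through the trellis.

    u_blocks: decoder input split into subblocks, one per trellis level.
    edges: list of (from_state, to_state, subblock); hypothetical layout.
    path: list of edge indexes, one per level.
    """
    # Level 0: AC = AE = 0 and an empty path for every state.
    survivors = {s: (0.0, 0.0, []) for s in range(num_states)}
    for u_t in u_blocks:
        next_survivors = {}
        for j, (s_from, s_to, block) in enumerate(edges):
            if s_from not in survivors:
                continue
            ac, ae, path = survivors[s_from]
            # Identity-filter assumption: the edge response is the subblock.
            cand = (ac + dot(u_t, block), ae + dot(block, block), path + [j])
            best = next_survivors.get(s_to)
            if best is None or match(cand[0], cand[1]) > match(best[0], best[1]):
                next_survivors[s_to] = cand
        survivors = next_survivors
    # Final level: pick the state with the largest match value (claim 9).
    ac, ae, path = max(survivors.values(), key=lambda v: match(v[0], v[1]))
    gain = ac / ae if ae > 0.0 else 0.0
    return path, gain


# Hypothetical two-state trellis with length-2 edge subblocks.
edges = [(0, 0, [1.0, 1.0]), (0, 1, [1.0, -1.0]),
         (1, 0, [-1.0, 1.0]), (1, 1, [-1.0, -1.0])]
u_blocks = [[0.5, -0.5], [-0.5, -0.5], [-0.5, 0.5]]
path, gain = trellis_search(u_blocks, edges, num_states=2)
```

Because the match function is a ratio and not a sum, keeping one survivor per state is a heuristic here and is not guaranteed to find the globally best path.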

10. A trellis excited linear predictive synthesizer for generating synthesized speech signals from a binary stream, said binary stream comprising encoded successive subframes of encoded speech signals, each of said successive subframes including an adaptive code book (ACB) index value, an ACB gain value, a trellis code book index value, a trellis code book gain value and a side information parameter for successive subframes, said trellis excited linear predictive synthesizer comprising:
- a parsing means for receiving a binary stream and parsing out component parts of encoded successive subframes;
  
  pitch generation means for generating a scaled ACB pitch excitation signal from said adaptive code book index value, said adaptive code book gain value and side information parameter for successive subframes;
  
  trellis code word generation means for generating scaled trellis code words from said trellis code book index value, said trellis code book gain value and said side information parameter;
  
  combining means for combining said scaled trellis code words with said scaled ACB pitch excitation signal to create an excitation vector for a processed subframe; and
  
  a linear synthesis filter means coupled to said combining means, said linear synthesis filter means for transforming an excitation vector into a synthesized speech signal.
- - 11. The trellis excited linear predictive synthesizer as recited in claim 10 wherein said trellis code word generation means is further comprised of a trellis encoder and a trellis code book.
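
The combining and synthesis steps of claim 10 can be sketched as follows: the scaled ACB pitch excitation and the scaled trellis code word are summed, and the result drives an all-pole linear synthesis filter. The filter form s[n] = e[n] + Σ a[k]·s[n−k], the argument layout, and the name `synthesize_subframe` are assumptions made for illustration; bit-stream parsing and side information are left out.

```python
def synthesize_subframe(acb_vec, acb_gain, trellis_word, trellis_gain,
                        lpc, memory):
    """Build the excitation and run it through an all-pole synthesis filter.

    lpc: predictor coefficients a[1..p] (assumed sign convention:
         s[n] = e[n] + sum_k a[k] * s[n - k]).
    memory: the last p output samples, most recent first.
    Returns (excitation, speech, updated_memory).
    """
    # Combine the scaled pitch excitation with the scaled trellis code word.
    excitation = [acb_gain * p + trellis_gain * c
                  for p, c in zip(acb_vec, trellis_word)]
    mem = list(memory)
    speech = []
    for e in excitation:
        s = e + sum(a * m for a, m in zip(lpc, mem))
        speech.append(s)
        mem = ([s] + mem)[:len(lpc)]  # shift the newest sample into memory
    return excitation, speech, mem


excitation, speech, memory = synthesize_subframe(
    acb_vec=[1.0, 0.0, 0.0, 0.0], acb_gain=0.5,
    trellis_word=[1.0, 0.0, 0.0, 0.0], trellis_gain=0.5,
    lpc=[0.5], memory=[0.0])
# speech is the impulse response of 1 / (1 - 0.5 z^-1): 1, 0.5, 0.25, 0.125
```

Returning the updated memory lets the caller carry filter state from one subframe to the next, and the excitation can be fed back to update the adaptive code book.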

12. A trellis excited linear predictive coder for processing digital speech signals partitioned into frames of a first predetermined length, where each frame is partitioned into subframes of a second predetermined length and each subframe is partitioned into a third predetermined number of subblocks, each of said subblocks of a fourth predetermined length, said coder comprising:
- a linear predictive analyzer responsive to a speech signal, said linear predictive analyzer for generating frame linear prediction parameters, said frame linear prediction parameters characterizing the short-time speech signal spectrum for successive frames;
  
  an interpolation module configured to interpolate said frame linear prediction parameters to produce subframe linear prediction parameters for successive subframes of a frame;
  
  a ringing removal and perceptual weighting unit configured to produce predistorted speech vectors for successive subframes;
  
  a long term prediction analyzer coupled to said ringing removal and perceptual weighting unit to receive said predistorted speech vectors for each of the successive subframes, said long term prediction analyzer for generating long term prediction parameters and a scaled pitch component for the successive subframes;
  
  a feedback loop configured to remove scaled pitch components from said predistorted speech vectors to produce decoder input vectors for the successive subframes;
  
  a trellis decoder for generating trellis gain and trellis path indexes for the successive subframes, said trellis decoder coupled to said feedback loop to receive said decoder input vectors, said decoder input vectors partitioned into a succession of speech subblocks, each of said speech subblocks being processed at a corresponding trellis level;
  
  a trellis encoder storage having stored therein a predetermined trellis structure and list of trellis edge subblocks; and
  
  a trellis encoder coupled to said trellis decoder to receive said trellis path indexes, said trellis encoder for generating trellis code words for the successive subframes according to said predetermined trellis structure and the list of trellis edge subblocks.
- - 13. A trellis excited linear predictive coder as recited in claim 12, wherein said trellis decoder is further comprised of:
    - an edge response generator configured to generate decoder synthesis filter responses for said trellis edge subblocks at successive trellis levels;
      
      an edge energy unit coupled to said edge response generator to receive said decoder synthesis filter responses, said edge energy unit configured to generate the energy values for edges for the successive trellis levels;
      
      an edge correlation unit coupled to said edge response generator to receive said decoder synthesis filter responses and said trellis edge subblocks, said edge correlation unit configured to produce correlation values for edges of successive trellis levels;
      
      an edge energy accumulator coupled to said edge energy unit to receive said energy values for edges, said edge energy accumulator for accumulating energy values for edges for the successive trellis levels;
      
      an edge correlation accumulator coupled to said edge correlation unit to receive said correlation values for edges, said edge correlation accumulator for accumulating the correlation values for edges for the successive trellis levels;
      
      an arithmetic trellis unit coupled to said edge energy accumulator and edge correlation accumulator to receive said accumulated energy values and said accumulated correlation values, said arithmetic trellis unit configured to generate survived transition indexes for trellis states in the successive trellis levels and for generating the trellis gain values for the successive subframes; and
      
      a path memory unit coupled to said arithmetic trellis unit to receive said survived transition indexes, said path memory unit configured to output the path indexes for the successive subframes.
  - 14. A trellis excited linear predictive coder as recited in claim 12, wherein said long term prediction analyzer is further comprised of:
    - an adaptive code book (ACB) storage for storing a plurality of ACB entries;
      
      an ACB index generator configured to generate a list of ACB indexes for each of the successive subframes;
      
      an ACB coupled to said ACB index generator to receive said ACB indexes, said ACB configured to produce ACB excitation vectors for said ACB indexes, said ACB excitation vectors produced from an entry of said ACB storage, said ACB storage updated by the excitation vectors for the successive subframes;
      
      a first perceptual synthesis filter (PSF) coupled to said ACB to receive said ACB excitation vectors, said first PSF for producing filtered vectors for the successive subframes;
      
      an ACB subframe energy calculation unit coupled to said first PSF to receive said filtered vectors, said ACB subframe energy calculation unit for calculating energy values for said filtered vectors;
      
      an ACB subframe correlation calculation unit coupled to said first PSF and said feedback loop to receive said filtered vectors and said predistorted speech vectors, said ACB subframe correlation calculation unit for calculating correlation values for said filtered vectors;
      
      an ACB arithmetic unit coupled to said ACB subframe energy calculation unit said ACB subframe correlation calculation unit and said ACB index generator to receive energy values, correlation values for said filtered vectors and a list of ACB indexes, said ACB arithmetic unit for computing ACB indexes and ACB gain values for the successive subframes; and
      
      an ACB output buffer for outputting ACB excitation vectors related to said ACB indexes for the successive subframes.
  - 15. A trellis excited linear predictive coder as recited in claim 14 further comprising:
    - a long term prediction analyzer and trellis decoder adjustment unit coupled to said ACB arithmetic unit to receive an efficiency parameter, said long term prediction analyzer and trellis decoder adjustment unit for analyzing and adjusting said speech coder performance;
      
      wherein said ACB arithmetic unit evaluates said efficiency parameter for the successive subframes.
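
The interpolation module of claim 12 turns one set of frame linear prediction parameters into a set per subframe. The claims do not say which parameter domain or weighting is used, so the sketch below simply assumes linear interpolation between the previous and current frame parameters; the name `interpolate_lp` and the weights are hypothetical.

```python
def interpolate_lp(prev_params, curr_params, num_subframes):
    """Return one interpolated parameter vector per subframe.

    Assumption: plain linear interpolation, with the last subframe
    landing exactly on the current frame's parameters.
    """
    result = []
    for i in range(num_subframes):
        w = (i + 1) / num_subframes  # weight of the current frame
        result.append([(1.0 - w) * p + w * c
                       for p, c in zip(prev_params, curr_params)])
    return result


subframe_params = interpolate_lp([0.0, 1.0], [1.0, 3.0], num_subframes=4)
# -> [[0.25, 1.5], [0.5, 2.0], [0.75, 2.5], [1.0, 3.0]]
```

Practical coders often interpolate in a domain such as line spectral pairs so that intermediate filters stay stable; whether this patent does so is not stated in the claims shown here.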

16. A trellis excited linear predictive synthesizer for generating synthesized speech signals from a binary stream, said binary stream comprising encoded successive subframes of encoded speech signals, each of said successive subframes including an adaptive code book (ACB) index value, an ACB gain value, a trellis code book index value, a trellis code book gain value and a side information parameter for successive subframes, said trellis excited linear predictive synthesizer comprising:
- a parsing unit configured to receive a binary stream, said parsing unit parsing out component parts of encoded successive subframes;
  
  a pitch generator configured to produce a scaled ACB pitch excitation signal from said ACB index value, said ACB gain value and said side information parameter for successive subframes;
  
  a trellis code word unit configured to generate scaled trellis code words from said trellis code book index value, said trellis code book gain value and said side information parameter;
  
  a combination unit for combining said scaled trellis code words with said scaled ACB pitch excitation signal to create an excitation vector for a processed subframe; and
  
  a linear synthesis filter coupled to said combination unit, said linear synthesis filter configured to transform an excitation vector into a synthesized speech signal.
- - 17. The trellis excited linear predictive synthesizer as recited in claim 16 wherein said trellis code word unit is further comprised of a trellis encoder and a trellis code book.
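
Claim 17's trellis code word unit pairs a trellis encoder with a trellis code book. One way to read this is that the encoder walks the path and concatenates the edge subblocks stored in the code book, then applies the gain. The sketch below assumes the path is already available as a list of edge indexes; how a single trellis code book index value maps to that list is not given in the claims, and the name `trellis_code_word` is hypothetical.

```python
def trellis_code_word(path, edge_subblocks, gain=1.0):
    """Concatenate the subblocks along a path and scale them by the gain.

    path: list of edge indexes, one per trellis level (assumed format).
    edge_subblocks: the trellis code book, one subblock per edge.
    """
    word = []
    for j in path:
        word.extend(gain * x for x in edge_subblocks[j])
    return word


code_book = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
scaled_word = trellis_code_word([1, 3, 2], code_book, gain=0.5)
# -> [0.5, -0.5, -0.5, -0.5, -0.5, 0.5]
```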

Current Assignee: XVD Technology Holdings Ltd.
Original Assignee: Alaris, Inc.; GT Technologies
Inventors: Egorov, Vladimir V.; Ovsjannikov, Eugene P.; Kudrjashov, Boris D.; Kolesnik, Victor D.; Krachkovsky, Victor Yu; Trojanovsky, Boris K.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Dorvil, Richemond
Application Number: US08/665,642
Time in Patent Office: 427 Days
Field of Search: 395/2.28, 395/2.51, 395/2.74, 395/2.77, 395/2, 395/2.71, 395/2.45, 395/2.32, 395/2.14
US Class Current: 704/219
CPC Class Codes: G10L 19/12 the excitation function bei...