Speech compressor using trellis encoding and linear prediction
First Claim
1. A trellis excited linear predictive coder for processing digital speech signals partitioned into frames of a first predetermined length, where each frame is partitioned into subframes of a second predetermined length and each subframe is partitioned into a third predetermined number of subblocks, each of said subblocks of a fourth predetermined length, said coder comprising:
a linear predictive analyzer responsive to a speech signal, said linear predictive analyzer for generating frame linear prediction parameters, said frame linear prediction parameters characterizing the short-time speech signal spectrum for successive frames;
interpolation means for interpolating said frame linear prediction parameters to produce subframe linear prediction parameters for successive subframes of a frame;
ringing removal and perceptual weighting means for ringing removal and perceptual weighting said speech signals to produce predistorted speech vectors for successive subframes;
a long term prediction analyzer means coupled to said ringing removal and perceptual weighting means to receive said predistorted speech vectors for each of the successive subframes, said long term prediction analyzer means for generating long term prediction parameters and a scaled pitch component for the successive subframes;
pitch removal means for removing scaled pitch components from said predistorted speech vectors to produce decoder input vectors for the successive subframes;
trellis decoder means coupled to said pitch removal means to receive said decoder input vectors, said decoder input vectors partitioned into a succession of speech subblocks, each of said speech subblocks being processed at a corresponding trellis level, said trellis decoder means for generating trellis gain and trellis path indexes for the successive subframes;
a trellis encoder storage for storing a predetermined trellis structure and list of trellis edge subblocks; and
a trellis encoder means coupled to said trellis decoder means to receive said trellis path indexes, said trellis encoder means for generating trellis code words for the successive subframes according to said predetermined trellis structure and the list of trellis edge subblocks stored in said trellis encoder storage.
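The frame, subframe and subblock hierarchy recited in the claim, together with the interpolation of frame parameters down to subframes, can be illustrated with a minimal sketch. The lengths (240, 60, 5), the function names, and the use of plain linear interpolation between the previous and current frame are assumptions made for illustration; the claim only requires predetermined lengths and some interpolation means.

```python
def partition(samples, frame_len, subframe_len, subblock_len):
    """Split samples into frames -> subframes -> subblocks (nested lists)."""
    if frame_len % subframe_len or subframe_len % subblock_len:
        raise ValueError("lengths must nest evenly")

    def chunks(seq, size):
        return [seq[i:i + size] for i in range(0, len(seq), size)]

    # Assumption: a trailing partial frame is simply dropped.
    usable = len(samples) - len(samples) % frame_len
    return [
        [chunks(sub, subblock_len) for sub in chunks(frame, subframe_len)]
        for frame in chunks(samples[:usable], frame_len)
    ]


def interpolate_params(prev_frame, curr_frame, n_subframes):
    """Linearly interpolate two frame parameter vectors, one vector per subframe.

    Assumption: linear interpolation; the last subframe receives the
    current frame's parameters unchanged.
    """
    result = []
    for k in range(1, n_subframes + 1):
        w = k / n_subframes
        result.append([(1 - w) * p + w * c for p, c in zip(prev_frame, curr_frame)])
    return result


# Hypothetical sizes: 240-sample frames, 60-sample subframes, 5-sample subblocks.
frames = partition(list(range(480)), frame_len=240, subframe_len=60, subblock_len=5)
sub_params = interpolate_params([0.0, 1.0], [1.0, 3.0], n_subframes=4)
```

With these sizes each frame holds 4 subframes and each subframe holds 12 subblocks, so the "third predetermined number" would be 12 in this example.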
Abstract
A speech compressor utilizing Trellis Encoding and Linear Prediction (TELP) is disclosed. The TELP speech compressor provides improved signal generation and an improved search technique for a code-excited linear prediction (CELP) speech encoder. TELP is a frame-oriented coding scheme that breaks the quantized speech signal into frames of prescribed length N and each frame into subframes of prescribed length L, which are processed as dependent units using an analysis-by-synthesis approach. The approach constructs the best mean-square linear prediction filter and searches for the best excitation sequence for that filter in order to produce synthesized speech. A trellis encoder is used instead of a stochastic code book. A Q-ary analysis of a given subframe and previous excitations is proposed for fast vector search in an adaptive code book. This simplifies the implementation of digital speech compression.
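A mean-square-optimal linear prediction filter of the kind the abstract refers to is commonly computed with the autocorrelation method and the Levinson-Durbin recursion. The sketch below assumes that standard technique; the abstract does not say which solver the patent uses, and the function names are illustrative.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor x_hat[n] = sum_i a[i-1] * x[n-i].

    Returns (a, err): the predictor coefficients and the residual energy.
    """
    a = []
    err = r[0]
    for m in range(order):
        # Reflection coefficient for model order m + 1.
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k = acc / err
        # Update lower-order coefficients, then append the new one.
        a = [a[i] - k * a[m - 1 - i] for i in range(m)] + [k]
        err *= 1 - k * k
    return a, err


# A decaying exponential is predicted almost perfectly by a first-order filter.
signal = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(signal, 2), order=2)
```

For this test signal the first coefficient comes out close to 0.9 and the second close to zero, which is what a first-order recursion x[n] = 0.9 x[n-1] should yield.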
17 Claims
1. A trellis excited linear predictive coder, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6
7. A trellis excited linear predictive coding method for processing digital speech signals, said digital speech signals partitioned into frames of a first predetermined length, each frame partitioned into subframes of a second predetermined length, each subframe partitioned into a third predetermined number of subblocks of a fourth length, said method comprising the steps of:
(a) performing a linear predictive analysis of an input digital speech signal to create frame linear prediction parameters characterizing the short-time speech signal spectrum for successive frames;
(b) interpolating said frame linear prediction parameters to create subframe linear prediction parameters for successive subframes;
(c) generating predistorted speech vectors for each of the successive subframes of said input digital speech signal;
(d) performing long term prediction analysis of said predistorted speech vector for determination of long term prediction parameters and for generating a scaled pitch component for each of the successive subframes;
(e) removing the scaled pitch component from said predistorted speech vector to produce decoder input vector u for each of the successive subframes;
(f) trellis decoding said decoder input vector, said decoder input vector partitioned into a succession of speech subblocks u = (u_1, u_2, ..., u_t, ..., u_l), where the speech subblock u_t, 1 ≤ t ≤ l, is processed at the trellis level t, for generating trellis gain g_T and trellis path index I_T for each of the successive subframes;
(g) said g_T and I_T identifying an excitation vector which is being used as an excitation for the decoder synthesis filter (DSF) and which produces a synthesized vector approximating in a predefined sense decoder input vector u; and
(h) trellis encoding said trellis path index for generating a trellis code word for each of the successive subframes according to a predetermined trellis structure and a list of trellis edge subblocks stored in a trellis code book.
Dependent claims: 8, 9
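Steps (f) and (g) of claim 7 describe a level-by-level search through a trellis whose edges carry subblocks. A heavily simplified sketch of such a search is given below as a Viterbi-style pass. It rests on several assumptions that are not in the claim: the decoder synthesis filter is ignored (treated as identity), all edge subblocks have equal energy so that maximizing correlation is equivalent to minimizing squared error at the optimal gain, and the two-state trellis, its edge labels and the function name are hypothetical.

```python
def trellis_search(u_subblocks, trellis, start_state=0):
    """Find the trellis path whose edge subblocks correlate best with u.

    trellis[state] is a list of (next_state, edge_subblock) branches.
    Returns (gain, branch_choices), where gain = <u, c> / <c, c> for the
    chosen path c. Assumes equal-energy edges and no synthesis filter.
    """
    # survivors: state -> (correlation so far, energy so far, branch choices)
    survivors = {start_state: (0.0, 0.0, [])}
    for u_t in u_subblocks:  # one trellis level per speech subblock
        nxt = {}
        for state, (corr, energy, path) in survivors.items():
            for branch, (to_state, edge) in enumerate(trellis[state]):
                c = corr + sum(a * b for a, b in zip(u_t, edge))
                e = energy + sum(b * b for b in edge)
                # Keep only the best-correlated path entering each state.
                if to_state not in nxt or c > nxt[to_state][0]:
                    nxt[to_state] = (c, e, path + [branch])
        survivors = nxt
    corr, energy, path = max(survivors.values(), key=lambda s: s[0])
    return corr / energy, path


# Hypothetical 2-state trellis with length-2 edge subblocks.
toy_trellis = {
    0: [(0, [1, 1]), (1, [1, -1])],
    1: [(0, [-1, 1]), (1, [-1, -1])],
}
# u equals 2.0 times the path taking branches 1, 0, 0 from state 0.
u = [[2, -2], [-2, 2], [2, 2]]
gain, path_index = trellis_search(u, toy_trellis)
```

Here `gain` plays the role of the trellis gain g_T and `path_index` the role of the trellis path index I_T; a real coder would pass candidate excitations through the synthesis filter before comparing them with u.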
10. A trellis excited linear predictive synthesizer for generating synthesized speech signals from a binary stream, said binary stream comprising encoded successive subframes of encoded speech signals, each of said successive subframes including an adaptive code book (ACB) index value, an ACB gain value, a trellis code book index value, a trellis code book gain value and a side information parameter for successive subframes, said trellis excited linear predictive synthesizer comprising:
a parsing means for receiving a binary stream and parsing out component parts of encoded successive subframes;
pitch generation means for generating a scaled ACB pitch excitation signal from said adaptive code book index value, said adaptive code book gain value and side information parameter for successive subframes;
trellis code word generation means for generating scaled trellis code words from said trellis code book index value, said trellis code book gain value and said side information parameter;
combining means for combining said scaled trellis code words with said scaled ACB pitch excitation signal to create an excitation vector for a processed subframe; and
a linear synthesis filter means coupled to said combining means, said linear synthesis filter means for transforming an excitation vector into a synthesized speech signal.
Dependent claims: 11
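The combining and synthesis stages of claim 10 can be sketched as follows: a scaled pitch contribution taken from past excitation is added to a scaled trellis code word, and the sum drives an all-pole synthesis filter. The function names, the treatment of the ACB index as a pitch lag, and the omission of the side information parameter are assumptions made for this illustration.

```python
def acb_vector(past_excitation, lag, length):
    """Build one subframe by repeating the excitation from `lag` samples back.

    Assumes 1 <= lag <= len(past_excitation).
    """
    buf = list(past_excitation)
    for _ in range(length):
        buf.append(buf[-lag])  # sample n copies sample n - lag
    return buf[len(past_excitation):]


def synthesize_subframe(past_excitation, lag, acb_gain,
                        code_word, trellis_gain, lpc, filter_memory):
    """Return (excitation, speech, new_filter_memory) for one subframe.

    lpc holds predictor coefficients a_1..a_p; filter_memory holds the
    last p output samples, most recent first.
    """
    pitch = acb_vector(past_excitation, lag, len(code_word))
    excitation = [acb_gain * p + trellis_gain * c for p, c in zip(pitch, code_word)]
    speech, mem = [], list(filter_memory)
    for e in excitation:
        s = e + sum(a * m for a, m in zip(lpc, mem))  # all-pole synthesis
        speech.append(s)
        mem = [s] + mem[:-1]
    return excitation, speech, mem


excitation, speech, memory = synthesize_subframe(
    past_excitation=[1.0, 0.0, 0.0], lag=3, acb_gain=0.5,
    code_word=[1, -1, 1, -1], trellis_gain=2.0,
    lpc=[0.5], filter_memory=[0.0],
)
```

In a running decoder the returned excitation would be appended to `past_excitation` and `memory` carried into the next subframe.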
12. A trellis excited linear predictive coder for processing digital speech signals partitioned into frames of a first predetermined length, where each frame is partitioned into subframes of a second predetermined length and each subframe is partitioned into a third predetermined number of subblocks, each of said subblocks of a fourth predetermined length, said coder comprising:
a linear predictive analyzer responsive to a speech signal, said linear predictive analyzer for generating frame linear prediction parameters, said frame linear prediction parameters characterizing the short-time speech signal spectrum for successive frames;
an interpolation module configured to interpolate said frame linear prediction parameters to produce subframe linear prediction parameters for successive subframes of a frame;
a ringing removal and perceptual weighting unit configured to produce predistorted speech vectors for successive subframes;
a long term prediction analyzer coupled to said ringing removal and perceptual weighting unit to receive said predistorted speech vectors for each of the successive subframes, said long term prediction analyzer for generating long term prediction parameters and a scaled pitch component for the successive subframes;
a feedback loop configured to remove scaled pitch components from said predistorted speech vectors to produce decoder input vectors for the successive subframes;
a trellis decoder for generating trellis gain and trellis path indexes for the successive subframes, said trellis decoder coupled to said feedback loop to receive said decoder input vectors, said decoder input vectors partitioned into a succession of speech subblocks, each of said speech subblocks being processed at a corresponding trellis level;
a trellis encoder storage having stored therein a predetermined trellis structure and list of trellis edge subblocks; and
a trellis encoder coupled to said trellis decoder to receive said trellis path indexes, said trellis encoder for generating trellis code words for the successive subframes according to said predetermined trellis structure and the list of trellis edge subblocks.
Dependent claims: 13, 14, 15
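The long term prediction analyzer and the pitch-removal feedback loop of claim 12 can be illustrated with a small lag search. The sketch assumes the usual criterion of maximizing squared correlation over energy, restricts lags to at least one subframe so no periodic extension is needed, and skips the weighted synthesis filtering a real coder would apply to each candidate. The function name and test data are hypothetical.

```python
def ltp_analysis(target, past_excitation, min_lag, max_lag):
    """Pick the pitch lag and gain that best match `target`, then remove them.

    Returns (lag, gain, residual), where residual = target - gain * candidate
    corresponds to the decoder input vector. Assumes
    len(target) <= min_lag <= max_lag <= len(past_excitation).
    """
    n = len(target)
    best = None
    for lag in range(min_lag, max_lag + 1):
        start = len(past_excitation) - lag
        cand = past_excitation[start:start + n]
        corr = sum(t * c for t, c in zip(target, cand))
        energy = sum(c * c for c in cand)
        if energy == 0:
            continue  # silent candidate carries no pitch information
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, lag, corr / energy, cand)
    if best is None:
        return min_lag, 0.0, list(target)
    _, lag, gain, cand = best
    residual = [t - gain * c for t, c in zip(target, cand)]
    return lag, gain, residual


past = [0.0, 3.0, -1.0, 2.0, 1.0, -2.0, 0.5, 4.0]
# Target is exactly 0.5 times the segment that starts 6 samples back.
lag, gain, residual = ltp_analysis([-0.5, 1.0, 0.5], past, min_lag=3, max_lag=8)
```

The scaled pitch component is `gain` times the selected candidate; subtracting it leaves the vector that would be handed to the trellis decoder.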
16. A trellis excited linear predictive synthesizer for generating synthesized speech signals from a binary stream, said binary stream comprising encoded successive subframes of encoded speech signals, each of said successive subframes including an adaptive code book (ACB) index value, an ACB gain value, a trellis code book index value, a trellis code book gain value and a side information parameter for successive subframes, said trellis excited linear predictive synthesizer comprising:
a parsing unit configured to receive a binary stream, said parsing unit parsing out component parts of encoded successive subframes;
a pitch generator configured to produce a scaled ACB pitch excitation signal from said ACB index value, said ACB gain value and said side information parameter for successive subframes;
a trellis code word unit configured to generate scaled trellis code words from said trellis code book index value, said trellis code book gain value and said side information parameter;
a combination unit for combining said scaled trellis code words with said scaled ACB pitch excitation signal to create an excitation vector for a processed subframe; and
a linear synthesis filter coupled to said combination unit, said linear synthesis filter configured to transform an excitation vector into a synthesized speech signal.
Dependent claims: 17
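The parsing unit of claim 16 splits each encoded subframe into its five component values. The sketch below shows one way to do this for fixed-width fields. The field names follow the claim, but the bit widths, the field order and the bit-string input format are purely hypothetical, since the claim does not specify a bitstream layout.

```python
# Hypothetical layout: (field name, width in bits). Not taken from the patent.
SUBFRAME_FIELDS = (
    ("acb_index", 7),
    ("acb_gain", 3),
    ("trellis_index", 10),
    ("trellis_gain", 4),
    ("side_info", 2),
)


def parse_subframe(bits, fields=SUBFRAME_FIELDS):
    """Parse one subframe from a string of '0'/'1' characters.

    Returns (values, remaining_bits), where values maps field names to ints.
    """
    values, pos = {}, 0
    for name, width in fields:
        chunk = bits[pos:pos + width]
        if len(chunk) < width:
            raise ValueError("bit stream ended inside field " + name)
        values[name] = int(chunk, 2)  # most significant bit first
        pos += width
    return values, bits[pos:]


stream = "0000101" + "011" + "0000000111" + "1001" + "10" + "11"
subframe, rest = parse_subframe(stream)
```

Calling `parse_subframe` repeatedly on the returned remainder would walk through successive subframes of the stream.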
Specification