Method and system for sub-band hybrid coding
Abstract
A system and method are provided for processing audio and speech signals using a pitch- and voicing-dependent spectral estimation algorithm (voicing algorithm) to accurately represent, with a single model, voiced speech, unvoiced speech, and mixed speech in the presence of background noise, as well as the background noise itself. The present invention also modifies the synthesis model based on an estimate of the current input signal to improve the perceptual quality of the speech and background noise under a variety of input conditions. The present invention further improves the robustness of the voicing-dependent spectral estimation algorithm by using a multi-layer neural network in the estimation process. The voicing-dependent spectral estimation algorithm provides an accurate and robust estimate of the voicing probability under a variety of background noise conditions, which is essential to providing high-quality, intelligible speech in the presence of background noise. In one embodiment, the waveform coding is implemented by separating the input signal into at least two sub-band signals, encoding one of the at least two sub-band signals using a first encoding algorithm to produce at least one encoded output signal, and encoding another of said at least two sub-band signals using a second encoding algorithm to produce at least one other encoded output signal, where the first encoding algorithm is different from the second encoding algorithm. In accordance with the described embodiment, the present invention provides an encoder that codes N user-defined sub-band signals in the baseband with one of a plurality of waveform coding algorithms, and encodes N user-defined sub-band signals with one of a plurality of parametric coding algorithms. That is, the selected waveform/parametric encoding algorithm may be different in each sub-band.
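The sub-band embodiment in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the moving-average band split, the uniform PCM quantizer standing in for the waveform coder, the per-frame RMS value standing in for the parametric coder, and all function names.

```python
import math

def split_two_bands(x, taps=5):
    """Crude two-band split (assumed filter): moving-average low band, residual high band."""
    half = taps // 2
    low = []
    for n in range(len(x)):
        window = x[max(0, n - half): n + half + 1]
        low.append(sum(window) / len(window))
    high = [xi - li for xi, li in zip(x, low)]
    return low, high

def encode_waveform_pcm(band, step=0.05):
    """Waveform-coder stand-in: uniform quantization of every sample."""
    return [round(v / step) for v in band]

def encode_parametric_rms(band, frame=8):
    """Parametric-coder stand-in: one RMS value per frame, no waveform detail."""
    params = []
    for start in range(0, len(band), frame):
        chunk = band[start:start + frame]
        params.append(math.sqrt(sum(v * v for v in chunk) / len(chunk)))
    return params

def hybrid_encode(x):
    """Encode each sub-band with a different algorithm."""
    low, high = split_two_bands(x)
    return {"baseband_pcm": encode_waveform_pcm(low),
            "highband_rms": encode_parametric_rms(high)}
```

The point of the sketch is only the structure: the two sub-bands sum back to the input, and each one is handed to a different kind of coder.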
36 Claims
-
1. A system for processing an input signal, the system comprising:
-
means for separating the input signal into at least two sub-band signals;
first means for encoding one of said at least two sub-band signals using a first encoding algorithm to produce at least one encoded output signal, said first means for encoding further comprising means for detecting a gain mismatch between said at least two sub-band signals; and
means for adjusting said gain mismatch detected by said detecting means; and
second means for encoding another of said at least two sub-band signals using a second encoding algorithm to produce at least one other encoded output signal, where said first encoding algorithm is different from said second encoding algorithm.
means for receiving and substantially reconstructing said at least two sub-band signals from said multiplexed encoded output signal; and
means for combining said substantially reconstructed said at least two sub-band signals to substantially reconstruct said input signal.
-
-
7. The system of claim 6, wherein said means for combining further comprises means for maintaining waveform phase alignment between said at least one encoded output signal from said first means for encoding with said one other encoded output signal from said second means for encoding.
-
8. The system of claim 6, wherein said means for reconstructing further comprises:
-
means for decoding said at least one encoded output signal at a first sampling rate using a first decoding algorithm; and
means for decoding said at least one other encoded output signal at a second sampling rate using a second decoding algorithm.
-
-
9. The system of claim 8, wherein said means for reconstructing further comprises means for adjusting one of said first and second sampling rates such that said first sampling rate is equal to said second sampling rate.
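As a hedged illustration of equalizing two decoder sampling rates before combining, the sketch below upsamples the lower-rate signal by an integer factor. Linear interpolation is an assumed stand-in for a proper interpolation filter, and the function names are hypothetical.

```python
def upsample_linear(x, factor):
    """Upsample by an integer factor with linear interpolation (assumed method)."""
    out = []
    for n, cur in enumerate(x):
        nxt = x[n + 1] if n + 1 < len(x) else cur  # hold the final sample
        for k in range(factor):
            out.append(cur + (nxt - cur) * k / factor)
    return out

def combine_at_common_rate(low_band, low_rate, high_band, high_rate):
    """Bring the low-rate signal up to the high rate, then add sample by sample."""
    if high_rate % low_rate != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    low_up = upsample_linear(low_band, high_rate // low_rate)
    n = min(len(low_up), len(high_band))
    return [low_up[i] + high_band[i] for i in range(n)]
```

For example, a 4 kHz signal `[0.0, 2.0]` becomes `[0.0, 1.0, 2.0, 2.0]` at 8 kHz before it is added to the other decoded signal.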
-
10. The system of claim 1, wherein said first means for encoding is a waveform encoder.
-
11. The system of claim 10, wherein said waveform encoder is selected from the group consisting of at least a pulse code modulation (PCM) encoder, adaptive differential PCM encoder, code excited linear prediction (CELP) encoder, relaxed CELP encoder and transform coding encoder.
-
12. The system of claim 1, wherein said second means for encoding is a parametric encoder.
-
13. The system of claim 12, wherein said parametric encoder is selected from the group consisting of at least a sinusoidal transform encoder, harmonic encoder, multi band excitation vocoder (MBE) encoder, mixed excitation linear prediction (MELP) encoder and waveform interpolation encoder.
-
14. A system for processing an input signal, the system comprising:
-
a hybrid encoder comprising;
means for separating the input signal into a first signal and a second signal;
means for detecting a gain mismatch between said first signal and said second signal;
means for adjusting for said gain mismatch detected by said detecting means;
means for processing the first signal to derive a baseband signal;
means for encoding the baseband signal using a relaxed code excited linear prediction (RCELP) encoder to derive a baseband RCELP encoded signal;
means for encoding the second signal using a harmonic encoder to derive a harmonic encoded signal; and
means for multiplexing said baseband RCELP encoded signal with said harmonic encoded signal to form a multiplexed hybrid encoded signal.
a decoder comprising;
means for substantially reconstructing said first and second signals from said multiplexed hybrid encoded signal; and
means for combining said substantially reconstructed first and second signals to substantially reconstruct said input signal.
-
-
18. The system of claim 17, wherein said means for substantially reconstructing further comprises:
-
means for decoding said first signal at a first sampling rate using a first decoding algorithm; and
means for decoding said second signal at a second sampling rate using a second decoding algorithm.
-
-
19. The system of claim 18, wherein said means for reconstructing further comprises means for adjusting one of said first and second sampling rates such that said first sampling rate is equal to said second sampling rate.
-
20. The system of claim 17, wherein said combining means further comprises means for maintaining waveform phase alignment.
-
21. The system of claim 17, wherein said means for decoding further comprises:
means for detecting a gain mismatch between said first and second signals; and
means for adjusting for said gain mismatch detected by said detecting means.
-
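One simple way to picture gain-mismatch detection and adjustment between two decoded signals is an RMS comparison followed by a clamped scale factor. This is a minimal sketch under assumptions: the claims do not state how the mismatch is measured, and the RMS ratio, the clamp limit, and the function names are all illustrative.

```python
import math

def rms(x):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0

def detect_gain_mismatch(reference, other, eps=1e-12):
    """Return rms(reference) / rms(other); 1.0 means the levels already match."""
    return rms(reference) / max(rms(other), eps)

def adjust_gain(other, ratio, max_ratio=4.0):
    """Scale `other` by the detected ratio, clamped to an assumed safe range."""
    g = min(max(ratio, 1.0 / max_ratio), max_ratio)
    return [g * v for v in other]
```

In a hybrid decoder the comparison would presumably be made over a frequency range that both decoders reproduce, so that the two levels are comparable; that choice is left open here.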
22. A hybrid encoder for encoding audio and speech signals, the hybrid encoder comprising:
-
means for separating an input signal into a first signal and a second signal;
means for detecting a gain mismatch between said first signal and a second signal;
means for adjusting for said gain mismatch detected by said detecting means;
means for processing the first signal to derive a baseband signal;
means for encoding said baseband signal using a relaxed code excited linear prediction (RCELP) encoder to derive a baseband RCELP encoded signal;
means for encoding the second signal using a harmonic encoder to derive a harmonic encoded signal; and
means for combining said baseband RCELP encoded signal with said harmonic encoded signal to form a combined hybrid encoded signal.
means for high-pass filtering and buffering an input signal comprised of a plurality of consecutive frames to derive a preprocessed signal, ps(m);
means for analyzing a current frame and at least one previously received frame from among said plurality of frames to derive a pitch period estimate;
means for analyzing said pre-processed signal, ps(m), and said pitch period estimate to estimate a voicing cutoff frequency and to derive an all-pole model of the frequency response of the current speech frame dependent on said pitch period estimate, said voicing cutoff frequency, and ps(m);
means for outputting a line spectral frequency (LSF) representation of the all-pole model and a frame gain of the current frame; and
means for quantizing said LSF representation, said voicing cutoff frequency, and said frame gain to derive a quantized LSF representation, a quantized voicing cutoff frequency, and a quantized frame gain.
-
-
24. The hybrid encoder of claim 22, wherein said means for encoding said baseband signal using a RCELP encoder comprises:
-
means for deriving a preprocessed signal, shp(m), from said input signal comprised of a plurality of frames where each frame is further comprised of at least two sub-frames;
means for upsampling said pre-processed signal, shp(m) to derive an interpolated baseband signal, is(i), at a first sampling rate;
means for deriving a baseband signal, s(n), at a second sampling rate, wherein said second sampling rate is less than said first sampling rate;
means for refining the pitch period estimate to derive a refined pitch period estimate;
means for quantizing the refined pitch period estimate to derive a quantized pitch period estimate;
means for linearly interpolating the quantized pitch period estimate to derive a pitch period contour array, ip(i);
means for generating a modified baseband signal, sm(n), having a pitch period contour which tracks the pitch period contour array, ip(i); and
means for controlling a time asynchrony between said baseband signal, s(n), and said modified baseband signal, sm(n).
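The linear interpolation of the quantized pitch period into a contour array, ip(i), can be sketched as follows. The sketch assumes the contour runs from the previous frame's quantized pitch to the current frame's value, one entry per sample; that endpoint convention and the function name are assumptions, not details from the claim.

```python
def pitch_contour(prev_pitch, cur_pitch, frame_len):
    """Linearly interpolate the pitch period across one frame.

    The last entry equals cur_pitch, so consecutive frames join continuously.
    """
    return [prev_pitch + (cur_pitch - prev_pitch) * (i + 1) / frame_len
            for i in range(frame_len)]
```

With a previous pitch of 40 samples, a current pitch of 44 samples, and a 4-sample frame, the contour is `[41.0, 42.0, 43.0, 44.0]`.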
-
-
25. The hybrid encoder of claim 24, wherein said second sampling rate is a Nyquist rate.
-
26. The hybrid encoder of claim 24, wherein the means for refining the pitch period estimate further comprises means for using a window centered at the end of one of said plurality of frames having a window length equal to one of the pitch period estimate and an amount bounded by a look-ahead output of the hybrid encoder.
-
27. The hybrid encoder of claim 24, wherein said means for deriving said baseband signal, s(n), at said second sampling rate comprises decimating said interpolated baseband signal, is(i), at said second sampling rate.
-
28. The hybrid encoder of claim 24, wherein said means for refining the pitch period estimate comprises:
-
means for receiving said pitch period estimate from said harmonic encoder;
means for constructing a search window encompassing said pitch period estimate; and
means for searching within said search window for determining an optimal time lag which maximizes a normalized correlation function of the signal, shp(m).
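The search described in claim 28 can be sketched as an exhaustive scan of integer lags around the initial estimate, keeping the lag with the highest normalized correlation. The window half-width, the minimum lag, and the function names are illustrative assumptions.

```python
import math

def normalized_correlation(x, lag):
    """Normalized correlation between x[n] and x[n - lag]."""
    num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    e0 = sum(x[n] ** 2 for n in range(lag, len(x)))
    e1 = sum(x[n - lag] ** 2 for n in range(lag, len(x)))
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0

def refine_pitch(x, pitch_estimate, half_width=3, min_lag=2):
    """Search a window around pitch_estimate for the best integer lag."""
    lo = max(min_lag, pitch_estimate - half_width)
    hi = min(len(x) - 1, pitch_estimate + half_width)
    return max(range(lo, hi + 1), key=lambda lag: normalized_correlation(x, lag))
```

For a sinusoid with a true period of 20 samples and an initial estimate of 18, the search window covers lags 15 to 21 and the refined lag is 20.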
-
-
29. The hybrid encoder of claim 24, further comprising means for generating an adaptive codebook vector, v(n), based on a previously quantized excitation signal, u(n).
-
30. The hybrid encoder of claim 29, wherein the means for generating said adaptive codebook vector, v(n), comprises:
-
means for determining a last pitch period cycle of said quantized excitation signal, u(n);
means for stretching/compressing the time scale of the last pitch period cycle of said previously quantized excitation signal, u(n); and
means for copying said stretched/compressed last pitch period cycle in a current subframe according to said pitch period contour array, ip(i).
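A common way to realize this kind of time-scaled copy is a delay line whose delay follows the pitch contour: each new sample is read one (possibly fractional) pitch period back from the excitation buffer. The sketch below takes that approach as an assumption; the linear interpolation for fractional positions and the function name are illustrative and not taken from the claim.

```python
import math

def adaptive_codebook_vector(past_excitation, contour):
    """Build v(n) by reading the excitation one time-varying pitch lag back.

    contour[n] is the pitch lag, in samples, for sample n of the subframe.
    """
    buf = list(past_excitation)
    start = len(buf)
    for n, lag in enumerate(contour):
        pos = start + n - lag              # read position, may be fractional
        if lag < 1 or pos < 0:
            raise ValueError("lag must be >= 1 and within the stored history")
        i = int(math.floor(pos))
        frac = pos - i
        nxt = buf[i + 1] if i + 1 < len(buf) else buf[i]
        buf.append((1.0 - frac) * buf[i] + frac * nxt)
    return buf[start:]
```

With a constant integer lag the result is a plain periodic repetition of the last cycle; a rising or falling contour stretches or compresses that cycle as it is copied.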
-
-
31. The hybrid encoder of claim 24, further comprising means for converting an array of quantized line spectral frequency (LSF) coefficients into an array of baseband linear prediction (LPC) coefficients.
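The conversion from line spectral frequencies to linear prediction coefficients follows a standard construction: the LSFs are the interleaved roots of a symmetric polynomial P(z) and an antisymmetric polynomial Q(z), and A(z) = (P(z) + Q(z)) / 2. The sketch below assumes an even prediction order, LSFs in radians sorted in ascending order, and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p; the function names are hypothetical.

```python
import math

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def lsf_to_lpc(lsf):
    """Convert sorted LSFs (radians, even count) to [1, a1, ..., ap]."""
    if len(lsf) % 2 != 0:
        raise ValueError("this sketch assumes an even prediction order")
    p_poly = [1.0, 1.0]     # P(z) carries the fixed root at z = -1
    q_poly = [1.0, -1.0]    # Q(z) carries the fixed root at z = +1
    for k, w in enumerate(lsf):
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        if k % 2 == 0:      # 1st, 3rd, ... LSFs are roots of P(z)
            p_poly = poly_mul(p_poly, factor)
        else:               # 2nd, 4th, ... LSFs are roots of Q(z)
            q_poly = poly_mul(q_poly, factor)
    order = len(lsf)
    # The z^-(p+1) terms of P and Q cancel, so only p + 1 coefficients remain.
    return [0.5 * (p + q) for p, q in zip(p_poly, q_poly)][:order + 1]
```

As a check, the second-order filter A(z) = 1 - 0.9 z^-1 + 0.2 z^-2 has LSFs arccos(0.85) and arccos(0.05), and the function recovers its coefficients from them.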
-
32. The hybrid encoder of claim 31, wherein the LPC array is used to derive coefficients associated with a perceptual weighting filter, and is further used to update coefficients associated with a short-term synthesis filter.
-
33. The hybrid encoder of claim 24, further comprising means for finding an optimal combination of fixed codebook pulse locations and pulse signs which minimizes the energy of a weighted coding error signal, ew(n), within a current subframe.
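The pulse search in claim 33 can be pictured with a small exhaustive search. The sketch assumes an algebraic-style codebook with one pulse per track, a target vector that has already been perceptually weighted, and an impulse response `h` of the weighted synthesis filter; the track layout, the brute-force strategy, and the function names are illustrative assumptions. Practical coders use faster, pruned searches.

```python
from itertools import product

def filter_pulses(positions, signs, h, n):
    """Filtered code vector y(i) = sum over pulses of sign * h(i - position)."""
    y = [0.0] * n
    for m, s in zip(positions, signs):
        for i in range(m, n):
            y[i] += s * h[i - m]
    return y

def search_fixed_codebook(target, h, tracks):
    """Try every position/sign combination; keep the lowest weighted error energy.

    Returns (error_energy, positions, signs, gain), or None if nothing qualifies.
    """
    n = len(target)
    best = None
    for positions in product(*tracks):
        for signs in product((1, -1), repeat=len(tracks)):
            y = filter_pulses(positions, signs, h, n)
            energy = sum(v * v for v in y)
            corr = sum(t * v for t, v in zip(target, y))
            if energy == 0.0 or corr <= 0.0:
                continue  # keep the gain positive so the signs are unambiguous
            gain = corr / energy
            err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
            if best is None or err < best[0]:
                best = (err, positions, signs, gain)
    return best
```

Minimizing the error energy with the optimal gain is equivalent to maximizing the squared correlation divided by the energy of the filtered code vector, which is the criterion most CELP-family searches evaluate directly.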
-
34. The hybrid encoder of claim 24, further comprising means for calculating and quantizing adaptive and fixed codebook gains.
-
35. A hybrid decoder for decoding a hybrid encoded signal, the decoder comprising:
-
processing means comprising;
means for receiving a hybrid encoded bit-stream from a communication channel;
means for demultiplexing the received bit-stream into a plurality of bit-stream groups according to at least one quantizing parameter;
means for unpacking the plurality of bit-stream groups into quantizer output indices;
means for decoding the quantizer output indices into quantized parameters; and
means for providing the quantized parameters to a relaxed code excited linear prediction (RCELP) decoder to decode a baseband RCELP output signal, said quantized parameters further being provided to a harmonic decoder to decode a full-band harmonic signal;
means for detecting a gain mismatch between said baseband RCELP output signal and said full-band harmonic signal;
means for adjusting for said gain mismatch detected by said detecting means; and
means for combining outputs from said RCELP decoder and said harmonic decoder to provide a full-band output signal.
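The front end of such a decoder (demultiplexing the bit-stream, unpacking quantizer indices, and mapping indices to quantized parameters) can be sketched as below. The MSB-first bit layout, the field widths, and the lookup-table codebooks are hypothetical; the claim does not specify a bit-stream format.

```python
def unpack_indices(payload, field_widths):
    """Split a byte string into unsigned integers of the given bit widths, MSB first."""
    if sum(field_widths) > 8 * len(payload):
        raise ValueError("payload is too short for the requested fields")
    value = int.from_bytes(payload, "big")
    shift = 8 * len(payload)
    indices = []
    for width in field_widths:
        shift -= width
        indices.append((value >> shift) & ((1 << width) - 1))
    return indices

def decode_parameters(indices, codebooks):
    """Map each quantizer index to its quantized value via a lookup table."""
    return [codebooks[k][idx] for k, idx in enumerate(indices)]
```

For example, the two bytes `0b10110100 0b11000000` with field widths 3, 5, and 2 bits unpack to the indices 5, 20, and 3; the remaining bits are padding.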
-
Specification