Speech coder and method having spectral interpolation and fast codebook search
First Claim
1. A method for reconstructing a signal that has been partitioned into successive time interval partitions, each time interval signal partition having a representative input reference signal with a set of vectors, and having at least a first representative electrical signal for each representative input reference signal of each time interval signal partition, the method utilizing at least a codebook unit having at least a codebook memory, a synthesis unit having at least a first synthesis filter, a combiner, and a perceptual weighting unit having at least a first perceptual weighting filter, for utilizing the electrical signals of the representative input reference signals to at least generate a related set of synthesized signal vectors for reconstructing the signal, the method comprising the steps of:
(1A) utilizing the at least first representative electrical signal for each representative input reference signal for a time signal partition to obtain a set of uninterpolated parameters for the at least first synthesis filter;
(1B) utilizing the at least first synthesis filter to obtain the corresponding impulse response representation, and interpolating the impulse responses of each adjacent time signal partition and of a current time signal partition immediately thereafter to provide a set of interpolated synthesis filters for desired subpartitions; and
utilizing the interpolated synthesis filters to provide a corresponding set of interpolated perceptual weighting filters for desired subpartitions;
such that smooth transitions of the synthesis filter and the perceptual weighting filter between each pair of adjacent partitions are obtained;
(1C) utilizing the set of input reference signal vectors, the related set of interpolated synthesis filters and the related set of interpolated perceptual weighting filters for the current time signal partition to select the corresponding set of optimal excitation codevectors from the at least first codebook memory, further implementing the following steps for each desired input reference signal vector;
(1C1) providing a particular excitation codevector which is associated with a particular index from the at least first codebook memory, the codebook memory having a set of excitation codevectors stored therein responsive to the representative input vectors;
(1C2) inputting the particular excitation codevector into the corresponding interpolated synthesis filter to produce the synthesized signal vector;
(1C3) subtracting the synthesized signal vector from the input reference signal vector related thereto to obtain a corresponding reconstruction error vector;
(1C4) inputting the reconstruction error vector into the corresponding interpolated perceptual weighting unit to determine a corresponding perceptually weighted squared error;
(1C5) determining and storing index of codevector having the perceptually weighted squared error smaller than all other errors produced by other codevectors;
(1C6) repeating the steps (1C1), (1C2), (1C3), (1C4), and (1C5) for every excitation codevector in the codebook memory and implementing these steps utilizing a fast codebook search method, to determine an optimal excitation codevector for producing the minimum weighted squared error among all excitation codevectors for the related input reference signal vector; and
(D) successively inputting the set of optimal excitation codevectors into the corresponding set of interpolated synthesis filters to produce the related set of synthesized signal vectors for the given input reference signal for reconstructing the input signal.
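Steps (1C1) through (1C6) describe an analysis-by-synthesis search over the codebook. The following is a minimal sketch of that loop, not the patented method: it uses a plain exhaustive search rather than the "fast codebook search method" the claim refers to (which the claim text does not spell out), it represents each filter by a truncated impulse response, and it ignores filter memory and gain. The names `filter_zero_state` and `search_codebook` are hypothetical.

```python
import numpy as np

def filter_zero_state(h, x):
    """Zero-state filtering of x by a filter given as a truncated impulse response h."""
    return np.convolve(x, h)[: len(x)]

def search_codebook(target, codebook, h_synth, h_weight):
    """Return (index, error) of the codevector with the smallest weighted squared error.

    Assumption: both the synthesis filter and the perceptual weighting filter
    are approximated by finite impulse responses (h_synth, h_weight).
    """
    best_index, best_error = -1, np.inf
    for index, codevector in enumerate(codebook):           # (1C1) fetch codevector by index
        synthesized = filter_zero_state(h_synth, codevector)  # (1C2) synthesis filter
        error_vec = target - synthesized                     # (1C3) reconstruction error
        weighted = filter_zero_state(h_weight, error_vec)    # (1C4) perceptual weighting
        error = float(np.dot(weighted, weighted))            #       squared error
        if error < best_error:                               # (1C5) keep the best index so far
            best_index, best_error = index, error
    return best_index, best_error                            # (1C6) all codevectors visited

# Usage with made-up data: the target is built from codevector 5, so the search finds it.
rng = np.random.default_rng(0)
codebook = rng.standard_normal((8, 10))
h_synth = 0.9 ** np.arange(10)
h_weight = np.array([1.0, -0.5])
target = filter_zero_state(h_synth, codebook[5])
idx, err = search_codebook(target, codebook, h_synth, h_weight)
```

In a real coder the target vector would come from the input speech, and the loop would be restructured to avoid refiltering every codevector from scratch.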
Abstract
A novel spectral interpolation and efficient excitation codebook search method developed for a Code-Excited Linear Predictive (CELP) speech coder is set forth. The interpolation is performed on the impulse response of the spectral synthesis filter. As a result of using this new set of interpolation parameters, the computations associated with an excitation codebook search in a CELP coder are considerably reduced. Furthermore, a coder utilizing this new interpolation approach provides a noticeable improvement in the quality of speech coded at low bit rates.
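The abstract does not say why interpolating impulse responses reduces search cost. One plausible reading, offered here as an assumption, is that zero-state filtering is linear in the impulse response: a codevector can be filtered once through each of the two partition filters, and the output for any interpolated subpartition filter is then just a weighted sum of those two outputs. The sketch below checks that identity numerically; the impulse responses, the codevector, and the helper names are all made up for illustration.

```python
import numpy as np

def zero_state_response(h, x):
    """Zero-state output of a filter with truncated impulse response h driven by x."""
    return np.convolve(x, h)[: len(x)]

def interpolated_response(resp_prev, resp_curr, alpha):
    """Weighted sum of two already-filtered outputs, with beta = 1 - alpha."""
    return alpha * resp_prev + (1.0 - alpha) * resp_curr

rng = np.random.default_rng(1)
h_prev = 0.8 ** np.arange(12)        # hypothetical impulse response, adjacent partition
h_curr = (-0.6) ** np.arange(12)     # hypothetical impulse response, current partition
codevector = rng.standard_normal(12)

# Filter the codevector once through each partition filter...
resp_prev = zero_state_response(h_prev, codevector)
resp_curr = zero_state_response(h_curr, codevector)

# ...then each subpartition costs only a weighted sum instead of a new filtering pass.
alphas = [0.75, 0.5, 0.25]
combined = [interpolated_response(resp_prev, resp_curr, a) for a in alphas]
direct = [zero_state_response(a * h_prev + (1.0 - a) * h_curr, codevector) for a in alphas]
max_gap = max(float(np.max(np.abs(c - d))) for c, d in zip(combined, direct))
```

The two routes agree to floating-point precision, which is the property a search procedure could exploit; whether the patent's fast search does exactly this is not stated in the text above.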
75 Claims
1. (Claim 1 is set out in full under "First Claim" above.)
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
17. A method for reconstructing a speech signal pattern in a digital speech coder, the signal being partitioned into successive time intervals, each time interval signal partition having a representative input reference signal with a set of vectors, and having at least a first representative electrical signal for each representative input reference signal of each time interval signal partition, the method utilizing at least a codebook unit having at least a codebook memory, a gain adjuster where selected, a synthesis unit having at least a first synthesis filter, a combiner, and a perceptual weighting unit having at least a first perceptual weighting filter, for utilizing the electrical signals of the representative input reference signals to at least generate a related set of synthesized signal vectors for reconstructing the signal, the method comprising the steps of:
(17A) utilizing the at least first representative electrical signal for each representative input reference signal for a time signal partition to obtain a set of uninterpolated parameters for the at least first synthesis filter;
(17B) utilizing the at least first synthesis filter to obtain the corresponding impulse response representation, and interpolating the impulse responses of each adjacent time signal partition and of a time signal partition immediately thereafter to provide a set of interpolated synthesis filters for desired subpartitions; and
utilizing the interpolated synthesis filters to provide a corresponding set of interpolated perceptual weighting filters for desired subpartitions;
such that smooth transitions of the synthesis filter and the perceptual weighting filter between each pair of adjacent partitions are obtained;
(17C) utilizing the set of input reference signal vectors, the related set of interpolated synthesis filters and the related set of interpolated perceptual weighting filters for the current time signal partition to select the corresponding set of optimal excitation codevectors from the at least first codebook memory, further implementing the following steps for each desired input reference signal vector;
(17C1) providing a particular excitation codevector which is associated with a particular index from the at least first codebook memory, the codebook memory having a set of excitation codevectors stored therein responsive to the representative input vectors;
(17C2) inputting the particular excitation codevector into the corresponding interpolated synthesis filter to produce the synthesized signal vector;
(17C3) subtracting the synthesized signal vector from the input reference signal vector related thereto to obtain a corresponding reconstruction error vector;
(17C4) inputting the reconstruction error vector into the corresponding interpolated perceptual weighting unit to determine a corresponding perceptually weighted squared error;
(17C5) determining and storing index of codevector having the perceptually weighted squared error smaller than all other errors produced by other codevectors;
(17C6) repeating the steps (17C1), (17C2), (17C3), (17C4), and (17C5), for every excitation codevector in the codebook memory and implementing these steps utilizing a fast codebook search method, to determine an optimal excitation codevector for producing the minimum weighted squared error among all excitation codevectors for the related input reference signal vector; and
(D) successively inputting the set of optimal excitation codevectors into the corresponding set of interpolated synthesis filters to produce the related set of synthesized signal vectors for the given input reference signal for reconstructing the input signal.
- View Dependent Claims (18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
33. A device for reconstructing a signal, the signal being partitioned into successive time intervals, each time interval signal partition having a representative input reference signal with a set of vectors, and having at least a first representative electrical signal for each representative input reference signal of each time interval signal partition, for utilizing the electrical signals of the representative input reference signals to at least generate a related set of synthesized signal vectors for reconstructing the signal, the device comprising at least:
(33A) a first synthesis unit, responsive to the at least first representative electrical signal for each representative input reference signal, for utilizing the at least first representative electrical signal for each representative input reference signal for a time signal partition to obtain a set of uninterpolated parameters for the at least first synthesis filter and the impulse response of this synthesis filter, and for interpolating the impulse responses of each adjacent time signal partition and of a current time signal partition immediately thereafter to provide a set of interpolated synthesis filters for desired subpartitions; and
utilizing the interpolated synthesis filters to provide a corresponding set of interpolated perceptual weighting filters to at least a first perceptual weighting unit for desired subpartitions such that the at least first perceptual weighting unit provides at least a first perceptually weighted squared error and such that smooth transitions of the synthesis filter and the perceptual weighting filter between each pair of adjacent partitions are obtained;
(33B) a codebook unit, responsive to the set of input reference signal vectors, the related set of interpolated synthesis filters and the related set of interpolated perceptual weighting filters for the current time signal partition, for selecting the corresponding set of optimal excitation codevectors from the at least first codebook memory for each desired input reference signal vector, further comprising at least;
(33B1) a codebook memory, for providing a particular excitation codevector which is associated with a particular index from the at least first codebook memory, the codebook memory having a set of excitation codevectors stored therein responsive to the representative input vectors;
(33B2) an interpolated synthesis filter having a transfer function, responsive to the particular excitation codevector for producing a synthesized signal vector;
(33B3) a combiner, responsive to the synthesized signal vector and to the input reference signal vector related thereto, for subtracting the synthesized signal vector from the input reference signal vector related thereto to obtain a corresponding reconstruction error vector;
(33B4) an interpolated perceptual weighting unit, responsive to the corresponding reconstruction error vector and to the interpolated synthesis filter transfer function, for determining a corresponding perceptually weighted squared error;
(33B5) a selector, responsive to the corresponding perceptually weighted squared error for determining and storing an index of a codevector having the perceptually weighted squared error smaller than all other errors produced by other codevectors;
(33B6) repetition means, responsive to the number of excitation codevectors in the codebook memory, for repeating the steps (33B1), (33B2), (33B3), (33B4), and (33B5) for every excitation codevector in the codebook memory and for implementing these steps utilizing a fast codebook search method, to determine an optimal excitation codevector for producing the minimum weighted squared error among all excitation codevectors for the related input reference signal vector; and
(33C) codebook unit control means, responsive to the set of optimal excitation codevectors for successively inputting the set of optimal excitation codevectors into the corresponding set of interpolated synthesis filters to produce the related set of synthesized signal vectors for the given input reference signal for reconstructing the input signal.
- View Dependent Claims (34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48)
49. A device for reconstructing a speech signal in a digital speech coder, the signal being partitioned into successive time intervals, each time interval signal partition having a representative input reference signal with a set of vectors, and having at least a first representative electrical signal for each representative input reference signal of each time interval signal partition, for utilizing the electrical signals of the representative input reference signals to at least generate a related set of synthesized signal vectors for reconstructing the signal, the device comprising at least:
(49A) a first synthesis unit, responsive to the at least first representative electrical signal for each representative input reference signal, for utilizing the at least first representative electrical signal for each representative input reference signal for a time signal partition to obtain a set of uninterpolated parameters for the at least first synthesis filter and the impulse response of this synthesis filter, and for interpolating the impulse responses of each adjacent time signal partition and of a current time signal partition immediately thereafter to provide a set of interpolated synthesis filters for desired subpartitions; and
utilizing the interpolated synthesis filters to provide a corresponding set of interpolated perceptual weighting filters to at least a first perceptual weighting unit for desired subpartitions such that the at least first perceptual weighting unit provides at least a first perceptually weighted squared error and such that smooth transitions of the synthesis filter and the perceptual weighting filter between each pair of adjacent partitions are obtained;
(49B) a codebook unit, responsive to the set of input reference signal vectors, the related set of interpolated synthesis filters and the related set of interpolated perceptual weighting filters for the current time signal partition, for selecting the corresponding set of optimal excitation codevectors from the at least first codebook memory for each desired input reference signal vector, further comprising at least;
(49B1) a codebook memory, for providing a particular excitation codevector which is associated with a particular index from the at least first codebook memory, the codebook memory having a set of excitation codevectors stored therein responsive to the representative input vectors;
(49B2) an interpolated synthesis filter having a transfer function, responsive to the particular excitation codevector for producing a synthesized signal vector;
(49B3) a combiner, responsive to the synthesized signal vector and to the input reference signal vector related thereto, for subtracting the synthesized signal vector from the input reference signal vector related thereto to obtain a corresponding reconstruction error vector;
(49B4) an interpolated perceptual weighting unit, responsive to the corresponding reconstruction error vector and to the interpolated synthesis filter transfer function, for determining a corresponding perceptually weighted squared error;
(49B5) a selector, responsive to the corresponding perceptually weighted squared error for determining and storing an index of a codevector having the perceptually weighted squared error smaller than all other errors produced by other codevectors;
(49B6) repetition means, responsive to the number of excitation codevectors in the codebook memory, for repeating the steps (49B1), (49B2), (49B3), (49B4), and (49B5) for every excitation codevector in the codebook memory and for implementing these steps utilizing a fast codebook search method, to determine an optimal excitation codevector for producing the minimum weighted squared error among all excitation codevectors for the related input reference signal vector; and
(D) codebook unit control means, responsive to the set of optimal excitation codevectors for successively inputting the set of optimal excitation codevectors into the corresponding set of interpolated synthesis filters to produce the related set of synthesized signal vectors for the given input reference signal for reconstructing the input signal.
- View Dependent Claims (50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63)
64. A system for reconstructing a speech signal in a digital speech coder, the signal being partitioned into successive time intervals, each time interval signal partition having a representative input reference signal with a set of vectors, and having at least a first representative electrical signal for each representative input reference signal of each time interval signal partition, for utilizing the electrical signals of the representative input reference signals to at least generate a related set of synthesized signal vectors for reconstructing the signal, the system comprising at least:
(64A) a first synthesis unit, responsive to the at least first representative electrical signal for each representative input reference signal, for utilizing the at least first representative electrical signal for each representative input reference signal for a time signal partition to obtain a set of uninterpolated parameters for the at least first synthesis filter and the impulse response of this synthesis filter, and having a first synthesis filter, the at least first synthesis filter being at least a first time-varying linear predictive coding synthesis filter (LPC-SF) wherein the at least first LPC-SF has a transfer function substantially of a form;
##EQU72##
where $a_i$'s, for i=1,2, . . . , p represent a set of estimated prediction coefficients obtained by analyzing the corresponding time signal partition and p represents a predictor order, responsive to the set of uninterpolated parameters, for obtaining the corresponding impulse response representation, and interpolating the impulse responses of each adjacent time signal partition and of a current time signal partition immediately thereafter, wherein the LPC-SFs of an adjacent time signal partition and of a time partition immediately thereafter are substantially of a form;
##EQU73##
where $a_i^{(j)}$'s, for i=1, 2, 3, . . . , p and j=1, 2 represent a set of prediction coefficients in an adjacent time signal partition when j=1 and of a current time signal partition immediately thereafter when j=2, respectively, p represents a predictor order such that an impulse response for the transfer function $H^{(j)}(z)$ is substantially of a form
##EQU74##
where $\delta(n)$ is a unit sample function, and such that the impulse response of the at least first synthesis filter at an m-th subpartition of a current time partition obtained through linear interpolation of $h^{(1)}(n)$ and $h^{(2)}(n)$ respectively, denoted below as $h_m(n)$, is substantially of a form;
$$h_m(n) = \alpha_m h^{(1)}(n) + \beta_m h^{(2)}(n),$$
where $\beta_m = 1 - \alpha_m$ and $0 < \alpha_m < 1$, where a different $\alpha_m$ is utilized for each subpartition, thereby providing a transfer function of an interpolated synthesis filter substantially of a form;
##EQU75##
wherein the perceptual weighting filter at the m-th subpartition of a current time interval signal partition has a transfer function of the form;
##EQU76##
where $\gamma$ is typically selected to be substantially 0.8, to provide a set of interpolated synthesis filters for desired subpartitions; and
utilizing the interpolated synthesis filters, to provide a corresponding set of interpolated perceptual weighting filters to at least a first perceptual weighting unit for desired subpartitions such that the at least first perceptual weighting unit provides at least a first perceptually weighted squared error and such that smooth transitions of the synthesis filter and the perceptual weighting filter between each pair of adjacent partitions are obtained;
(64B) a codebook unit, responsive to the set of input reference signal vectors, the related set of interpolated synthesis filters and the related set of interpolated perceptual weighting filters for the current time signal partition, for selecting the corresponding set of optimal excitation codevectors from the at least first codebook memory for each desired input reference signal vector, further comprising at least;
(64B1) a first codebook memory, for providing a particular excitation codevector which is associated with a particular index from the at least first codebook memory, the codebook memory having a set of excitation codevectors stored therein responsive to the representative input vectors;
(64B2) an interpolated synthesis filter having a transfer function, responsive to the particular excitation codevector for producing a synthesized signal vector;
(64B3) a combiner, responsive to the synthesized signal vector and to the input reference signal vector related thereto, for subtracting the synthesized signal vector from the input reference signal vector related thereto to obtain a corresponding reconstruction error vector;
(64B4) an interpolated perceptual weighting unit, responsive to the corresponding reconstruction error vector and to the interpolated synthesis filter transfer function, for determining a corresponding perceptually weighted squared error;
(64B5) a selector, responsive to the corresponding perceptually weighted squared error for determining and storing an index of a codevector having the perceptually weighted squared error smaller than all other errors produced by other codevectors;
(64B6) repetition means, responsive to the number of excitation codevectors in the codebook memory, for repeating the steps (64B1), (64B2), (64B3), (64B4), and (64B5) for every excitation codevector in the codebook memory and for implementing these steps utilizing a fast codebook search method, to determine an optimal excitation codevector for producing the minimum weighted squared error among all excitation codevectors for the related input reference signal vector; and
(C) codebook unit control means, responsive to the set of optimal excitation codevectors for successively inputting the set of optimal excitation codevectors into the corresponding set of interpolated synthesis filters to produce the related set of synthesized signal vectors for the given input reference signal for reconstructing the input signal.
- View Dependent Claims (65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75)
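The linear interpolation of impulse responses recited in claim 64 can be illustrated numerically. The sketch below is hedged in two ways. First, the transfer-function equations are not reproduced in this text, so the code assumes the common all-pole convention H(z) = 1 / (1 - sum_i a_i z^-i); the patent's sign convention may differ. Second, the impulse response is truncated to a fixed length for illustration. The function names and the first-order coefficient values are hypothetical.

```python
import numpy as np

def lpc_impulse_response(a, length):
    """First `length` samples of the impulse response of 1 / (1 - sum_i a[i-1] z^-i).

    Assumed sign convention; computed by the recursion
    h(n) = sum_i a_i * h(n - i) + delta(n), with h(n) = 0 for n < 0.
    """
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0              # unit sample delta(n)
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc += a_i * h[n - i]
        h[n] = acc
    return h

def interpolate_impulse_responses(h1, h2, alphas):
    """h_m(n) = alpha_m * h1(n) + beta_m * h2(n), beta_m = 1 - alpha_m, one per subpartition."""
    return [alpha * h1 + (1.0 - alpha) * h2 for alpha in alphas]

a_prev = [0.9]   # hypothetical first-order predictor, adjacent partition (j = 1)
a_curr = [0.5]   # hypothetical first-order predictor, current partition (j = 2)
h1 = lpc_impulse_response(a_prev, 8)
h2 = lpc_impulse_response(a_curr, 8)

# A different alpha_m for each subpartition, each strictly between 0 and 1.
h_sub = interpolate_impulse_responses(h1, h2, alphas=[0.75, 0.5, 0.25])
```

With these first-order examples the impulse responses are the geometric sequences 0.9^n and 0.5^n, so each interpolated response is simply a weighted blend of the two decays, moving from the adjacent partition's filter toward the current one as alpha_m decreases.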
Specification