Low data rate speech encoder with mixed excitation
First Claim
1. A signal transmission system, for receiving information signals at an input port, and for transmitting the information signals over a limited-bandwidth path to a reproducing arrangement, said system comprising:
pitch tracking means coupled to said input port, for determining the average pitch value of said information signals during each of a plurality of recurrent, sequential frame intervals, to thereby generate average pitch value signals;
frame portion power determining means coupled to said input port, for dividing each of said frames into at least first and second temporal portions, and for determining the power in said information signals during at least said first temporal portion of each of said frames, to produce frame power signals;
autoregression coefficient analyzing means coupled to said input port, for, during each of said frame intervals, generating at least ten autoregression coefficients from said information signals, representing line spectrum frequencies;
pitch epoch detecting means coupled to said input port and to said autoregression coefficient analyzing means, for, during each frame, determining at least amplitude and time interval pitch parameters of said information signals, to thereby produce pitch parameter signals;
periodicity analysis means coupled to said pitch epoch detecting means, for analyzing said pitch parameter signals, to form periodicity parameter signals in response to the presence or absence of voiced components in said speech and the periodicity of pitch pulses if said pitch pulses are periodic, and the ratio of the largest to the smallest pitch intervals if said pitch pulses are aperiodic, to produce jitter-representative signals;
a filter bank coupled to said input port, said filter bank including a plurality of filters, each covering a different portion of the expected bandwidth of said information signals, for filtering said information signals into a plurality of nonoverlapping frequency bands, to thereby form a plurality of band-limited signals;
correlation means coupled to said filter bank and to said pitch tracking means, for correlating said band-limited signals at an interval responsive to said average pitch value signals, to thereby form estimated mixture signals;
jitter correction means coupled to said correlation means and to said periodicity analysis means, for correcting said estimated mixture signals in response to said periodicity parameter signals, to thereby generate corrected correlation signals;
coding means coupled to said pitch tracking means, to said frame portion power determining means, to said correlation means, and to said autoregression coefficient analyzing means, for generating codewords representative of said average pitch value signals, said frame power signals, estimated mixture signals, and said line spectrum frequencies, respectively, for producing codes;
codeword generating means coupled to said coding means and to said periodicity analysis means, for joining said codes with said jitter-representative signals to form codewords for transmission;
transmitting means coupled to said codeword generating means, for transmitting said codewords over said path; and
reproducing means, coupled to a receiving end of said path, for receiving said codewords, and for decoding said codewords, and for generating a simile of said information signals.
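The correlation element of the claim (band-limited signals correlated at an interval set by the average pitch value) can be illustrated with a short sketch. This is only one plausible reading, not the patented implementation: the function names `normalized_correlation` and `estimate_mixture` are hypothetical, the band signals are assumed to be already filtered, and clamping the result to the range 0 to 1 is an assumption made so the value can serve as a per-band voicing (mixture) estimate.

```python
import math


def normalized_correlation(x, lag):
    """Normalized correlation of x with itself shifted by `lag` samples.

    Assumes 1 <= lag < len(x). Returns 0.0 for an all-zero segment.
    """
    a = x[:-lag]
    b = x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


def estimate_mixture(bands, pitch_lag):
    """One estimate per band-limited signal, clamped to [0, 1] (assumption).

    A value near 1 suggests the band repeats at the pitch interval;
    a value near 0 suggests it does not.
    """
    return [
        min(1.0, max(0.0, normalized_correlation(band, pitch_lag)))
        for band in bands
    ]


# Hypothetical usage: one band periodic at the pitch lag, one band that is not.
matched = [math.sin(2 * math.pi * n / 40) for n in range(400)]
mismatched = [math.sin(2 * math.pi * n / 27) for n in range(400)]
mixture = estimate_mixture([matched, mismatched], pitch_lag=40)
```

In this sketch the band whose period equals the pitch lag yields an estimate close to 1, while the mismatched band correlates negatively at that lag and is clamped to 0.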
Abstract
A speech signal has its characteristics extracted and encoded (16), transmitted over a limited-data-rate path (18), and decoded (20) and synthesized (22) at the receiving end. The characteristics include line spectral frequencies (LSF), pitch, and jitter. The LSF are extracted by autoregression and split-vector quantized (SVQ) in a single frame and, in parallel, in blocks of two, three, and four frames. The SVQ codes have equal length and are evaluated for distortion against a threshold. The threshold is varied so as to tend to select for transmission those codewords which maintain a constant data rate into a transmit buffer. A single jitter bit and an encoded pitch value are product coded with the selected LSF codeword, and all are transmitted over the data path (18) to the receiver. The receiver decodes the characteristics and controls a pitch generator (1226) in response to the pitch value, and a random pitch jitter in response to the jitter bit. Two sets of line spectrum filters receive random noise and the pitch signal, respectively. The filtered signals are modulated by multipliers (1222, 1230) controlled by the LSF codes, and the filtered signals are summed and applied to a final LSF-controlled filter.
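The threshold-driven selection described in the abstract might be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the specification: because the codes have equal length, a code covering more frames costs fewer bits per frame, so the sketch assumes the longest block whose distortion stays under the threshold is preferred, and that the threshold is raised when the transmit buffer is fuller than a target and lowered when it is emptier. The names `choose_block` and `update_threshold`, the multiplicative step, and all numeric values are hypothetical.

```python
def choose_block(distortions, threshold):
    """Pick the block length (in frames) to transmit.

    `distortions` maps block length (1 to 4 frames) to the quantization
    distortion of its equal-length code. Assumption: prefer the longest
    block that meets the threshold; fall back to the single-frame code.
    """
    acceptable = [length for length, d in distortions.items() if d <= threshold]
    return max(acceptable) if acceptable else 1


def update_threshold(threshold, buffer_bits, target_bits, step=0.05):
    """Nudge the threshold to hold the buffer near its target fill.

    A fuller buffer raises the threshold (longer blocks, fewer bits per
    frame); an emptier buffer lowers it. The 5% step is an assumption.
    """
    if buffer_bits > target_bits:
        return threshold * (1 + step)
    if buffer_bits < target_bits:
        return threshold * (1 - step)
    return threshold


# Hypothetical distortions for the four candidate block lengths.
candidates = {1: 0.1, 2: 0.4, 3: 0.9, 4: 1.5}
chosen = choose_block(candidates, threshold=0.5)
```

With these illustrative numbers the two-frame block is selected, since it is the longest candidate whose distortion does not exceed 0.5.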
5 Claims
3. A method for transmitting information in the form of speech signals over a limited-data-rate data path, comprising the steps of:
separating those portions of input speech signals containing jitter from those portions which do not contain jitter, to thereby produce (a) jittering speech signals containing varying pitch intervals, and (b) non-jittering speech signals;
determining, on a frame-by-frame basis, the variation in the pitch intervals in said jittering speech signals;
comparing said variation with a threshold;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold, and generating the other state otherwise;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal;
generating a pitch signal, defining pitch intervals, at the receiving end of said data path; and
when said transmitted jitter signal is in said particular state, randomly varying said pitch intervals of said pitch signal.
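The steps of claim 3 can be illustrated with a short sketch. It is a hedged example, not the patented implementation: the variation measure used here (ratio of the largest to the smallest pitch interval) is borrowed from the wording of claim 1, and the threshold of 1.25, the 25% spread, and the function names `jitter_bit` and `synth_intervals` are all assumptions made for illustration.

```python
import random


def jitter_bit(intervals, threshold=1.25):
    """Encoder side: one bit per frame from the pitch-interval variation.

    `intervals` are positive pitch intervals (in samples) within a frame.
    Variation is measured as max/min (an assumption), and the bit is set
    when that ratio exceeds the threshold.
    """
    if len(intervals) < 2:
        return 0
    ratio = max(intervals) / min(intervals)
    return 1 if ratio > threshold else 0


def synth_intervals(pitch_period, count, jitter, spread=0.25, rng=None):
    """Decoder side: regular intervals, randomly varied when the bit is set.

    Each interval is perturbed uniformly by up to +/- `spread` of the
    pitch period (an assumed distribution and range).
    """
    rng = rng or random.Random()
    if not jitter:
        return [pitch_period] * count
    return [pitch_period * (1 + rng.uniform(-spread, spread)) for _ in range(count)]


# Hypothetical usage: a steady frame and an erratic frame.
steady_bit = jitter_bit([80, 81, 79, 80])
erratic_bit = jitter_bit([60, 95, 70, 110])
regular = synth_intervals(80, 4, steady_bit)
jittered = synth_intervals(80, 4, erratic_bit, rng=random.Random(1))
```

Only the single bit crosses the data path in this sketch; the receiver regenerates the irregularity locally rather than reproducing the original interval sequence.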
4. A method for coding digital temporally related speech signals including spectra, comprising the steps of:
providing memorized monotonic spectrum values identified by codewords;
dividing said speech signals into nonoverlapping blocks, each of which includes said spectra;
taking the differences between a lower set of said spectra in one of said blocks and the remaining signals in said block, to generate difference signals;
comparing said difference signals and said one signal in each of said blocks with said memorized values;
in response to said comparisons, assigning to each of said difference signals a codeword representing that one of said memorized signals which is the closest match to that one of said difference signals;
in response to said comparisons, assigning to said one of said signals in each of said blocks a codeword representing that one of said memorized signals which is the closest match to said one of said signals;
generating a combination codeword for each of said blocks by product coding.
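One way to picture the block coding of claim 4 is the sketch below. It rests on several assumptions that the claim does not state: the first frame of each block is taken as the reference spectrum, separate small codebooks are used for the reference and for the differences, closeness is squared Euclidean distance, and "product coding" is modeled as packing the per-frame indices into one integer. The names `nearest` and `encode_block` and all codebook values are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def encode_block(block, ref_book, diff_book):
    """Code one block of spectra as a reference index plus difference indices.

    Returns the list of indices and a single combined codeword formed by
    packing the indices in mixed radix (an assumed form of product coding).
    """
    ref = block[0]
    indices = [nearest(ref_book, ref)]
    for frame in block[1:]:
        diff = [f - r for f, r in zip(frame, ref)]
        indices.append(nearest(diff_book, diff))
    code = indices[0]
    for idx in indices[1:]:
        code = code * len(diff_book) + idx
    return indices, code


# Hypothetical two-coefficient spectra and toy codebooks.
ref_book = [[0.2, 0.5], [0.3, 0.7]]
diff_book = [[0.0, 0.0], [0.05, 0.05], [-0.05, -0.05]]
block = [[0.31, 0.69], [0.36, 0.75], [0.30, 0.70]]
indices, code = encode_block(block, ref_book, diff_book)
```

Coding the later frames as differences from the reference keeps the values to be matched small when adjacent frames have similar spectra, which is the usual motivation for this kind of differential scheme.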
Specification