Digital speech coder with different excitation types
First Claim
1. A method for processing speech comprising the steps of:
partitioning the speech into successive time frames;
generating for each frame a set of speech parameter signals defining a vocal tract;
generating a voiced signal for each of said speech frames comprising voiced speech;
generating an unvoiced signal for each of said speech frames comprising unvoiced speech;
producing a coded excitation signal comprising pitch type excitation information for each of said speech frames designated as voiced by said voiced signal and other than pitch type excitation information for each of said speech frames designated as unvoiced by said unvoiced signal;
said step of producing said other than pitch type excitation information comprises the step of generating a sequence of pulses selected from pulses of a cross-correlation of an impulse response of said set of parameter signals and said speech for each frame;
combining signals for each of said frames to form a coded combined signal representative of the speech for each of said frames.
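The pulse-selection step of the claim can be illustrated with a minimal sketch. It assumes the speech parameter signals are LPC coefficients of an all-pole filter, and it picks the pulse positions with the largest cross-correlation magnitude in a single pass. The function names, the `h_len` parameter, and the one-pass selection rule are illustrative assumptions, not the procedure the patent specifies.

```python
def impulse_response(lpc, length):
    """Impulse response of the all-pole filter 1 / (1 - sum_k a_k z^-k).

    `lpc` holds the (assumed) predictor coefficients a_1..a_p.
    """
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * h[n - k]
        h.append(acc)
    return h


def cross_correlate(speech, h):
    """c[m] = sum_n speech[n] * h[n - m] for every candidate position m."""
    n_samples = len(speech)
    return [
        sum(speech[n] * h[n - m]
            for n in range(m, min(n_samples, m + len(h))))
        for m in range(n_samples)
    ]


def select_pulses(speech, lpc, num_pulses, h_len=20):
    """Return (position, correlation value) pairs for the strongest pulses.

    Simplified one-pass selection (an assumption): rank positions by
    |c[m]| and keep the top `num_pulses`, ordered by position.
    """
    h = impulse_response(lpc, h_len)
    c = cross_correlate(speech, h)
    ranked = sorted(range(len(c)), key=lambda m: abs(c[m]), reverse=True)
    return [(m, c[m]) for m in sorted(ranked[:num_pulses])]


# Usage: a frame that is the filter's own response to one pulse at sample 3.
frame = [0.0] * 3 + [2.0 * 0.5 ** j for j in range(13)]
pulses = select_pulses(frame, lpc=[0.5], num_pulses=1)
```

In this usage example the strongest cross-correlation value falls at the sample where the original pulse was placed, which is the property the selection relies on.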
Abstract
A speech analysis and synthesis system in which pitch information for excitation is transmitted during voiced segments of speech, and modified residual information for excitation is transmitted during unvoiced segments, along with linear predictive coded (LPC) parameters. The speech analysis portion of the system uses a pitch detection circuit to determine whether the speech is voiced or unvoiced and to calculate the pitch information during voiced segments. A multi-pulse excitation forming circuit generates the modified residual signal, which is obtained from the cross-correlation of the residual signal and the LPC-recreated original signal. The pitch detection circuit controls a multiplexer that selects either the output of the multi-pulse excitation forming circuit or the output of the pitch detection circuit for transmission, together with the LPC parameters, as the excitation information to the synthesizer portion of the system.
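The voiced/unvoiced switching described in the abstract can be sketched as follows. The autocorrelation-based pitch detector, the 0.5 threshold, the lag range, and the dictionary layout of the coded frame are all assumptions made for illustration; the abstract only states that a pitch detection circuit controls a multiplexer choosing between the two excitation types.

```python
def partition(samples, frame_len):
    """Split the signal into successive, non-overlapping full frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def detect_pitch(frame, min_lag, max_lag, threshold=0.5):
    """Return a pitch lag in samples, or None if the frame looks unvoiced.

    Assumed detector: the lag whose autocorrelation, normalized by the
    frame energy, is largest and exceeds `threshold`.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return None  # silence is treated as unvoiced
    best_lag, best_score = None, threshold
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        r = sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        score = r / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def encode_frame(frame, lpc, multipulse, min_lag=20, max_lag=160):
    """Multiplexer step: send pitch for voiced frames, pulses otherwise.

    `multipulse` stands in for the output of the multi-pulse excitation
    forming stage and is supplied by the caller in this sketch.
    """
    lag = detect_pitch(frame, min_lag, max_lag)
    if lag is not None:
        return {"voiced": True, "lpc": lpc, "excitation": {"pitch_lag": lag}}
    return {"voiced": False, "lpc": lpc, "excitation": {"pulses": multipulse}}


# Usage: a pulse train with a 40-sample period, then a single click.
periodic = [1.0 if n % 40 == 0 else 0.0 for n in range(160)]
click = [1.0 if n == 5 else 0.0 for n in range(160)]
coded_voiced = encode_frame(periodic, [0.5], [])
coded_unvoiced = encode_frame(click, [0.5], [(5, 1.0)])
```

Either way, the LPC parameters travel with every frame; only the excitation field changes with the voicing decision.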
47 Citations
10 Claims
6. A speech processing system for human speech comprising:
means for storing a plurality of speech frames each having a predetermined number of evenly spaced samples of instantaneous amplitude of said speech;
means for calculating a set of speech parameter signals defining a vocal tract for each speech frame;
means for generating a voiced signal for each of said speech frames comprising voiced speech;
means for generating an unvoiced signal for each of said speech frames comprising unvoiced speech;
means for producing a coded excitation signal comprising pitch type excitation information for each of said speech frames designated as voiced by said voiced signal and other than pitch type excitation information for each of said speech frames designated as unvoiced by said unvoiced signal;
said means for producing said other than pitch type excitation information comprises means for performing a cross-correlation operation of an impulse response of said set of parameter signals and said speech for each of said frames to produce cross-correlated pulse signals and means for selecting a sequence of pulses from said cross-correlated pulses as said other than pitch type excitation information; and
means for combining said produced coded excitation signal and said set of said speech parameter signals for each of said frames to form a coded combined signal representative of the speech for each of said frames.
Specification