Digital speech coder with different excitation types

US 4,912,764 A
Filed: 08/28/1985
Issued: 03/27/1990
Est. Priority Date: 08/28/1985
Status: Expired due to Term


Abstract

A speech analysis and synthesis system in which pitch information for excitation is transmitted during voiced segments of speech, and modified residual information for excitation is transmitted during unvoiced segments, along with linear predictive coded (LPC) parameters. The speech analysis portion of the system uses a pitch detection circuit to determine whether the speech is voiced or unvoiced and to calculate the pitch information during voiced segments. A multi-pulse excitation forming circuit generates the modified residual signal, which is obtained from the cross-correlation of the residual signal and the LPC-recreated original signal. The pitch detection circuit controls a multiplexer, which selects either the output of the multi-pulse excitation forming circuit or the output of the pitch detection circuit for transmission, together with the LPC parameters, as the excitation information sent to the synthesizer portion of the system.
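
The multiplexer described in the abstract can be illustrated with a minimal sketch. The names `CodedFrame` and `encode_frame` are hypothetical and do not come from the patent; the sketch assumes the pitch detector signals an unvoiced frame by returning `None` for the pitch period, and it ignores quantization and bit packing entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CodedFrame:
    """One transmitted frame: LPC parameters plus exactly one excitation type."""
    lpc: List[float]
    voiced: bool
    pitch_period: Optional[int]                 # present only for voiced frames
    pulses: Optional[List[Tuple[int, float]]]   # present only for unvoiced frames


def encode_frame(lpc, pitch_period, pulses):
    """Select the excitation for one frame (hypothetical helper).

    pitch_period is None when the pitch detector declares the frame unvoiced
    (an assumption of this sketch). pulses is a list of (position, amplitude)
    pairs from a multi-pulse analysis.
    """
    voiced = pitch_period is not None
    return CodedFrame(
        lpc=list(lpc),
        voiced=voiced,
        pitch_period=pitch_period if voiced else None,
        pulses=None if voiced else list(pulses),
    )
```

A voiced frame therefore carries only a pitch period alongside the LPC parameters, while an unvoiced frame carries the pulse sequence instead; a receiver can tell the two apart from the `voiced` flag.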

47 Citations


10 Claims

1. A method for processing speech comprising the steps of:
- partitioning the speech into successive time frames;
  
  generating for each frame a set of speech parameter signals defining a vocal tract;
  
  generating a voiced signal for each of said speech frames comprising voiced speech;
  
  generating an unvoiced signal for each of said speech frames comprising unvoiced speech;
  
  producing a coded excitation signal comprising pitch type excitation information for each of said speech frames designated as voiced by said voiced signal and other than pitch type excitation information for each of said speech frames designated as unvoiced by said unvoiced signal;
  
  said step of producing said other than pitch type excitation information comprises the step of generating a sequence of pulses selected from pulses of a cross-correlation of an impulse response of said set of parameter signals and said speech for each frame;
  
  combining signals for each of said frames to form a coded combined signal representative of the speech for each of said frames.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1 wherein said step of generating said speech parameter signal set comprises the step of calculating a set of linear predictive parameters for each frame responsive to said speech of each frame.
  - 3. The method of claim 1 wherein said partitioning step comprises the step of forming speech samples of said speech for each of said frames and said speech samples having positive and negative values and generating residual samples of said speech pattern for each of said frames and said residual samples having positive and negative values and said step of producing said pitch type excitation information comprises the steps of:
    - estimating a first pitch value for each of said frames in response to positive valued ones of said speech samples of each frame;
      
      estimating a second pitch value for each of said frames in response to negative valued ones of said speech samples of each frame;
      
      estimating a third pitch value for each of said frames in response to positive valued ones of said residual samples;
      
      estimating a fourth pitch value for each of said frames in response to negative valued ones of said residual samples for each frame; and
      
      determining a final pitch value of a last previous speech frame in response to said estimated first, second, third, and fourth pitch values for said previous speech frame and pitch values for a plurality of previous speech frames and a present speech frame.
  - 4. The method of claim 3 wherein said determining step comprises the steps of:
    - calculating a pitch value from said ones of said estimated first, second, third, and fourth pitch values; and
      
      constraining said final pitch value so that the calculated pitch value is in agreement with calculated pitch values from previous frames.
  - 5. The method for processing speech of claim 1 further comprises the steps of:
    - generating a received voiced signal upon receipt of the combined coded signal having pitch type excitation information;
      
      generating a received unvoiced signal upon receipt of said combined coded signal having said other than pitch noise type excitation information;
      
      modeling said vocal tract in response to said set of speech parameter signals for each frame;
      
      synthesizing each frame of speech utilizing said pitch excitation information upon said received voiced signal being generated; and
      
      synthesizing each frame of speech utilizing said other than pitch type excitation information upon generation of said received unvoiced signal.
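
The cross-correlation step recited in claim 1 can be sketched as follows. This is not the patent's circuit: it assumes the classic iterative multi-pulse search (pick the largest cross-correlation value, then subtract that pulse's contribution), omits perceptual weighting, and uses the untruncated autocorrelation of the impulse response as an approximation near the frame edge. The function names and the LPC sign convention `s[n] = e[n] + sum_k a[k] * s[n-1-k]` are assumptions made for illustration.

```python
def impulse_response(a, length):
    """Impulse response of the all-pole filter s[n] = e[n] + sum_k a[k]*s[n-1-k]."""
    h = []
    for n in range(length):
        v = 1.0 if n == 0 else 0.0
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                v += ak * h[n - 1 - k]
        h.append(v)
    return h


def cross_correlate(speech, h):
    """c[m] = sum_n speech[n] * h[n - m], evaluated inside one frame."""
    n_samples = len(speech)
    return [
        sum(speech[n] * h[n - m] for n in range(m, n_samples))
        for m in range(n_samples)
    ]


def select_pulses(speech, a, n_pulses):
    """Return n_pulses (position, amplitude) pairs for one frame."""
    n = len(speech)
    h = impulse_response(a, n)
    c = cross_correlate(speech, h)
    # Autocorrelation of h, used to remove each chosen pulse's contribution.
    r = [sum(h[i] * h[i + k] for i in range(n - k)) for k in range(n)]
    pulses = []
    for _ in range(n_pulses):
        m = max(range(n), key=lambda j: abs(c[j]))   # strongest remaining peak
        amp = c[m] / r[0]
        pulses.append((m, amp))
        for j in range(n):
            c[j] -= amp * r[abs(j - m)]
    return pulses
```

For example, if a frame is exactly the filter's response to a single pulse, `select_pulses(frame, a, 1)` should recover that pulse's position, with an amplitude that is close to the original but not exact because of the frame-edge approximation.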

6. A speech processing system for human speech comprising:
- means for storing a plurality of speech frames each having a predetermined number of evenly spaced samples of instantaneous amplitude of said speech;
  
  means for calculating a set of speech parameter signals defining a vocal tract for each speech frame;
  
  means for generating a voiced signal for each of said speech frames comprising voiced speech;
  
  means for generating an unvoiced signal for each of said speech frames comprising unvoiced speech;
  
  means for producing a coded excitation signal comprising pitch type excitation information for each of said speech frames designated as voiced by said voiced signal and other than pitch type excitation information for each of said speech frames designated as unvoiced by said unvoiced signal;
  
  said means for producing said other than pitch type excitation information comprises means for performing a cross-correlation operation of an impulse response of said set of parameter signals and said speech for each of said frames to produce cross-correlated pulse signals and means for selecting a sequence of pulses from said cross-correlated pulses as said other than pitch type excitation information; and
  
  means for combining said produced coded excitation signal and said set of said speech parameter signals for each of said frames to form a coded combined signal representative of the speech for each of said frames.
- Dependent claims (7, 8, 9, 10):
  - 7. The system of claim 6 wherein said means for generating said set of speech parameter signals comprises means for calculating a set of linear predictive coded parameters for each of said frames.
  - 8. The system of claim 6 wherein said means for producing said pitch type excitation information comprises:
    - each of a plurality of identical means responsive to an individual predetermined portion of said samples of each of said frames for individually estimating a pitch value for each of said frames; and
      
      means responsive to the individually estimated pitch values from each of said estimating means for determining a final pitch value for each of said frames.
  - 9. The system of claim 8 wherein said determining means comprises:
    - means for constraining said final pitch value so that the calculated pitch value for each of said frames is in agreement with the calculated pitch values from previous ones of said frames.
  - 10. The system of claim 6 further comprises means for receiving said coded combined signal;
    - means for generating a received voiced signal upon the received coded combined signal having pitch type excitation information;
      
      means for generating a received unvoiced signal upon said received coded combined signal having said other than pitch type excitation information;
      
      means for synthesizing each frame of speech utilizing said set of speech parameter signals and said pitch excitation information upon said received voiced signal being generated; and
      
      said synthesizing means further responsive to said set of speech parameter signals and said received unvoiced signal for utilizing said other than pitch type excitation information to synthesize each frame of speech.
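
Claims 3, 4, 8, and 9 describe several pitch estimators whose outputs are combined into a final, history-constrained pitch value. The sketch below is one possible reading, not the patented detector: it substitutes a simple autocorrelation peak search for each estimator, combines the estimates with a median, and falls back to the median of previous frames when the candidate disagrees by more than a tolerance. All function names, the three-of-four voicing rule, and the `tolerance` parameter are assumptions.

```python
from statistics import median


def estimate_period(samples, min_lag, max_lag):
    """Lag of the largest positive autocorrelation value; 0 if none is found."""
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(samples[i] * samples[i - lag] for i in range(lag, len(samples)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def four_estimates(speech, residual, min_lag, max_lag):
    """Estimate pitch from positive and negative parts of speech and residual."""
    def positive(x):
        return [max(v, 0.0) for v in x]

    def negative(x):
        return [max(-v, 0.0) for v in x]

    parts = (positive(speech), negative(speech), positive(residual), negative(residual))
    return [estimate_period(p, min_lag, max_lag) for p in parts]


def final_pitch(estimates, previous, tolerance=2):
    """Combine estimates; 0 means the frame is treated as unvoiced."""
    voiced = [e for e in estimates if e > 0]
    if len(voiced) < 3:          # assumed rule: need agreement from most estimators
        return 0
    candidate = int(median(voiced))
    history = [p for p in previous if p > 0]
    if history:
        reference = int(median(history))
        if abs(candidate - reference) > tolerance:
            return reference     # keep the value consistent with earlier frames
    return candidate
```

The history check is what suppresses isolated pitch-doubling or halving errors in this sketch: a frame whose estimators all report twice the recent period is pulled back to the recent value.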


Current Assignee: Bell Telephone Laboratories, Inc. (Nokia Corporation)
Original Assignee: American Telephone & Telegraph Company (AT&T, Inc.)
Inventors: Picone, Joseph; Prezas, Dimitrios P.; Hartwell, Walter T.
Primary Examiner(s): Clark, David L.
Assistant Examiner(s): Merecki, John A.
Application Number: US06/770,632
Time in Patent Office: 1,672 Days
Field of Search: 381/36-41, 381/49, 381/29-35, 381/51-53, 364/513.5
US Class Current: 704/261
CPC Class Codes: G10L 19/10 the excitation function bei...
