Voice messaging system with pitch tracking based on adaptively filtered LPC residual signal

US 4,731,846 A
Filed: 04/13/1983
Issued: 03/15/1988
Est. Priority Date: 04/13/1983
Status: Expired due to Term

Abstract

A voice messaging system in which linear predictive coding (LPC) parameters, pitch, and preferably other excitation information are derived from a human voice input, encoded, and transmitted and/or stored, to be called up later to provide a speech output that is nearly identical to the original speech input. The invention features adaptive filtering of the residual signal: the residual signal derived from LPC estimation is adaptively filtered and then used as the input to a conventional pitch estimation procedure. The adaptive filtering step uses the first reflection coefficient (k₁) to realize a simple filter, e.g., A(z) = (1 - k₁ z^-1)^-1. This filter removes high frequency noise from the residual signal during voiced periods, but does not remove the high frequency energy that contains important information during the unvoiced periods of speech. Preferably, this preprocessing technique is combined with a postprocessing technique in which dynamic programming is used to optimally track pitch and voicing information through successive frames.
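
The single-pole filter named in the abstract can be sketched as the recursion y[n] = x[n] + k₁·y[n-1]. The Python below is a minimal illustration, not the patented implementation. It assumes the sign convention k₁ = R(1)/R(0) (normalized lag-1 autocorrelation of the speech frame), under which k₁ is close to +1 for voiced frames (the filter acts as a low-pass) and near zero for noise-like unvoiced frames (the filter passes the residual almost unchanged). The function names are hypothetical.

```python
import numpy as np

def first_reflection_coefficient(frame):
    """Estimate k1 for one speech frame.

    Assumption: k1 is taken as R(1)/R(0); other texts use the opposite sign.
    """
    frame = np.asarray(frame, dtype=float)
    r0 = np.dot(frame, frame)
    if r0 == 0.0:
        return 0.0  # silent frame: leave the residual unfiltered
    r1 = np.dot(frame[:-1], frame[1:])
    return r1 / r0

def adaptive_filter(residual, k1, state=0.0):
    """Apply A(z) = 1 / (1 - k1 z^-1), i.e. y[n] = x[n] + k1 * y[n-1].

    `state` is y[-1], carried over from the previous frame so that the
    filter can change its coefficient frame by frame without a restart.
    Returns the filtered frame and the new state.
    """
    out = np.empty(len(residual))
    prev = state
    for n, x in enumerate(residual):
        prev = x + k1 * prev
        out[n] = prev
    return out, prev
```

In use, k₁ would be computed once per frame from the speech signal and then applied to that frame's residual, with `state` passed from one frame to the next.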

20 Claims

1. A voice messaging system for encoding and regenerating human speech comprising:
- LPC analysis means for analyzing an analog speech signal provided as an input thereto in accordance with an LPC (Linear Predictive Coding) model, said LPC analysis means providing LPC parameters and a residual signal as an output representative of the analog speech signal;
  
  adaptive filter means operably coupled to the output of said LPC analysis means for receiving said residual signal and at least one LPC parameter from said LPC analysis means, said adaptive filter means filtering said residual signal in accordance with a time-varying filter characteristic defined by said at least one LPC parameter, wherein the time-varying filter characteristic provides for the removal of high frequency noise from the residual signal during periods of voiced speech and for the retention of high frequency energy in the residual signal during periods of unvoiced speech, to provide an adaptively filtered residual signal as an output therefrom;
  
  means operably connected to the output of said adaptive filter means for extracting pitch and voicing information from said adaptively filtered residual signal; and
  
  means operably connected to the outputs of said extracting means and said LPC analysis means for encoding said pitch and voicing information and said LPC parameters.
- Dependent claims (2, 3, 4):
  - 2. A system as set forth in claim 1, further including:
    - decoding means operably associated with said encoding means for decoding said pitch and voicing information and said LPC parameters;
      
      excitation means connected to receive said pitch and voicing information from said decoding means for providing an excitation function in accordance with said pitch and voicing information; and
      
      time-varying filtering means for filtering said excitation function in accordance with said LPC parameters.
  - 3. A system as set forth in claim 1, wherein the time-varying filter characteristic of said adaptive filter means is defined by the first reflection coefficient as said at least one LPC parameter provided by said LPC analysis means.
  - 4. A system as set forth in claim 1, wherein said extracting means for extracting pitch and voicing information from said adaptively filtered residual signal comprises means for determining normalized correlation values of said adaptively filtered residual signal.
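
The analysis filter of claim 1 and the time-varying synthesis filter of claim 2 form an inverse pair, which the following sketch illustrates for a single frame. The claims do not say how the LPC parameters are computed; the autocorrelation method with the Levinson-Durbin recursion is assumed here, and all function names are hypothetical. In the claimed decoder the excitation is regenerated from the pitch and voicing information; the residual itself is fed back in this sketch only to show that the two filters invert each other.

```python
import numpy as np

def autocorrelation(frame, order):
    """Autocorrelation lags R(0)..R(order) of one frame."""
    return np.array([np.dot(frame[:len(frame) - i], frame[i:])
                     for i in range(order + 1)])

def levinson_durbin(r, order):
    """Solve for the prediction-error filter a (a[0] = 1) from lags r."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # i-th reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error
    return a, err

def lpc_residual(frame, a):
    """Inverse-filter the frame: e[n] = sum_k a[k] * x[n-k]."""
    return np.convolve(frame, a)[:len(frame)]

def synthesize(excitation, a):
    """All-pole synthesis: x[n] = e[n] - sum_{k>=1} a[k] * x[n-k]."""
    order = len(a) - 1
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, min(order, n) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out
```

Both filters start from a zero state here; a frame-by-frame system would carry filter memory across frame boundaries.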

5. A method for determining the pitch of human speech comprising the steps of:
- analyzing a speech signal input in accordance with an LPC (Linear Predictive Coding) model to provide LPC parameters and a residual signal;
  
  adaptively filtering said residual signal in accordance with a time-varying filtering characteristic as defined by at least one of said LPC parameters provided by the analyzing of said speech signal input, wherein the time-varying filtering characteristic provides for the removal of high frequency noise from the residual signal during periods of voiced speech and for the retention of higher frequency energy in the residual signal during periods of unvoiced speech, to provide an adaptively filtered residual signal; and
  
  extracting pitch period candidates from said adaptively filtered residual signal.
- Dependent claims (6, 7, 8, 9, 10, 11, 12, 13, 14, 15):
  - 6. A method as set forth in claim 5, wherein the adaptive filtering of said residual signal is accomplished by employing a time-varying filtering characteristic defined by the first reflection coefficient corresponding to said at least one of said LPC parameters as provided by the analysing of said speech signal input.
  - 7. A method as set forth in claim 5, wherein the extracting of pitch period candidates from said adaptively filtered residual signal comprises extracting normalized correlation values of said adaptively filtered residual signal.
  - 8. A method as set forth in claim 5, wherein the adaptive filtering of said residual signal is accomplished by a single-pole filter.
  - 9. A method as set forth in claim 5, wherein said LPC parameters as provided by the analyzing of said speech signal input are reflection coefficients.
  - 10. A method as set forth in claim 6, wherein said LPC parameters as provided by the analyzing of said speech signal input are reflection coefficients.
  - 11. A method as set forth in claim 5, wherein said LPC parameters are provided by the analyzing of said speech signal input by calculating said LPC parameters in a sequence of frames at a predetermined frame rate, and wherein said speech signal input is received for analysis at a sample rate much higher than said predetermined frame rate.
  - 12. A method as set forth in claim 11, wherein the extracting of pitch period candidates from said adaptively filtered residual signal is accomplished such that said pitch period candidates are extracted at said frame rate.
  - 13. A method as set forth in claim 5, further including:
    - extracting an optimal pitch period candidate from among said pitch period candidates.
  - 14. A method as set forth in claim 13, wherein the extracting of an optimal pitch period candidate is accomplished via dynamic programming for finding a pitch period which is optimal in the context of pitch period candidates in adjacent frames.
  - 15. A method as set forth in claim 13, further including:
    - performing dynamic programming with respect both to said pitch period candidates for each frame and also to a voiced/unvoiced decision for each frame to determine both an optimal pitch period and an optimal voicing decision for each frame in the context of said sequence of frames; and
      
      determining an optimal pitch and voicing decision for each said frame in accordance with said dynamic programming performance.
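
Claims 5 and 7 extract pitch period candidates as normalized correlation values of the adaptively filtered residual. One way this might look is sketched below. The normalization used, the choice of local maxima as candidates, and the number of candidates kept are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def normalized_correlation(signal, lag):
    """Correlation between the signal and itself shifted by `lag`,
    normalized by the energies of the two overlapping segments."""
    head = signal[:-lag]
    tail = signal[lag:]
    denom = np.sqrt(np.dot(head, head) * np.dot(tail, tail))
    return float(np.dot(head, tail) / denom) if denom > 0.0 else 0.0

def pitch_candidates(signal, min_lag, max_lag, n_candidates=3):
    """Return up to `n_candidates` (lag, correlation) pairs.

    Candidates are local maxima of the normalized correlation over the
    lag range, ordered from strongest to weakest.
    """
    signal = np.asarray(signal, dtype=float)
    lags = np.arange(min_lag, max_lag + 1)
    corr = np.array([normalized_correlation(signal, int(lag)) for lag in lags])
    peaks = [i for i in range(1, len(corr) - 1)
             if corr[i] >= corr[i - 1] and corr[i] > corr[i + 1]]
    peaks.sort(key=lambda i: corr[i], reverse=True)
    return [(int(lags[i]), float(corr[i])) for i in peaks[:n_candidates]]
```

On a perfectly periodic signal, integer multiples of the true period correlate just as well as the period itself, which suggests why it can help to keep several candidates per frame and choose among them afterwards, as claims 13 to 15 do.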

16. A method for determining the pitch of human speech, comprising the steps of:
- receiving an input speech signal at a sample rate;
  
  analyzing said input speech signal according to an LPC (Linear Predictive Coding) model to provide LPC parameters and a residual signal, wherein said LPC parameters are calculated in a sequence of frames at a predetermined frame rate, and wherein the sample rate at which said input speech signal is received is much higher than said frame rate;
  
  adaptively filtering said residual signal by a filter having a time-varying filtering characteristic defined by at least one of said LPC parameters provided by said LPC analyzing step, wherein the time-varying filtering characteristic provides for the removal of high frequency noise from the residual signal during periods of voiced speech and for the retention of high frequency energy in the residual signal during periods of unvoiced speech, to provide an adaptively filtered residual signal;
  
  extracting pitch period candidates from said adaptively filtered residual signal;
  
  performing dynamic programming with respect both to said pitch period candidates for each frame and also to a voiced/unvoiced decision for each frame to determine both an optimal pitch period and an optimal voicing decision for each frame, in the context of said sequence of frames, said dynamic programming step defining a transition error between each period candidate of the current frame and each candidate of the preceding frame and wherein a cumulative error is defined for each pitch period candidate in the current frame which is equal to the transition error between said pitch period candidate of said current frame plus the cumulative error at an optimally identified pitch period candidate in the preceding frame chosen from among said pitch period candidates in said preceding frame such that the cumulative error of said corresponding pitch period candidate in said current frame is at a minimum; and
  
  determining an optimal pitch and voicing decision for each said frame in accordance with said dynamic programming performance.
- Dependent claims (17, 18, 19, 20):
  - 17. The method of claim 16, wherein said transition error includes a pitch deviation error, said pitch deviation error corresponding to the difference in pitch between said pitch period candidate in said current frame and said corresponding pitch period candidate in said previous frame if both said frames are voiced.
  - 18. The method of claim 17, wherein said pitch deviation error is set at a constant if at least one of said frames is unvoiced.
  - 19. The method of claim 16, wherein said transition error also includes a voicing transition error component, said voicing transition error component being defined to be a small predetermined value when said current frame and said previous frame are both identically voiced or both identically unvoiced, and otherwise being defined to be a decreasing function of the spectral difference between said current frame and said previous frame.
  - 20. The method of claim 16, wherein said transition error further comprises a voicing state error, said voicing state error corresponding to the degree to which said speech signal within said current frame is correlated at the period of said pitch period candidate.
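
The dynamic programming of claims 16 to 20 can be sketched as a Viterbi-style search over per-frame candidates. The structure below follows the claims: a transition error built from a pitch deviation error, a voicing transition error, and a voicing state error, accumulated as a minimum cumulative error per candidate. Every concrete formula and constant is an assumption chosen for illustration: the log-ratio pitch deviation, the 1/(1 + d) spectral-difference term, the use of the frame's best correlation on the unvoiced candidate, and the initial cost of the first frame. All names are hypothetical.

```python
import math

UNVOICED = 0  # period value standing in for the "unvoiced" candidate

def transition_error(prev, cur, spectral_diff,
                     dev_const=0.3, same_cost=0.05, switch_scale=1.0):
    """Error of moving from candidate `prev` to candidate `cur`.

    Each candidate is (period, correlation). For the unvoiced candidate
    the correlation slot holds the frame's best correlation (assumption).
    """
    (p_prev, _), (p_cur, c_cur) = prev, cur
    voiced_prev, voiced_cur = p_prev != UNVOICED, p_cur != UNVOICED
    # Pitch deviation error: pitch difference if both frames are voiced,
    # otherwise a constant (claims 17 and 18).
    if voiced_prev and voiced_cur:
        deviation = abs(math.log(p_cur / p_prev))
    else:
        deviation = dev_const
    # Voicing transition error: small if the voicing state is unchanged,
    # otherwise decreasing in the spectral difference (claim 19).
    if voiced_prev == voiced_cur:
        switch = same_cost
    else:
        switch = switch_scale / (1.0 + spectral_diff)
    # Voicing state error: tied to the correlation at the candidate period
    # (claim 20).
    state = (1.0 - c_cur) if voiced_cur else c_cur
    return deviation + switch + state

def track_pitch(frames, spectral_diffs):
    """Pick one candidate per frame by minimizing the cumulative error.

    frames[t] is the candidate list of frame t; spectral_diffs[t] is the
    spectral difference between frames t-1 and t (index 0 is unused).
    Returns the chosen period per frame (UNVOICED for unvoiced frames).
    """
    cum = [[(1.0 - c) if p != UNVOICED else c for p, c in frames[0]]]
    back = [[-1] * len(frames[0])]
    for t in range(1, len(frames)):
        row, ptr = [], []
        for cur in frames[t]:
            costs = [cum[t - 1][j] + transition_error(prev, cur, spectral_diffs[t])
                     for j, prev in enumerate(frames[t - 1])]
            best = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[best])
            ptr.append(best)
        cum.append(row)
        back.append(ptr)
    # Trace back from the cheapest candidate in the last frame.
    j = min(range(len(cum[-1])), key=cum[-1].__getitem__)
    path = []
    for t in range(len(frames) - 1, -1, -1):
        path.append(frames[t][j][0])
        j = back[t][j]
    return path[::-1]
```

With these assumed costs, a frame whose period-doubled candidate has a slightly higher correlation than the true period is still assigned the true period when its neighbours agree on it, because the jump to the doubled period carries a large pitch deviation error.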

Current Assignee: Texas Instruments, Inc.
Original Assignee: Texas Instruments, Inc.
Inventors: Secrest, Bruce G.; Doddington, George R.
Primary Examiner: Kemeny, Emanuel S.
Application Number: US06/484,711
Time in Patent Office: 1,798 Days
Field of Search: 381/29, 381/30, 381/31, 381/32, 381/33, 381/34, 381/35, 381/36, 381/37, 381/38, 381/39, 381/40, 381/41, 381/42, 381/43, 381/44, 381/45, 381/46, 381/47, 381/48, 381/49, 381/50, 381/51, 364/513, 364/513.5
US Class Current: 704/207
CPC Class Codes: G10L 19/06 Determination or coding of ...; G10L 25/90 Pitch determination of spee...