Enhancement of speech coding in background noise for low-rate speech coder

US 5,680,508 A
Filed: 05/12/1993
Issued: 10/21/1997
Est. Priority Date: 05/03/1991
Status: Expired due to Term

Abstract

A speech coding system employs measurements of robust features of speech frames whose distributions are not strongly affected by noise levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and their respective weights is used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the "noisy" vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end. The results are better spectral reproduction and significant intelligibility enhancement over prior coding approaches. Robust features found to allow robust voicing decisions include: low-band energy; zero-crossing counts adapted for noise level; AMDF ratio (speech periodicity) measure; low-pass filtered backward correlation; low-pass filtered forward correlation; inverse-filtered backward correlation; and inverse-filtered pitch prediction gain measure.
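
The adaptive vector quantization idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each codeword is a short vector of spectral parameters in a domain where the noise estimate simply adds to the clean codeword, and that matching uses squared Euclidean distance. The function names `make_noisy_codebook`, `encode_frame`, and `decode_frame` are hypothetical; the actual update rule based on noise amplitude and noise reflection coefficients is not given in this listing.

```python
def nearest_index(vector, codebook):
    """Return the index of the codebook entry closest to `vector`
    under squared Euclidean distance (an assumed distance measure)."""
    best_index, best_dist = 0, float("inf")
    for index, entry in enumerate(codebook):
        dist = sum((v - e) ** 2 for v, e in zip(vector, entry))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def make_noisy_codebook(clean_codebook, noise_vector):
    """Build the 'noisy' vocabulary from the quiet-environment one.
    ASSUMPTION: noise adds element-wise to each clean codeword."""
    return [[c + n for c, n in zip(entry, noise_vector)]
            for entry in clean_codebook]


def encode_frame(noisy_input, clean_codebook, noise_vector):
    """Search the noisy vocabulary and return the index to transmit."""
    noisy_codebook = make_noisy_codebook(clean_codebook, noise_vector)
    return nearest_index(noisy_input, noisy_codebook)


def decode_frame(index, clean_codebook):
    """The receiver synthesizes from the clean codeword at the same index."""
    return clean_codebook[index]


clean = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
noise = [0.5, 0.5]
index = encode_frame([0.6, 1.4], clean, noise)
print(index, decode_frame(index, clean))
```

The point of the design is that only the index crosses the channel: the encoder compares like with like (noisy input against noisy codewords), while the decoder reconstructs from the quiet-environment vocabulary.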

191 Citations

13 Claims

1. In a method of low-bit-rate speech coding of input speech occurring in a noisy environment, for a system which employs linear predictive coding (LPC) analysis of input speech frames to generate reflection coefficients, conversion of the reflection coefficients to vectors representing spectral parameters of the input speech frames, and matching of the spectral parameter vectors against reference vectors of a vocabulary of codewords generated in a training sequence in order to select the corresponding index of an optimally matching codeword for transmission, the improvement comprising the steps of:
- selecting a set of at least two features which are characterized by a probability distribution which is not strongly affected in the noisy environment and which allow discrimination between voiced and unvoiced input speech, wherein said selected features include the feature of zero-crossing counts which are based on average noise energy;
- measuring the selected features for input speech frames; and
- using said feature measurements to make voiced/unvoiced speech decisions in order to select the voice/unvoiced excitation for speech synthesis in the receiver;
- using noise estimates to update the reference vectors of the vocabulary of codewords, wherein new reference vectors are generated corresponding to said vocabulary of codewords in the noisy environment, said noise estimates including noise amplitude and noise reflection coefficients, wherein said noise estimate for speech frame i is performed only if the ith speech frame is unvoiced and more than a given number L of continuous unvoiced speech frames are accumulated, in order to prevent using voiced or unvoiced speech in the noise estimate.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. A low-bit-rate speech coding method according to claim 1, wherein said voicing decision step includes the substep of determining a linear combination of said features which provides a high voiced/unvoiced discrimination capability; and determining respective weights to be applied to said features in order to obtain an optimal linear combination of said features.
  - 3. A low-bit-rate speech coding method according to claim 2, wherein said weights determining substep of said voicing decision step is performed using the simplex method for obtaining a maximum quantity h for an average distance between voiced and unvoiced regions of the input speech.
  - 4. A low-bit-rate speech coding method according to claim 1, wherein said selected features include the feature of low-band energy.
  - 5. A low-bit-rate speech coding method according to claim 1, wherein said selected features include an AMDF ratio (speech periodicity) measure.
  - 6. A low-bit-rate speech coding method according to claim 1, wherein said selected features include a backward correlations measure responsive to low-pass-filtered speech energy.
  - 7. A low-bit-rate speech coding method according to claim 1, wherein said selected features include a forward correlations measure responsive to low-pass-filtered speech energy.
  - 8. A low-bit-rate speech coding method according to claim 1, wherein said selected features include a backward correlations measure responsive to inverse-filtered speech energy.
  - 9. A low-bit-rate speech coding method according to claim 1, wherein said selected features include a pitch prediction gain measure responsive to inverse-filtered speech energy.
  - 10. A low-bit-rate speech coding method according to claim 1, adapted for the environment of helicopter noise, and further comprising the step of low-pass filtering of speech energy at a cutoff frequency of about 420 Hz.
  - 11. A low-bit-rate speech coding method according to claim 10, wherein said LPC analysis is conducted as 14th-order LPC analysis.
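
Claims 1 through 9 describe measuring noise-robust features and combining them linearly to reach a voiced/unvoiced decision. The sketch below shows one plausible reading of two of those features and of the decision rule. It makes several assumptions that are not stated in this listing: the zero-crossing count is adapted to noise by ignoring samples whose magnitude falls below a `noise_level` threshold (for example, one derived from average noise energy), the AMDF ratio is taken as the largest average magnitude difference divided by the smallest over a lag range, and the weights and threshold passed to the hypothetical `is_voiced` helper are placeholders for values that the claimed method obtains with the simplex method.

```python
def zero_crossing_count(frame, noise_level):
    """Count sign changes, skipping samples inside the noise dead-band.
    ASSUMPTION: 'based on average noise energy' is modeled as a dead-band."""
    count = 0
    previous_sign = 0
    for sample in frame:
        if abs(sample) <= noise_level:
            continue  # too small to distinguish from noise
        sign = 1 if sample > 0 else -1
        if previous_sign != 0 and sign != previous_sign:
            count += 1
        previous_sign = sign
    return count


def amdf(frame, lag):
    """Average magnitude difference function at one lag."""
    pairs = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(pairs)) / pairs


def amdf_ratio(frame, min_lag, max_lag, eps=1e-9):
    """Max-to-min AMDF over a lag range; periodic (voiced) frames
    have a deep AMDF null and therefore a large ratio."""
    values = [amdf(frame, lag) for lag in range(min_lag, max_lag + 1)]
    return max(values) / max(min(values), eps)


def is_voiced(features, weights, threshold):
    """Linear-combination decision: voiced if the weighted sum exceeds
    the threshold. Weights and threshold here are illustrative only."""
    score = sum(w * f for w, f in zip(weights, features))
    return score > threshold


frame = [0.0, 1.0, 0.0, -1.0] * 4          # toy periodic frame, period 4
print(zero_crossing_count(frame, 0.1))      # crossings outside the dead-band
print(amdf_ratio(frame, 2, 6) > 100)        # strong periodicity
```

A low zero-crossing count and a high AMDF ratio both point toward voiced speech, so in a combined score the two features would typically carry weights of opposite sign.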

12. In a method of low-bit-rate speech coding of input speech occurring in a noisy environment, for a system which employs linear predictive coding (LPC) analysis of input speech frames to generate reflection coefficients, conversion of the reflection coefficients to vectors representing spectral parameters of the input speech frames, and matching of the spectral parameter vectors against reference vectors of a vocabulary of codewords generated in a training sequence in order to select the corresponding index of an optimally matching codeword for transmission, the improvement comprising the steps of:
- selecting a set of features which are characterized by a probability distribution which is not strongly affected in the noisy environment and which allow discrimination between voiced and unvoiced input speech;
- measuring the selected features for input speech frames; and
- using said feature measurements to make voiced/unvoiced speech decisions in order to select the voice/unvoiced excitation for speech synthesis in the receiver;
- using noise estimates to update the reference vectors of the vocabulary of codewords, wherein new reference vectors are generated corresponding to said vocabulary of codewords in the noisy environment, said noise estimates including noise amplitude and noise reflection coefficients, wherein said noise estimate for speech frame i is performed only if the ith speech frame is unvoiced and more than a given number L of continuous unvoiced speech frames are accumulated, in order to prevent using voiced or unvoiced speech in the noise estimate.
- Dependent Claims (13)
  - 13. A low-bit-rate speech coding method according to claim 12, wherein the vocabulary of codewords is generated for speech in a quiet environment, said quiet environment vocabulary is updated with noise estimates to obtain a vocabulary of codewords corresponding to the noisy environment, said noisy environment vocabulary constituting said reference vectors against which said spectral parameter vectors are matched, and speech is synthesized at a receiver end of the speech coding system using said quiet environment vocabulary.
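
The gating condition shared by claims 1 and 12 (update the noise estimate for frame i only if that frame is unvoiced and more than L continuous unvoiced frames have accumulated) can be sketched as a small state machine. This is an illustrative sketch, not the patented implementation: the class name `NoiseEstimator` is hypothetical, it tracks only mean frame energy as a stand-in for the claimed noise amplitude and noise reflection coefficients, and the exponential smoothing factor is an assumption.

```python
class NoiseEstimator:
    """Update a running noise-energy estimate only during long unvoiced runs."""

    def __init__(self, min_unvoiced_run, smoothing=0.9):
        self.min_unvoiced_run = min_unvoiced_run  # the number L in the claims
        self.smoothing = smoothing                # ASSUMPTION: exponential averaging
        self.unvoiced_run = 0                     # consecutive unvoiced frames so far
        self.noise_energy = None                  # no estimate until first update

    def process(self, frame, voiced):
        """Feed one frame; return True if the noise estimate was updated."""
        if voiced:
            self.unvoiced_run = 0                 # a voiced frame breaks the run
            return False
        self.unvoiced_run += 1
        if self.unvoiced_run <= self.min_unvoiced_run:
            return False                          # run not yet longer than L
        energy = sum(s * s for s in frame) / len(frame)
        if self.noise_energy is None:
            self.noise_energy = energy
        else:
            self.noise_energy = (self.smoothing * self.noise_energy
                                 + (1.0 - self.smoothing) * energy)
        return True


estimator = NoiseEstimator(min_unvoiced_run=2, smoothing=0.5)
for frame, voiced in [([1, 1], False), ([1, 1], False), ([2, 2], False)]:
    updated = estimator.process(frame, voiced)
print(updated, estimator.noise_energy)
```

Waiting for the run to exceed L keeps short unvoiced stretches inside an utterance from being folded into the estimate, so only sustained non-speech intervals contribute.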

Current Assignee: Exelis Incorporated (L3Harris Technologies, Inc.)
Original Assignee: ITT Corporation (ITT, Inc.)
Inventors: Liu, Yu-Jih
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Mattson, Robert
Application Number: US08/060,710
Time in Patent Office: 1,623 Days
Field of Search: 395/2.22, 395/2.23, 395/2.36, 395/2.28, 395/2.16, 395/2.17, 395/2.35
US Class Current: 704/227
CPC Class Codes:
- G10L 19/04: using predictive techniques
- G10L 21/0264: characterised by the type o...
- G10L 25/09: the extracted parameters be...
- G10L 25/93: Discriminating between voic...
