Voiced, unvoiced or noise modes in a CELP vocoder

US 5,734,789 A
Filed: 04/18/1994
Issued: 03/31/1998
Est. Priority Date: 06/01/1992
Status: Expired due to Term

Abstract

A bit rate Codebook Excited Linear Predictor (CELP) communication system which includes a transmitter that organizes a signal containing speech into frames of 40 millisecond duration, and classifies each frame as one of three modes: voiced and stationary, unvoiced or transient, and background noise.
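
As a rough illustration of the framing step in the abstract, the sketch below splits a sampled signal into 40 ms frames and names the three modes. The 8 kHz sampling rate, the `Mode` enum, and the `split_into_frames` helper are assumptions made for this example; the abstract states only the frame duration and the three mode categories.

```python
from enum import Enum

FRAME_MS = 40            # frame duration stated in the abstract
SAMPLE_RATE_HZ = 8000    # assumption: a typical telephony rate, not given in the source


class Mode(Enum):
    """The three frame modes listed in the abstract."""
    VOICED_STATIONARY = "voiced and stationary"
    UNVOICED_TRANSIENT = "unvoiced or transient"
    BACKGROUND_NOISE = "background noise"


def split_into_frames(samples, sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Split a sequence of samples into consecutive full frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]
```

Under the assumed 8 kHz rate, each 40 ms frame holds 320 samples.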

24 Claims

1. A method of processing a signal having a speech component, the signal being organized as a plurality of frames, the method comprising the steps, performed for each frame, of:
- measuring a value for at least one speech characteristic of a frame, wherein the speech characteristic is selected from the group consisting of spectral stationarity, pitch stationarity, high-frequency content, and energy;
  
  comparing the measured value of the selected speech characteristic with at least two thresholds, including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic; and
  
  setting a first flag if the measured value exceeds the high threshold; and
  
  setting a second flag if the measured energy value is below the low threshold;
  
  determining whether the frame lacks a substantial speech component based on the determined flags;
  
  classifying the frame in a noise mode if the frame lacks a substantial speech component, and in a speech mode otherwise; and
  
  generating an encoded frame in accordance with a noise mode coding scheme if the frame is classified in the noise mode, and in accordance with a speech coding scheme if the frame is classified in the speech mode.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 12)
- - 2. The method of claim 1, wherein a first speech characteristic measured is energy, wherein the first flag is a first energy flag and the second flag is a second energy flag; and
    - wherein the frame is determined to lack a substantial speech component if the second energy flag is set, and is determined to contain a substantial speech component if the first energy flag is set.
  - 3. The method of claim 2, wherein a second speech characteristic measured is spectral stationarity, and the method further comprises the steps of:
    - comparing the measured energy with at least two intermediate thresholds representing energy values between the high energy value and the low energy value, the first intermediate threshold representing an energy value higher than the energy value represented by the second intermediate threshold;
      
      setting a third energy flag if the measured energy is below the first intermediate threshold;
      
      setting a fourth energy flag if the measured energy is below the second intermediate threshold;
      
      measuring a spectral stationarity for the frame;
      
      setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity;
      
      setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity,
      
      wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the third energy flag is set; or the second spectral stationarity flag is set and the fourth energy flag is set.
  - 4. The method of claim 3, wherein the step of measuring a spectral stationarity of the frame includes the substeps of:
    - determining a first set of filter coefficients corresponding to the frame and a second set of filter coefficients corresponding to a previous frame;
      
      determining a cepstral distortion and a residual energy for the frame based on the determined first and second sets of filter coefficients, wherein the spectral stationarity measurement is based on the cepstral distortion and residual energy determinations.
  - 5. The method of claim 1, wherein a first characteristic measured is spectral stationarity, a second characteristic measured is pitch stationarity, and a third characteristic measured is high-frequency content, further comprises the steps of:
    - measuring a spectral stationarity for the frame;
      
      setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity;
      
      setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity;
      
      measuring a pitch stationarity for the frame;
      
      setting a first pitch stationarity flag if the pitch stationarity measurement strongly indicates pitch stationarity;
      
      setting a second pitch stationarity flag if the pitch stationarity measurement weakly indicates pitch stationarity;
      
      measuring a high-frequency content of the frame;
      
      setting a first high-frequency flag if the high-frequency measurement strongly indicates high-frequency content; and
      
      setting a second high-frequency flag if the high-frequency measurement indicates a lack of high-frequency content.
  - 6. The method of claim 5, wherein the frame is determined to lack a substantial speech component if the second spectral stationarity flag is set, the first pitch stationarity flag is not set, the second pitch stationarity flag is not set, and the first high-frequency flag is set.
  - 7. The method of claim 5, wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set, the first pitch stationarity flag is not set, and the first high-frequency flag is set.
  - 8. The method of claim 1, wherein the step of classifying is followed by the step of updating at least one of the thresholds if the frame is classified in the noise mode.
  - 12. The encoding method of claim 1, wherein the step of measuring a spectral stationarity of the frame further comprises the steps of:
    - determining a first set of filter coefficients corresponding to the frame and a second set of filter coefficients corresponding to a previous frame; and
      
      determining a cepstral distortion and a residual energy for the frame based on the determined first and second sets of filter coefficients, wherein the spectral stationarity measurement is based on the cepstral distortion and residual energy determinations.
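
The energy-based decision of claims 1 to 3 can be sketched as a small set of threshold comparisons. This is only one reading of the claim language: the `EnergyThresholds` container, the function names, the order in which the flags are tested, and the choice to pass the two spectral stationarity flags in as ready-made booleans are all assumptions for the example, and the threshold values used in the usage note are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class EnergyThresholds:
    high: float    # high threshold (claim 1)
    mid_hi: float  # first intermediate threshold (claim 3), the higher of the two
    mid_lo: float  # second intermediate threshold (claim 3)
    low: float     # low threshold (claim 1)


def energy_flags(energy, th):
    """Return the four energy flags named in claims 2 and 3."""
    return (
        energy > th.high,    # first energy flag
        energy < th.low,     # second energy flag
        energy < th.mid_hi,  # third energy flag
        energy < th.mid_lo,  # fourth energy flag
    )


def lacks_speech(energy, th, spectral_strong, spectral_weak):
    """True if the frame is judged to lack a substantial speech component."""
    first, second, third, fourth = energy_flags(energy, th)
    if second:   # very low energy: no substantial speech (claim 2)
        return True
    if first:    # very high energy: substantial speech (claim 2)
        return False
    # In between, combine energy with spectral stationarity (claim 3).
    return bool((spectral_strong and third) or (spectral_weak and fourth))


def classify(energy, th, spectral_strong, spectral_weak):
    """Map the decision onto the two modes of claim 1."""
    return "noise" if lacks_speech(energy, th, spectral_strong, spectral_weak) else "speech"
```

For example, with `EnergyThresholds(high=60, mid_hi=45, mid_lo=35, low=20)`, a frame with energy 40 and a strong spectral stationarity indication is classified as noise, while the same energy with only a weak indication is classified as speech.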

9. A method of encoding a signal having a speech component, the signal being organized as a plurality of frames, comprising the steps of:
- measuring a value for at least one speech characteristic of a frame, wherein the speech characteristic is selected from the group consisting of spectral stationarity, pitch stationarity, high-frequency content, and energy;
  
  comparing the measured value of the selected speech characteristic with at least two thresholds, including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic;
  
  setting a first flag if the measured value exceeds the high threshold; and
  
  setting a second flag if the measured value is below the low threshold;
  
  determining whether the frame lacks a substantial speech component based on the determined flags;
  
  classifying the frame in a noise mode, depending on whether the frame lacks a substantial speech component, and in a speech mode otherwise; and
  
  generating an encoded frame in accordance with a noise coding scheme when the frame is classified in the noise mode, and in accordance with a speech coding scheme when the frame is classified in the speech mode.
- View Dependent Claims (10, 11, 13, 14, 15, 16)
- - 10. The encoding method of claim 9, wherein a first characteristic measured is energy, wherein the first flag is a first energy flag and the second flag is a second energy flag; and
    - wherein the frame is determined to lack a substantial speech component if the second energy flag is set, and is determined to contain a substantial speech component if the first energy flag is set.
  - 11. The encoding method of claim 10, wherein a second characteristic measured is spectral stationarity, and the method further comprises:
    - comparing the measured energy with at least two intermediate thresholds representing energy values falling between the high energy value and the low energy value, the first intermediate threshold representing an energy value higher than the energy value represented by the second intermediate threshold;
      
      setting a third energy flag if the measured energy is below the first intermediate threshold;
      
      setting a fourth energy flag if the measured energy is below the second intermediate threshold;
      
      measuring a spectral stationarity for the frame;
      
      setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity;
      
      setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity,
      
      wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the third energy flag is set; or the second spectral stationarity flag is set and the fourth energy flag is set.
  - 13. The encoding method of claim 10, further comprising the step of updating at least one of the thresholds if the frame is classified in the noise mode.
  - 14. The encoding method of claim 9, wherein a first characteristic measured is spectral stationarity, a second characteristic measured is pitch stationarity, and a third characteristic measured is high-frequency content, further comprises the steps of:
    - measuring a spectral stationarity for the frame;
      
      setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity;
      
      setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity;
      
      measuring a pitch stationarity for the frame;
      
      setting a first pitch stationarity flag if the pitch stationarity measurement strongly indicates pitch stationarity;
      
      setting a second pitch stationarity flag if the pitch stationarity measurement weakly indicates pitch stationarity;
      
      measuring a high-frequency content of the frame;
      
      setting a first high-frequency flag if the high-frequency measurement strongly indicates high-frequency content; and
      
      setting a second high-frequency flag if the high-frequency measurement indicates a lack of high-frequency content.
  - 15. The encoding method of claim 14, wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the first pitch stationarity flag is not set and the first high-frequency flag is set.
  - 16. The encoding method of claim 14, wherein the frame is determined to lack a substantial speech component if the second spectral stationarity flag is set, the first pitch stationarity flag is not set, the second pitch stationarity flag is not set, and the first high-frequency flag is set.
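
The flag combinations recited in claims 15 and 16 (and, for the method of claim 1, in claims 7 and 6) can be written as two boolean rules. In this sketch the `FrameFlags` container and its field names are hypothetical, the flags are assumed to have been set already by whatever measurements the encoder uses, and combining the two rules with a logical OR is an assumption, since the claims recite them as separate alternatives.

```python
from dataclasses import dataclass


@dataclass
class FrameFlags:
    spectral_strong: bool  # first spectral stationarity flag
    spectral_weak: bool    # second spectral stationarity flag
    pitch_strong: bool     # first pitch stationarity flag
    pitch_weak: bool       # second pitch stationarity flag
    hf_present: bool       # first high-frequency flag
    hf_absent: bool        # second high-frequency flag


def rule_claim_15(f):
    """Strong spectral stationarity, no strong pitch stationarity, high-frequency content."""
    return f.spectral_strong and not f.pitch_strong and f.hf_present


def rule_claim_16(f):
    """Weak spectral stationarity, no pitch stationarity at all, high-frequency content."""
    return (f.spectral_weak and not f.pitch_strong
            and not f.pitch_weak and f.hf_present)


def lacks_speech(f):
    # Assumption: either rule alone is enough to call the frame noise.
    return rule_claim_15(f) or rule_claim_16(f)
```

Both rules describe a frame whose spectrum is steady and rich in high frequencies but which shows no stable pitch, which is the pattern one would expect from background noise rather than voiced speech.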

17. An encoder for encoding a signal having a speech component, the signal being organized as a plurality of frames, comprising:
- means for measuring a value for at least one speech characteristic of a frame from among the plurality of frames, wherein the speech characteristic is selected from the group consisting of spectral stationarity, pitch stationarity, high-frequency content, and energy;
  
  a speech characteristic value measurer for comparing the measured value of the selected speech characteristic with at least two thresholds, including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic, setting a first flag if the measured value exceeds the high threshold, and setting a second flag if the measured value falls below the low threshold;
  
  means for determining whether the frame lacks a substantial speech component based on an evaluation of the determined flags;
  
  a mode classifier for classifying the frame in a noise mode if the frame lacks a substantial speech component, and in a speech mode otherwise; and
  
  a frame encoder for generating an encoded frame in accordance with a noise mode coding scheme when the frame is classified in the noise mode, and in accordance with a speech coding scheme when the frame is classified in the speech mode.
- View Dependent Claims (18, 19, 20, 21, 22, 23, 24)
- - 18. The encoder of claim 17, wherein a first characteristic measured is energy and the measurement means further comprises an energy measurer for comparing the measured energy with at least two thresholds, wherein the frame is determined to lack a substantial speech component if the second energy flag is set, and is determined to contain a substantial speech component if the first energy flag is set.
  - 19. The encoder of claim 18, further comprising:
    - a spectral stationarity measurer for measuring a spectral stationarity for the frame, setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity, and setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity, wherein the energy measurer further compares the measured energy with at least two intermediate thresholds representing energy values falling between the high energy value and the low energy value, the first intermediate threshold representing an energy value higher than the energy value represented by the second intermediate threshold, and wherein the frame is determined to lack a substantial speech component if:
      
      the first spectral stationarity flag is set and the third energy flag is set; or
      
      the second spectral stationarity flag is set and the fourth energy flag is set.
  - 20. The encoder of claim 24, wherein the spectral stationarity measurer determines a first set of filter coefficients corresponding to the frame and a second set of filter coefficients corresponding to a previous signal frame, and determines a cepstral distortion and a residual energy for the frame based on the determined first and second sets of filter coefficients, wherein the spectral stationarity measurement is based on the cepstral distortion and residual energy determinations.
  - 21. The encoder of claim 18 further comprising a controller for updating at least one of the thresholds if the frame is classified in the noise mode.
  - 22. The encoder of claim 17, wherein a first characteristic measured is spectral stationarity, a second characteristic measured is pitch stationarity, and a third characteristic measured is high-frequency content, wherein the measuring means further comprises:
    - a spectral stationarity measurer for measuring a spectral stationarity for the frame, setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity, and setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity;
      
      a pitch stationarity measurer for measuring a pitch stationarity for the frame, setting a first pitch stationarity flag if the pitch stationarity measurement strongly indicates pitch stationarity, and setting a second pitch stationarity flag if the pitch stationarity measurement weakly indicates pitch stationarity;
      
      a high-frequency content measurer for measuring a high-frequency content of the frame, setting a first high-frequency flag if the high-frequency measurement strongly indicates high-frequency content, and setting a second high-frequency flag if the high-frequency measurement indicates a lack of high-frequency content.
  - 23. The encoder of claim 17, wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the first pitch stationarity flag is not set and the first high-frequency flag is set.
  - 24. The encoder of claim 17, wherein the frame is determined to lack a substantial speech component if the second spectral stationarity flag is set, the first pitch stationarity flag is not set, the second pitch stationarity flag is not set, and the first high-frequency flag is set.
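
Claim 20 (like claims 4 and 12) bases the spectral stationarity measurement on a cepstral distortion computed from the filter coefficients of the current and previous frames. The claims do not give a formula, so the sketch below is an assumption: it uses the standard recursion for converting linear prediction coefficients to cepstral coefficients and takes a plain Euclidean distance between the two cepstra. The sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, the number of cepstral coefficients, and the function names are all choices made for the example. The residual energy term is not sketched, because the claims do not say how it is derived.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole filter 1/A(z).

    `a` holds a[1..p] (a[0] in the list is a1) for
    A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += k * c[k - 1] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0
        c.append(-a_n - acc / n)
    return c


def cepstral_distortion(a_cur, a_prev, n_ceps=12):
    """Euclidean distance between the cepstra of two coefficient sets."""
    c_cur = lpc_to_cepstrum(a_cur, n_ceps)
    c_prev = lpc_to_cepstrum(a_prev, n_ceps)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c_cur, c_prev)))
```

A small distortion between consecutive frames suggests that the spectral envelope has barely changed, which is the sense in which such a measure can indicate spectral stationarity.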

Current Assignee: Hughes Network Systems LLC (Echostar Corporation)
Original Assignee: Hughes Electronics Corporation (AT&T, Inc.)
Inventors: Swaminathan, Kumar; Ganesan, Kalyan; Gupta, Prabhat K.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Wieland, Susan

Application Number: US08/229,271
Time in Patent Office: 1,443 Days
Field of Search: 395/2.1-2.32, 381/29-40
US Class Current: 704/206
CPC Class Codes:
G10L 19/012   Comfort noise or silence co...
G10L 19/12   the excitation function bei...
G10L 19/26   Pre-filtering or post-filte...
G10L 2019/0002   Codebook adaptations
G10L 2019/0003   Backward prediction of gain
G10L 25/09   the extracted parameters be...
G10L 25/18   the extracted parameters be...
G10L 25/24   the extracted parameters be...
G10L 25/90   Pitch determination of spee...
G10L 25/93   Discriminating between voic...
