Method and apparatus for improving the intelligibility of digitally compressed speech

US 6,889,186 B1
Filed: 06/01/2000
Issued: 05/03/2005
Est. Priority Date: 06/01/2000
Status: Expired due to Term

Abstract

A system for processing a speech signal to enhance signal intelligibility identifies portions of the speech signal that include sounds that typically present intelligibility problems and modifies those portions in an appropriate manner. First, the speech signal is divided into a plurality of time-based frames. Each of the frames is then analyzed to determine a sound type associated with the frame. Selected frames are then modified based on the sound type associated with the frame or with surrounding frames. For example, the amplitude of frames determined to include unvoiced plosive sounds may be boosted as these sounds are known to be important to intelligibility and are typically harder to hear than other sounds in normal speech. In a similar manner, the amplitudes of frames preceding such unvoiced plosive sounds can be reduced to better accentuate the plosive. Such techniques will make these sounds easier to distinguish upon subsequent playback.
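
The framing and amplitude-adjustment steps described in the abstract can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the non-overlapping framing, the frame length, and the gain values are all assumptions chosen for the example.

```python
def split_frames(samples, frame_len):
    """Divide a sample sequence into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


def apply_gains(frames, gains):
    """Scale every sample of frame k by gains[k] and re-join the frames."""
    out = []
    for frame, gain in zip(frames, gains):
        out.extend(s * gain for s in frame)
    return out


signal = [0.1, 0.1, 0.1, 0.1, 0.8, 0.8, 0.8, 0.8]
frames = split_frames(signal, 4)

# Hypothetical decision: frame 1 holds an unvoiced plosive, so it is boosted
# and the frame before it is attenuated (gain values are illustrative only).
processed = apply_gains(frames, [0.5, 1.25])
```

How each frame's sound type is determined is left open here; the sketch only shows that once a per-frame decision exists, the modification itself is a per-frame gain.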

35 Claims

1. A method for processing a speech signal comprising the steps of:
- receiving a speech signal to be processed;
  
  dividing said speech signal into multiple frames;
  
  analyzing a frame generated in said dividing step to determine a spoken sound type associated with said frame; and
  
  modifying a sound parameter of at least one of said frame and another frame based on said spoken sound type;
  
  wherein said step of modifying at least one of said frame and another frame includes reducing an amplitude of a previous frame when said frame is determined to comprise a voiced or unvoiced plosive.
  - 2. The method claimed in claim 1, wherein:
    - said step of analyzing includes performing a spectral analysis on said frame to determine a spectral content of said frame.
  - 4. The method claimed in claim 1, wherein:
    - said step of analyzing includes determining an amplitude of said frame and comparing said amplitude of said frame to an amplitude of a previous frame to determine whether said frame includes a plosive sound.
  - 5. The method claimed in claim 1, wherein:
    - said step of modifying at least one of said frame and another frame further comprises boosting an amplitude of said frame when said frame is determined to include an unvoiced plosive.
  - 6. The method claimed in claim 1, wherein:
    - said step of modifying at least one of said frame and another frame further includes changing a parameter associated with said frame in a manner that enhances intelligibility of an output signal.
  - 7. The method of claim 1, wherein:
    - said step of modifying at least one of said frame and another frame based on said spoken sound type comprises modifying said frame and said another frame.
  - 8. A computer readable medium having program instructions stored thereon for implementing the method of claim 1 when executed within a digital processing device.
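
Claim 4 describes detecting a plosive by comparing a frame's amplitude with that of the previous frame. One way this might look is sketched below, assuming RMS as the amplitude measure and a fixed ratio test; `jump_ratio` and `floor` are hypothetical parameters, not values taken from the patent.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def plosive_candidates(frames, jump_ratio=4.0, floor=1e-6):
    """Flag frames whose RMS jumps sharply relative to the previous frame.

    jump_ratio and floor are illustrative values, not taken from the patent.
    """
    flags = [False] * len(frames)
    for k in range(1, len(frames)):
        prev = max(rms(frames[k - 1]), floor)  # floor avoids division by zero
        flags[k] = rms(frames[k]) / prev >= jump_ratio
    return flags


frames = [
    [0.2, -0.2, 0.2, -0.2],      # steady, vowel-like frame
    [0.01, -0.01, 0.01, -0.01],  # near-silent closure
    [0.5, -0.4, 0.3, -0.2],      # sudden burst
]
flags = plosive_candidates(frames)
```

Only the third frame is flagged, because its amplitude rises sharply after the quiet closure frame. The first frame is never flagged since it has no predecessor to compare against.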

3. The method in claim 2, wherein:
- said step of analyzing includes examining said spectral content of said frame to determine whether said frame includes a voiced or unvoiced plosive.
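
Claims 2 and 3 refer to a spectral analysis of the frame without fixing a particular method. As a rough sketch, one common heuristic is to measure how much of the frame's energy lies in the upper half of the spectrum, since unvoiced sounds tend to be noise-like with more high-frequency energy than voiced ones. The heuristic, the naive DFT, and the `split` and `threshold` values are assumptions made for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def high_band_ratio(frame, split=0.5):
    """Fraction of spectral energy in the bins above `split` of the bin range."""
    spec = power_spectrum(frame)
    cut = int(len(spec) * split)
    total = sum(spec)
    return sum(spec[cut:]) / total if total > 0 else 0.0


def looks_unvoiced(frame, threshold=0.5):
    """Hypothetical decision rule: mostly high-band energy means unvoiced."""
    return high_band_ratio(frame) > threshold


low = [math.sin(2 * math.pi * t / 16) for t in range(16)]  # one slow cycle
high = [1.0, -1.0] * 8                                      # fastest alternation
```

Here `looks_unvoiced(low)` is false and `looks_unvoiced(high)` is true. A production system would use an FFT and a more robust voicing measure; the naive DFT is used only to keep the example dependency-free.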

9. A method for processing a speech signal comprising the steps of:
- providing a speech signal that is divided into time-based frames;
  
  analyzing each frame of said frames in the context of surrounding frames to determine a spoken sound type associated with said frame; and
  
  adjusting an amplitude of selected frames based on a result of said step of analyzing;
  
  wherein said step of adjusting includes decreasing the amplitude of a second frame that precedes said frame when said frame is determined to include a voiced or unvoiced plosive.
  - 10. The method of claim 9, wherein:
    - said step of adjusting includes adjusting the amplitude of a second frame in a manner that enhances intelligibility of an output signal.
  - 11. The method of claim 9, wherein:
    - said step of adjusting further comprises increasing the amplitude of said frame when said spoken sound type associated with said frame includes an unvoiced plosive.
  - 12. The method of claim 9, wherein:
    - said step of adjusting includes increasing the amplitude of a second frame when said spoken sound type associated with said second frame includes an unvoiced fricative.
  - 13. The method of claim 9, wherein:
    - said step of analyzing includes comparing an amplitude of a first frame to an amplitude of a frame previous to said first frame.
  - 14. A computer readable medium having program instructions stored thereto for implementing the method claimed in claim 9 when executed in a digital processing device.
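
The context-dependent adjustment in claims 9 and 11 can be illustrated with a per-frame gain plan: the frame preceding any plosive is attenuated, and an unvoiced plosive frame is itself boosted. The label strings and the `cut` and `boost` values below are hypothetical and serve only to make the sketch concrete.

```python
def plan_gains(labels, cut=0.7, boost=1.4):
    """Return one gain per frame from per-frame sound-type labels.

    Labels are hypothetical strings; cut and boost are illustrative values.
    """
    gains = [1.0] * len(labels)
    for k, label in enumerate(labels):
        if label in ("voiced_plosive", "unvoiced_plosive") and k > 0:
            gains[k - 1] *= cut   # quieten the frame before the plosive
        if label == "unvoiced_plosive":
            gains[k] *= boost     # make the unvoiced burst itself louder
    return gains


labels = ["vowel", "vowel", "unvoiced_plosive", "vowel", "voiced_plosive"]
gains = plan_gains(labels)
```

Because a frame's gain depends on the label of the frame that follows it, a real-time system built this way would need at least one frame of look-ahead before a frame can be emitted.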

15. A system for processing a speech signal comprising:
- means for receiving a speech signal that is divided into time-based frames;
  
  means for determining a spoken sound type associated with each of said frames; and
  
  means for modifying a sound parameter of selected frames based on spoken sound type to enhance signal intelligibility;
  
  wherein said means for modifying includes a means for reducing the amplitude of a frame that precedes a frame that comprises a voiced or unvoiced plosive.
  - 16. The system claimed in claim 15, wherein:
    - said system is implemented within a linear predictive coding (LPC) encoder.
  - 17. The system claimed in claim 15, wherein:
    - said system is implemented within a code excited linear prediction (CELP) encoder.
  - 18. The system claimed in claim 15, wherein:
    - said system is implemented within a linear predictive coding (LPC) decoder.
  - 19. The system claimed in claim 15, wherein:
    - said system is implemented within a code excited linear prediction (CELP) decoder.
  - 20. The system claimed in claim 15, wherein:
    - said means for determining includes means for performing a spectral analysis on a frame.
  - 21. The system claimed in claim 15, wherein:
    - said means for determining includes means for comparing amplitudes of adjacent frames.
  - 22. The system claimed in claim 15, wherein:
    - said means for determining includes means for ascertaining whether a frame includes a voiced or unvoiced sound.
  - 23. The system claimed in claim 15, wherein:
    - said means for modifying further includes means for boosting the amplitude of a second frame that includes a spoken sound type that is typically less intelligible than other sound types.
  - 24. The system claimed in claim 15, wherein:
    - said means for modifying further comprises means for boosting the amplitude of a frame that includes an unvoiced plosive.
  - 25. The system claimed in claim 15, wherein:
    - said means for determining a spoken sound type includes means for determining whether a frame includes at least one of the following:
      
      a vowel sound, a voiced fricative, an unvoiced fricative, a voiced plosive, and an unvoiced plosive.
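
The five sound types listed in claim 25 could be represented as a small enumeration. The sketch below also includes a hypothetical `classify` helper that maps three boolean features to a type; the features, and the choice to treat everything that is neither a plosive nor a fricative as a vowel, are simplifying assumptions for the example.

```python
from enum import Enum


class SoundType(Enum):
    VOWEL = "vowel"
    VOICED_FRICATIVE = "voiced fricative"
    UNVOICED_FRICATIVE = "unvoiced fricative"
    VOICED_PLOSIVE = "voiced plosive"
    UNVOICED_PLOSIVE = "unvoiced plosive"


def classify(is_plosive, is_fricative, is_voiced):
    """Map three hypothetical boolean features to a SoundType."""
    if is_plosive:
        return SoundType.VOICED_PLOSIVE if is_voiced else SoundType.UNVOICED_PLOSIVE
    if is_fricative:
        return SoundType.VOICED_FRICATIVE if is_voiced else SoundType.UNVOICED_FRICATIVE
    # Fallback assumption: anything else is treated as a vowel in this sketch.
    return SoundType.VOWEL
```

For instance, `classify(True, False, False)` yields `SoundType.UNVOICED_PLOSIVE`.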

26. A method for processing a speech signal comprising the steps of:
- receiving a speech signal to be processed;
  
  dividing said speech signal into multiple frames;
  
  analyzing a frame generated in said dividing step to determine a spoken sound type associated with said frame; and
  
  modifying a sound parameter of said frame and another frame based on said spoken sound type;
  
  wherein said step of modifying said frame and said another frame includes reducing an amplitude of a previous frame when said spoken sound type is an unvoiced plosive.

27. A method for processing a speech signal comprising the steps of:
- providing a speech signal that is divided into time-based frames;
  
  analyzing each frame of said frames in the context of surrounding frames to determine a spoken sound type associated with said frame; and
  
  adjusting an amplitude of selected frames based on result of said step of analyzing;
  
  wherein said step of adjusting includes decreasing the amplitude of a second frame that is previous to said frame when said spoken sound type associated with said frame includes a voiced or unvoiced plosive.

28. A system for processing a speech signal comprising:
- means for receiving a speech signal that is divided into time-based frames;
  
  means for determining a spoken sound type associated with each of said frames; and
  
  means for modifying a sound parameter of selected frames based on spoken sound type to enhance signal intelligibility;
  
  wherein said means for modifying includes means for reducing the amplitude of a frame that precedes a frame that includes an unvoiced plosive.

29. A method for processing a speech signal comprising the steps of:
- receiving a speech signal to be processed;
  
  dividing said speech signal into multiple frames;
  
  analyzing a frame generated in said dividing step to determine a fricative sound type associated with said frame; and
  
  boosting an amplitude of said frame when said frame comprises an unvoiced fricative sound type but not boosting the amplitude of said frame when said frame comprises a voiced fricative.
  - 30. The method of claim 29, wherein:
    - said step of analyzing includes performing a spectral analysis on said frame to determine a spectral content of said frame.
  - 31. The method claimed in claim 30, wherein:
    - said step of analyzing includes examining said spectral content of said frame to determine whether said frame includes a voiced or unvoiced fricative.
  - 32. The method of claim 29, wherein:
    - said step of analyzing includes determining an amplitude of said frame and comparing said amplitude of said frame to an amplitude of a previous frame to determine whether said frame includes a plosive sound.
  - 33. The method claimed in claim 29, wherein:
    - said step of boosting an amplitude of said frame further includes changing a parameter associated with said frame in a manner that enhances intelligibility of an output signal.
  - 34. The method claimed in claim 29, wherein:
    - said step of boosting an amplitude of said frame further comprises modifying another frame.
  - 35. A computer readable medium having program instructions stored thereon for implementing the method of claim 29 when executed within a digital processing device.
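
The selective boost in claim 29 (unvoiced fricatives are amplified, voiced fricatives are not) might be sketched as below. The label strings, the `boost` factor, and the clipping of samples to `[-limit, limit]` are assumptions added for the example; the claim itself does not mention clipping.

```python
def boost_unvoiced_fricatives(frames, labels, boost=1.5, limit=1.0):
    """Boost only frames labelled as unvoiced fricatives.

    boost is an illustrative value; clipping to [-limit, limit] is an added
    safeguard that is not part of the claim.
    """
    out = []
    for frame, label in zip(frames, labels):
        gain = boost if label == "unvoiced_fricative" else 1.0
        out.append([max(-limit, min(limit, s * gain)) for s in frame])
    return out


frames = [[0.2, -0.8], [0.2, -0.8]]
labels = ["unvoiced_fricative", "voiced_fricative"]
result = boost_unvoiced_fricatives(frames, labels)
```

The first frame is scaled and its second sample is clipped at the limit, while the voiced fricative frame passes through unchanged.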

Current Assignee: Avaya Incorporated
Original Assignee: Avaya Technology Corporation Miami Lakes FLA US
Inventors: Michaelis, Paul Roller
Primary Examiner(s): Chawan, Vijay
Assistant Examiner(s): Storm, Donald L.
Application Number: US09/586,183
Time in Patent Office: 1,797 Days
Field of Search: 704/225, 704/208, 704/214, 704/227, 704/254
US Class Current: 704/225
CPC Class Codes:
G10L 21/0264 characterised by the type o...
G10L 21/0364 for improving intelligibility
