Method and apparatus for improving the intelligibility of digitally compressed speech
Abstract
A system for processing a speech signal to enhance signal intelligibility identifies portions of the speech signal that include sounds that typically present intelligibility problems and modifies those portions in an appropriate manner. First, the speech signal is divided into a plurality of time-based frames. Each of the frames is then analyzed to determine a sound type associated with the frame. Selected frames are then modified based on the sound type associated with the frame or with surrounding frames. For example, the amplitude of frames determined to include unvoiced plosive sounds may be boosted as these sounds are known to be important to intelligibility and are typically harder to hear than other sounds in normal speech. In a similar manner, the amplitudes of frames preceding such unvoiced plosive sounds can be reduced to better accentuate the plosive. Such techniques will make these sounds easier to distinguish upon subsequent playback.
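The flow described in the abstract (frame the signal, label each frame, then boost unvoiced plosives and attenuate the frame before them) can be illustrated with a minimal sketch. It is not the patented implementation: the sound-type labels are assumed to come from a classifier that is not shown, the frames are assumed to be non-overlapping, and the names `split_into_frames`, `enhance_plosives`, `boost`, and `cut`, along with the gain values, are hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], frame_len: int) -> List[List[float]]:
    """Divide a signal into consecutive, non-overlapping time-based frames."""
    return [list(samples[i:i + frame_len]) for i in range(0, len(samples), frame_len)]


def enhance_plosives(
    frames: List[List[float]],
    sound_types: Sequence[str],
    boost: float = 2.0,   # assumed gain for unvoiced plosive frames
    cut: float = 0.5,     # assumed gain for the frame preceding a plosive
) -> List[List[float]]:
    """Boost unvoiced plosive frames and attenuate the frame just before each one."""
    gains = [1.0] * len(frames)
    for i, kind in enumerate(sound_types):
        if kind != "unvoiced_plosive":
            continue
        gains[i] *= boost
        # Attenuate the previous frame unless it is itself a plosive frame.
        if i > 0 and sound_types[i - 1] != "unvoiced_plosive":
            gains[i - 1] *= cut
    return [[s * g for s in frame] for frame, g in zip(frames, gains)]


# Hypothetical input: a constant signal and hand-written labels.
frames = split_into_frames([1.0] * 8, frame_len=2)
labels = ["vowel", "vowel", "unvoiced_plosive", "vowel"]
enhanced = enhance_plosives(frames, labels)
```

In this example the third frame is scaled up and the second frame is scaled down, so the plosive stands out more against what precedes it; the remaining frames are left unchanged.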
35 Claims
-
1. A method for processing a speech signal comprising the steps of:
receiving a speech signal to be processed;
dividing said speech signal into multiple frames;
analyzing a frame generated in said dividing step to determine a spoken sound type associated with said frame; and
modifying a sound parameter of at least one of said frame and another frame based on said spoken sound type;
wherein said step of modifying at least one of said frame and another frame includes reducing an amplitude of a previous frame when said frame is determined to comprise a voiced or unvoiced plosive.
Dependent claims: 2, 4, 5, 6, 7, 8
-
-
3. The method in claim 2, wherein:
said step of analyzing includes examining said spectral content of said frame to determine whether said frame includes a voiced or unvoiced plosive.
-
9. A method for processing a speech signal comprising the steps of:
providing a speech signal that is divided into time-based frames;
analyzing each frame of said frames in the context of surrounding frames to determine a spoken sound type associated with said frame; and
adjusting an amplitude of selected frames based on a result of said step of analyzing;
wherein said step of adjusting includes decreasing the amplitude of a second frame that precedes said frame when said frame is determined to include a voiced or unvoiced plosive.
Dependent claims: 10, 11, 12, 13, 14
-
-
15. A system for processing a speech signal comprising:
means for receiving a speech signal that is divided into time-based frames;
means for determining a spoken sound type associated with each of said frames; and
means for modifying a sound parameter of selected frames based on spoken sound type to enhance signal intelligibility;
wherein said means for modifying includes a means for reducing the amplitude of a frame that precedes a frame that comprises a voiced or unvoiced plosive.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
-
-
26. A method for processing a speech signal comprising the steps of:
receiving a speech signal to be processed;
dividing said speech signal into multiple frames;
analyzing a frame generated in said dividing step to determine a spoken sound type associated with said frame; and
modifying a sound parameter of said frame and another frame based on said spoken sound type;
wherein said step of modifying said frame and said another frame includes reducing an amplitude of a previous frame when said spoken sound type is an unvoiced plosive.
-
-
27. A method for processing a speech signal comprising the steps of:
providing a speech signal that is divided into time-based frames;
analyzing each frame of said frames in the context of surrounding frames to determine a spoken sound type associated with said frame; and
adjusting an amplitude of selected frames based on a result of said step of analyzing;
wherein said step of adjusting includes decreasing the amplitude of a second frame that is previous to said frame when said spoken sound type associated with said frame includes a voiced or unvoiced plosive.
-
-
28. A system for processing a speech signal comprising:
means for receiving a speech signal that is divided into time-based frames;
means for determining a spoken sound type associated with each of said frames; and
means for modifying a sound parameter of selected frames based on spoken sound type to enhance signal intelligibility;
wherein said means for modifying includes means for reducing the amplitude of a frame that precedes a frame that includes an unvoiced plosive.
-
-
29. A method for processing a speech signal comprising the steps of:
receiving a speech signal to be processed;
dividing said speech signal into multiple frames;
analyzing a frame generated in said dividing step to determine a fricative sound type associated with said frame; and
boosting an amplitude of said frame when said frame comprises an unvoiced fricative sound type but not boosting the amplitude of said frame when said frame comprises a voiced fricative.
Dependent claims: 30, 31, 32, 33, 34, 35
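The selective boost recited in claim 29 (amplify unvoiced fricatives, leave voiced fricatives alone) might be sketched as follows. The claim does not say how voicing is decided; the zero-crossing-rate test below is a common heuristic swapped in purely for illustration, and the names `zero_crossing_rate`, `boost_unvoiced_fricative`, `zcr_threshold`, and `gain`, as well as their default values, are assumptions rather than anything taken from the patent.

```python
from typing import List, Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(frame) - 1, 1)


def boost_unvoiced_fricative(
    frame: Sequence[float],
    is_fricative: bool,
    zcr_threshold: float = 0.3,  # assumed cutoff between voiced and unvoiced
    gain: float = 1.5,           # assumed boost factor
) -> List[float]:
    """Boost a fricative frame only when it looks unvoiced (high zero-crossing rate)."""
    if is_fricative and zero_crossing_rate(frame) > zcr_threshold:
        return [s * gain for s in frame]
    return list(frame)  # voiced fricatives and non-fricatives pass through unchanged


# Hypothetical frames: a noise-like one and a slowly varying one.
noisy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
smooth = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
boosted = boost_unvoiced_fricative(noisy, is_fricative=True)
untouched = boost_unvoiced_fricative(smooth, is_fricative=True)
```

The noise-like frame crosses zero at every sample and is boosted, while the slowly varying frame falls below the assumed threshold and is returned unchanged.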
-
Specification