Method and apparatus for enhancement of telephonic speech signals
First Claim
1. A method for processing a telephone speech signal, comprising the steps of:
a) transforming a digital representation of the speech signal into an auditory spectrum;
b) identifying regions within the auditory spectrum of strong first and second formants;
c) enhancing identified second formants relative to their respective first formants;
d) identifying consonant regions within the auditory spectrum;
e) amplifying the identified consonant regions, the amplification of the consonant regions increasing the consonant/vowel intensity ratio, the enhancement of the second formants and the amplification of the consonant regions producing a modified auditory spectrum;
f) mapping the modified auditory spectrum to a Fourier spectrum;
g) converting the Fourier spectrum to the time domain using an inverse fast-Fourier transform; and
h) normalizing the converted Fourier spectrum to provide a digital representation of a processed speech signal having more energy in regions of the second formants and the consonants.
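Steps f) through h) can be illustrated for a single frame with the minimal sketch below. It is not the patented implementation: the claim does not say how auditory bands map to Fourier bins or what the normalization rule is, so the half-open bin ranges in `band_edges`, the per-band gain representation, and the peak-matching normalization are all assumptions made for illustration. A small radix-2 FFT is included so the example needs only the standard library.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(X):
    """Inverse FFT via the identity ifft(X) = conj(fft(conj(X))) / n."""
    n = len(X)
    y = fft([v.conjugate() for v in X])
    return [v.conjugate() / n for v in y]

def apply_band_gains(frame, band_edges, band_gains):
    """Illustrative steps f) to h) for one frame of real samples.

    band_edges: hypothetical (lo_bin, hi_bin) half-open ranges, one per
                auditory band, covering bins 0 .. n/2 only.
    band_gains: ratio of modified to original magnitude for each band.
    """
    n = len(frame)
    spectrum = fft([complex(s) for s in frame])
    # f) map each band's gain onto its Fourier bins; the mirrored
    #    negative-frequency bin gets the same gain so the output stays real
    for (lo, hi), g in zip(band_edges, band_gains):
        for k in range(lo, hi):
            spectrum[k] *= g
            if 0 < k < n // 2:
                spectrum[n - k] *= g
    # g) convert back to the time domain with the inverse FFT
    out = [v.real for v in ifft(spectrum)]
    # h) normalize (assumed rule: match the input frame's peak amplitude)
    peak_in = max(abs(s) for s in frame)
    peak_out = max(abs(s) for s in out)
    if peak_out > 0:
        out = [s * peak_in / peak_out for s in out]
    return out
```

Because the normalization rescales the whole frame uniformly, it keeps the signal within the original intensity limit while preserving the relative boost given to the second-formant and consonant regions.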
Abstract
A method and apparatus for enhancing the intelligibility of a telephonic speech signal within the available bandwidth and intensity limits of a telephone communication network. The method combines enhancement of both the formant ratio and the consonant/vowel energy ratio to realize a speech signal more intelligible to a hearing-impaired user. The invention uses an auditory model of the human ear. A speech signal is put through a filter bank designed to simulate the cochlear filter shapes and filter spacing of a healthy cochlea. The energy output from each of a plurality of filters is computed and used to form an auditory spectrum. The peaks associated with strong first and second formants are identified, and the second formant is enhanced relative to the first formant by attenuating the first formant. Also, consonants in the speech signal are identified as having an energy level below a threshold associated with vowels, but above the threshold associated with silent regions. Consonant regions are amplified. The net effect is to provide more energy in regions of the second formant and the consonants to enhance the intelligibility of the speech signal.
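The consonant rule described in the abstract (energy below the vowel threshold but above the silence threshold) can be sketched as a two-threshold frame classifier. The threshold values, the use of total band energy per frame, and the single fixed `gain` are assumptions chosen for illustration; the abstract does not specify them.

```python
def classify_frame(energy, silence_threshold, vowel_threshold):
    """Label a frame from its total energy using two thresholds."""
    if energy >= vowel_threshold:
        return "vowel"
    if energy > silence_threshold:
        return "consonant"
    return "silence"

def amplify_consonants(frames, silence_threshold, vowel_threshold, gain):
    """Scale the auditory spectrum of consonant frames by `gain`.

    frames: list of auditory spectra, each a list of per-band energies.
    Vowel and silent frames are passed through unchanged, so the
    consonant/vowel intensity ratio rises when gain > 1.
    """
    out = []
    for spectrum in frames:
        label = classify_frame(sum(spectrum), silence_threshold, vowel_threshold)
        scale = gain if label == "consonant" else 1.0
        out.append([e * scale for e in spectrum])
    return out
```

Leaving silent frames untouched avoids amplifying line noise, which matters when the output must stay within the intensity limits of the telephone network.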
13 Claims
1. A method for processing a telephone speech signal, comprising steps a) through h) as recited under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8

9. A system for processing a telephone speech signal, the system comprising:
transforming means for transforming a digital representation of the speech signal into an auditory spectrum;
formant identification means for identifying regions within the auditory spectrum of strong first and second formants;
enhancement means for enhancing identified second formants relative to their respective first formants;
consonant identification means for identifying consonant regions within the auditory spectrum;
amplification means for amplifying the identified consonant regions to increase the consonant/vowel intensity ratio, the enhancement of the second formants and the amplification of the consonant regions producing a modified auditory spectrum;
mapping means for mapping the modified auditory spectrum to a Fourier spectrum;
converting means for converting the Fourier spectrum to the time domain using an inverse fast-Fourier transform; and
normalization means for normalizing the converted Fourier spectrum to provide a digital representation of a processed speech signal.
Dependent claims: 10, 11, 12, 13
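The formant identification and enhancement elements can be pictured with the sketch below, which follows the abstract's approach of raising the second formant relative to the first by attenuating the first. How "strong" formants are selected is not stated in the claims; treating the two largest local maxima of the auditory spectrum as the first and second formants, and the `attenuation` and `width` parameters, are assumptions for illustration only.

```python
def find_peaks(spectrum):
    """Indices of local maxima in a per-band energy list."""
    return [
        i for i in range(1, len(spectrum) - 1)
        if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]
    ]

def enhance_second_formant(spectrum, attenuation=0.5, width=1):
    """Attenuate the bands around the first formant peak.

    Assumption: the two strongest peaks are the first and second
    formants, the lower-frequency one being the first formant.
    Returns a new list; the input is left unchanged.
    """
    peaks = find_peaks(spectrum)
    if len(peaks) < 2:
        return list(spectrum)  # no formant pair found, nothing to do
    strongest = sorted(peaks, key=lambda i: spectrum[i], reverse=True)[:2]
    f1, f2 = sorted(strongest)
    out = list(spectrum)
    # scale bands near the first formant, never reaching the second formant band
    for i in range(max(0, f1 - width), min(f2, f1 + width + 1)):
        out[i] *= attenuation
    return out
```

For a spectrum such as `[1.0, 5.0, 2.0, 1.0, 4.0, 1.0, 0.5]`, the peaks at bands 1 and 4 are taken as the formant pair, and halving the bands around band 1 raises the second-to-first formant ratio from 0.8 to 1.6.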
Specification