Method and apparatus for performing speech frame encoding mode selection in a variable rate encoding system
Abstract
It is an objective of the present invention to provide an optimized method of selection of the encoding mode that provides rate efficient coding of the input speech. It is a second objective of the present invention to identify and provide a means for generating a set of parameters ideally suited for this operational mode selection. Third, it is an objective of the present invention to provide identification of two separate conditions that allow low rate coding with minimal sacrifice to quality. The two conditions are the coding of unvoiced speech and the coding of temporally masked speech. It is a fourth objective of the present invention to provide a method for dynamically adjusting the average output data rate of the speech coder with minimal impact on speech quality.
86 Citations
10 Claims
1. An apparatus for selecting an encoding rate from a predetermined set of encoding rates and for encoding a frame of speech including a plurality of speech samples, comprising:
means, responsive to said speech samples and to at least one signal derived from said speech samples, for generating a set of parameters indicative of characteristics of said frame of speech; and
means for receiving said set of parameters, for determining the psychoacoustic significance of said speech samples in accordance with said set of parameters, and for selecting an encoding rate from said predetermined set of encoding rates using predetermined rate selection rules.
2. An apparatus for selecting an encoding rate from a predetermined set of encoding rates and for encoding a frame of speech including a plurality of speech samples, comprising:
a mode measurement calculator that generates a set of parameters indicative of characteristics of said frame of speech in accordance with said speech samples and a signal derived from said speech samples; and
a rate determination logic for receiving said set of parameters, for determining the psychoacoustic significance of said speech samples in accordance with said set of parameters, and for selecting an encoding rate from said predetermined set of encoding rates.
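The two elements recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation. The specific parameters (zero-crossing count, a prediction gain computed from a first-difference residual, and the energy drop relative to the previous frame), the threshold values, the rate labels, and the names `measure_frame` and `select_rate` are all assumptions made for illustration. The two low-rate conditions mirror the ones named in the abstract: unvoiced speech and temporally masked speech.

```python
import math
from dataclasses import dataclass
from typing import Sequence

FULL_RATE, HALF_RATE, QUARTER_RATE = "full", "half", "quarter"

# Hypothetical thresholds, chosen only to make the example concrete.
MASK_DROP_DB = 12.0           # energy drop that is treated as temporal masking
ZERO_CROSSING_MIN = 40        # zero crossings per frame that suggest noise-like speech
PREDICTION_GAIN_MAX_DB = 3.0  # low prediction gain also suggests noise-like speech
EPS = 1e-12                   # avoids log(0) and division by zero on silent frames


@dataclass
class FrameParams:
    zero_crossings: int
    prediction_gain_db: float
    energy_drop_db: float


def mean_square(x: Sequence[float]) -> float:
    return sum(v * v for v in x) / len(x) if x else 0.0


def measure_frame(samples: Sequence[float], prev_energy: float) -> FrameParams:
    """Stand-in for the mode measurement calculator."""
    # Signal derived from the speech samples: a first-difference residual,
    # used here as a crude substitute for a linear prediction residual.
    residual = [b - a for a, b in zip(samples, samples[1:])]
    energy = mean_square(samples)
    zero_crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    gain_db = 10 * math.log10((energy + EPS) / (mean_square(residual) + EPS))
    drop_db = 10 * math.log10((prev_energy + EPS) / (energy + EPS))
    return FrameParams(zero_crossings, gain_db, drop_db)


def select_rate(p: FrameParams) -> str:
    """Stand-in for the rate determination logic (hypothetical rules)."""
    if p.energy_drop_db > MASK_DROP_DB:
        return HALF_RATE      # quiet frame following a loud one: temporally masked
    if (p.zero_crossings > ZERO_CROSSING_MIN
            and p.prediction_gain_db < PREDICTION_GAIN_MAX_DB):
        return QUARTER_RATE   # noise-like frame: treated as unvoiced
    return FULL_RATE
```

Under these assumed rules, a 160-sample frame of a steady low-frequency tone is coded at full rate, a frame whose samples alternate in sign is classified as unvoiced, and a quiet frame that follows a much louder one is classified as temporally masked.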
3. In a communication system wherein a remote station communicates with a central communication center, a subsystem for dynamically changing the transmission rate of a frame of speech transmitting from said remote station, comprising:
means, responsive to said speech frame and to a signal derived from said speech frame, for generating a set of parameters indicative of characteristics of said speech frame; and
means for receiving said set of parameters, for determining the psychoacoustic significance of said speech samples in accordance with said set of parameters, for receiving a rate command signal for generating at least one threshold value in accordance with said rate command signal, for comparing at least one parameter of said set of parameters with said at least one threshold value, and for selecting an encoding rate in accordance with said comparison.
4. In a communication system wherein a remote station communicates with a central communication center, a subsystem for dynamically changing the transmission rate of a frame of speech transmitting from said remote station, comprising:
a mode measurement calculator that generates a set of parameters indicative of characteristics of said frame of speech in accordance with said speech samples and a signal derived from said speech samples; and
a rate determination logic that receives said set of parameters for determining the psychoacoustic significance of said speech samples in accordance with said set of parameters, receives a rate command signal for generating at least one threshold value in accordance with said rate command signal, compares at least one parameter of said set of parameters with said at least one threshold value, and selects an encoding rate in accordance with said comparison.
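The sequence recited in claims 3 and 4 (receive a rate command, generate a threshold from it, compare a parameter with that threshold, select a rate) might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the integer form of the rate command, the linear mapping, the constants `BASE_THRESHOLD_DB` and `STEP_DB`, and the choice of a target matching signal to noise ratio as the compared parameter are all hypothetical.

```python
FULL_RATE, HALF_RATE = "full", "half"

# Hypothetical constants for the command-to-threshold mapping.
BASE_THRESHOLD_DB = 10.0  # threshold used when no rate reduction is requested
STEP_DB = 2.0             # threshold change per rate command step


def threshold_from_rate_command(rate_command: int) -> float:
    """Generate a threshold from the rate command.

    Assumption: rate_command counts the requested reduction steps, so a
    larger command lowers the threshold and lets more frames qualify for
    the lower rate.
    """
    return BASE_THRESHOLD_DB - STEP_DB * rate_command


def select_rate(tmsnr_db: float, rate_command: int) -> str:
    """Compare one parameter with the generated threshold and pick a rate."""
    threshold_db = threshold_from_rate_command(rate_command)
    # A parameter value above the threshold is taken to mean the speech
    # model tracks the frame well enough for the lower rate.
    return HALF_RATE if tmsnr_db > threshold_db else FULL_RATE
```

With these assumed constants, a frame measuring 9 dB is coded at full rate under rate command 0 (threshold 10 dB) and at half rate under rate command 1 (threshold 8 dB), which is how a single command signal can shift the average rate without changing the measurement itself.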
5. A method for selecting an encoding rate of a predetermined set of encoding rates for encoding a frame of speech including a plurality of speech samples, comprising:
generating a set of parameters indicative of characteristics of said frame of speech in accordance with said speech samples and with a signal derived from said speech samples; and
selecting an encoding rate from said predetermined set of encoding rates in accordance with said set of parameters, said set of parameters for determining the psychoacoustic significance of said speech samples.
6. A method for adjusting the average data rate of a variable rate encoder that encodes speech frames based on how well a speech model tracks the speech frames as determined by information from a target matching signal to noise ratio (TMSNR) element communicatively coupled to the variable rate encoder, the method comprising:
increasing a threshold value for an output of the TMSNR element, wherein if the output of the TMSNR element does not exceed the increased threshold value then the average data rate of the speech frames will be increased by the variable rate encoder; and
decreasing the threshold value for the output of the TMSNR element, wherein if the output of the TMSNR element exceeds the decreased threshold value then the average data rate of the speech frames will be decreased by the variable rate encoder.
7. The method of claim 6, further comprising:
estimating the number of speech frames that needs to be encoded at a full rate rather than a half rate to increase the average data rate of the speech frames.
8. The method of claim 7, wherein estimating the number of speech frames comprises using a histogram in which a plurality of differences between possible output values of the TMSNR element and a current value of the threshold value are stored, wherein the plurality of differences are used to determine how many speech frames need to be encoded at the half rate.
9. The method of claim 6, further comprising:
estimating the number of speech frames that needs to be encoded at a half rate rather than a full rate to decrease the average data rate of the speech frames.
10. The method of claim 9, wherein estimating the number of speech frames comprises using a histogram in which a plurality of differences between possible output values of the TMSNR element and a current value of the threshold value are stored, wherein the plurality of differences are used to determine how many speech frames need to be encoded at the full rate.
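The histogram-based adjustment of claims 6 through 10 can be pictured with the following Python sketch. It is one possible reading rather than the patented procedure. The 1 dB bin width, the binning rule, and the names `build_histogram` and `threshold_step` are assumptions. Each bin counts recent frames by the difference between their TMSNR output and the current threshold; frames with a positive difference exceed the threshold and are assumed to be coded at half rate.

```python
import math
from collections import Counter
from typing import Iterable


def build_histogram(tmsnr_values: Iterable[float], threshold: float,
                    bin_width: float = 1.0) -> Counter:
    """Count frames per bin of (TMSNR output - threshold).

    Bin b holds differences in ((b - 1) * bin_width, b * bin_width], so
    bins >= 1 are frames that exceed the threshold (assumed half rate)
    and bins <= 0 are frames that do not (assumed full rate).
    """
    return Counter(math.ceil((v - threshold) / bin_width) for v in tmsnr_values)


def threshold_step(hist: Counter, frames_to_switch: int, to_full_rate: bool,
                   bin_width: float = 1.0) -> float:
    """Estimate the threshold change that switches the requested frame count.

    A positive result raises the threshold (more full-rate frames, higher
    average rate); a negative result lowers it (more half-rate frames).
    If too few frames are available, the largest useful step is returned.
    """
    if frames_to_switch <= 0:
        return 0.0
    if to_full_rate:
        bins = sorted(b for b in hist if b >= 1)                 # nearest first
    else:
        bins = sorted((b for b in hist if b <= 0), reverse=True)  # nearest first
    step, moved = 0.0, 0
    for b in bins:
        moved += hist[b]
        step = b * bin_width if to_full_rate else (b - 1) * bin_width
        if moved >= frames_to_switch:
            break
    return step


def count_half_rate(tmsnr_values: Iterable[float], threshold: float) -> int:
    """Frames whose TMSNR output exceeds the threshold."""
    return sum(1 for v in tmsnr_values if v > threshold)
```

As a usage example, take recent TMSNR outputs of 6, 8, 9, 11, 12, and 15 dB with a threshold of 10 dB, so three frames exceed the threshold. Asking to switch two frames to full rate yields a step of +2 dB (new threshold 12 dB, one frame still exceeding it). Asking to switch two frames to half rate yields a step of -3 dB (new threshold 7 dB, five frames exceeding it).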
Specification