System for speech encoding having an adaptive encoding arrangement
Abstract
In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or a second encoding scheme based upon the detection or absence of a triggering characteristic in an interval of an input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame-by-sub-frame basis. The long-term prediction mode is tailored to cases where the generally periodic component of the speech is generally not stationary or is less than completely periodic, and therefore requires more frequent updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.
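The selection step described in the abstract can be sketched as a simple two-way branch. This is only an illustration: the names `Scheme` and `select_scheme` are hypothetical, and the mapping of a detected characteristic to the first scheme is taken from the wording of claim 23, not from any reference implementation.

```python
from enum import Enum


class Scheme(Enum):
    """Hypothetical labels for the two encoding schemes named in the abstract."""
    FIRST = "pitch pre-processing"              # revised signal biased toward voiced/stationary
    SECOND = "sub-frame long-term prediction"   # pitch encoded sub-frame by sub-frame


def select_scheme(triggering_detected: bool) -> Scheme:
    """Choose a scheme from the detector's decision for one interval."""
    return Scheme.FIRST if triggering_detected else Scheme.SECOND


# One detector decision per frame interval (made-up values for illustration).
decisions = [True, True, False, True]
print([select_scheme(d).name for d in decisions])
```

How the detector itself decides is outside this sketch; the abstract only states that the selection depends on the detection or absence of the triggering characteristic.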
57 Claims
1. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a triggering characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the speech signal for a frame associated with the interval, the first encoding scheme having a pre-processing procedure for processing the inputted speech signal to form a revised speech signal biased toward a generally ideal voiced and stationary characteristic; and
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the input speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 25 bits for filter coefficient indicators, 1 bit for a type indicator, 8 bits for an adaptive codebook index, 120 bits for a fixed codebook index, 6 bits for an adaptive codebook gain, and 10 bits for a fixed codebook gain.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 57
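As a quick arithmetic check, the fields recited for the first frame type in claim 1 can be summed to give the size of one coded frame. The dictionary keys below are illustrative names chosen for this sketch, not terms defined by the claim.

```python
# Bit allocation of the first frame type as recited in claim 1.
# Key names are hypothetical; only the bit counts come from the claim.
FIRST_FRAME_TYPE_BITS = {
    "filter_coefficient_indicators": 25,
    "type_indicator": 1,
    "adaptive_codebook_index": 8,
    "fixed_codebook_index": 120,
    "adaptive_codebook_gain": 6,
    "fixed_codebook_gain": 10,
}

# Total number of bits the recited fields occupy in one frame.
total_bits = sum(FIRST_FRAME_TYPE_BITS.values())
print(total_bits)  # 170
```

The claim does not state a frame duration, so the total is left in bits per frame and is not converted to a bit rate.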
13. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a generally voiced and generally stationary characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the input speech signal for a frame associated with the interval, the second encoding scheme having a long-term prediction procedure for processing the input speech signal on a sub-frame-by-sub-frame basis;
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon said detection or absence of the generally voiced and generally stationary characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the input speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 25 bits for filter coefficient indicators, 1 bit for a type indicator, 8 bits for an adaptive codebook index, 120 bits for a fixed codebook index, 6 bits for an adaptive codebook gain, and 10 bits for a fixed codebook gain.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A speech encoding method comprising the steps of:
detecting whether an input speech signal has a triggering characteristic during an interval;
selecting one of a first encoding scheme and a second encoding scheme, for application to the input speech signal for a frame associated with the interval, based upon said detection of the triggering characteristic; and
processing the inputted speech signal in accordance with the first encoding scheme to form a revised input speech signal biased toward a generally ideal voiced and stationary characteristic if the triggering characteristic is detected in the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the input speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 25 bits for filter coefficient indicators, 1 bit for a type indicator, 8 bits for an adaptive codebook index, 120 bits for a fixed codebook index, 6 bits for an adaptive codebook gain, and 10 bits for a fixed codebook gain.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32
33. A speech encoding method comprising:
receiving a speech frame for encoding;
classifying said speech frame as a voiced speech frame if said speech frame includes a voiced speech component;
designating said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, designating said voiced speech frame as a non-stationary voiced speech frame; and
allocating a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
allocating a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
determining whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate;
using a first frame type to encode said stationary voiced speech frame if said encoding rate is said high encoding rate;
using a third frame type to encode said stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said first frame type allocates 25 bits for filter coefficient indicators, 1 bit for a type indicator, 8 bits for said adaptive codebook index, 120 bits for said fixed codebook index, 6 bits for an adaptive codebook gain, and 10 bits for a fixed codebook gain.
Dependent claims: 34, 35
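The classifying, designating, and rate-dependent steps of claim 33 can be read as a short decision procedure. The sketch below is one possible reading under stated assumptions: the function names, the boolean inputs, and the "other" label are hypothetical, and the claim itself does not say how the voiced, stationary, or rate decisions are computed.

```python
from typing import Optional


def classify(has_voiced_component: bool, generally_stationary: bool) -> str:
    """Classify a frame following the classifying and designating steps of the claim."""
    if not has_voiced_component:
        # The claim does not say how frames without a voiced component are handled.
        return "other"
    return "stationary voiced" if generally_stationary else "non-stationary voiced"


def frame_type_for(frame_class: str, high_rate: bool) -> Optional[str]:
    """Frame type for a stationary voiced frame; None for classes claim 33 leaves open."""
    if frame_class != "stationary voiced":
        return None
    return "first" if high_rate else "third"


frame_class = classify(has_voiced_component=True, generally_stationary=True)
print(frame_class, frame_type_for(frame_class, high_rate=True))
```

Claim 33 recites frame types only for stationary voiced frames, which is why the second function returns `None` for every other class instead of guessing.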
36. A speech encoding system comprising:
a receiver configured to receive a speech frame for encoding;
a classifier configured to classify said speech frame as a voiced speech frame if said speech frame includes a voiced speech component, said classifier further configured to designate said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, said classifier designates said voiced speech frame as a non-stationary voiced speech frame;
wherein said encoder is further configured to allocate a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to:
determine whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate, use a first frame type to encode said stationary voiced speech frame if said encoding rate is said high encoding rate, and use a third frame type to encode said stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said first frame type allocates 25 bits for filter coefficient indicators, 1 bit for a type indicator, 8 bits for said adaptive codebook index, 120 bits for said fixed codebook index, 6 bits for an adaptive codebook gain, and 10 bits for a fixed codebook gain.
Dependent claims: 37, 38
39. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a triggering characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the speech signal for a frame associated with the interval, the first encoding scheme having a pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward a generally ideal voiced and stationary characteristic; and
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 27 bits for filter coefficient indicators, 1 bit for a type indicator, 26 bits for an adaptive codebook index, 88 bits for a fixed codebook index, and 28 bits for an adaptive codebook gain and a fixed codebook gain.
40. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a triggering characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the speech signal for a frame associated with the interval, the first encoding scheme having a pre-processing procedure for processing the inputted speech signal to form a revised speech signal biased toward a generally ideal voiced and stationary characteristic; and
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the input speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 7 bits for an adaptive codebook index, 39 bits for a fixed codebook index, 4 bits for an adaptive codebook gain, and 8 bits for a fixed codebook gain.
Dependent claim: 41
42. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a triggering characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the input speech signal for a frame associated with the interval, the first encoding scheme having a pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward a generally ideal voiced and stationary characteristic; and
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 14 bits for an adaptive codebook index, 30 bits for a fixed codebook index, and 14 bits for an adaptive codebook gain and a fixed codebook gain.
43. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a generally voiced and generally stationary characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the input speech signal for a frame associated with the interval, the second encoding scheme having a long-term prediction procedure for processing the inputted speech signal on a sub-frame-by-sub-frame basis;
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon said detection or absence of the generally voiced and generally stationary characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the input speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 27 bits for filter coefficient indicators, 1 bit for a type indicator, 26 bits for an adaptive codebook index, 88 bits for a fixed codebook index, and 28 bits for an adaptive codebook gain and a fixed codebook gain.
44. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a generally voiced and generally stationary characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the input speech signal for a frame associated with the interval, the second encoding scheme having a long-term prediction procedure for processing the input speech signal on a sub-frame-by-sub-frame basis;
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon said detection or absence of the generally voiced and generally stationary characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the input speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 7 bits for an adaptive codebook index, 39 bits for a fixed codebook index, 4 bits for an adaptive codebook gain, and 8 bits for a fixed codebook gain.
Dependent claim: 45
46. A speech encoding system comprising:
a detector for detecting whether an input speech signal generally has a generally voiced and generally stationary characteristic during an interval;
an encoder supporting at least one of a first encoding scheme and a second encoding scheme applicable to the input speech signal for a frame associated with the interval, the second encoding scheme having a long-term prediction procedure for processing the input speech signal on a sub-frame-by-sub-frame basis;
a selector for selecting one of the first encoding scheme and the second encoding scheme based upon said detection or absence of the generally voiced and generally stationary characteristic in the interval of the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the input speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 14 bits for an adaptive codebook index, 30 bits for a fixed codebook index, and 14 bits for an adaptive codebook gain and a fixed codebook gain.
47. A speech encoding method comprising the steps of:
detecting whether an input speech signal has a triggering characteristic during an interval;
selecting one of a first encoding scheme and a second encoding scheme, for application to the input speech signal for a frame associated with the interval, based upon said detection of the triggering characteristic; and
processing the input speech signal in accordance with the first encoding scheme to form a revised speech signal biased toward a generally ideal voiced and stationary characteristic if the triggering characteristic is detected in the input speech signal;
wherein the first encoding scheme uses a first frame type for coding the speech signal at a selected rate and the second encoding scheme uses a second frame type for coding the speech signal at the same selected rate, wherein the second frame type is different from the first frame type;
wherein said first frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 7 bits for an adaptive codebook index, 39 bits for a fixed codebook index, 4 bits for an adaptive codebook gain, and 8 bits for a fixed codebook gain.
Dependent claim: 48
49. A speech encoding method comprising:
receiving a speech frame for encoding;
classifying said speech frame as a voiced speech frame if said speech frame includes a voiced speech component;
designating said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, designating said voiced speech frame as a non-stationary voiced speech frame; and
allocating a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
allocating a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
determining whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate;
using a first frame type to encode said stationary voiced speech frame if said encoding rate is said high encoding rate;
using a third frame type to encode said stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said third frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 7 bits for said adaptive codebook index, 39 bits for said fixed codebook index, 4 bits for an adaptive codebook gain, and 8 bits for a fixed codebook gain.
50. A speech encoding method comprising:
receiving a speech frame for encoding;
classifying said speech frame as a voiced speech frame if said speech frame includes a voiced speech component;
designating said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, designating said voiced speech frame as a non-stationary voiced speech frame; and
allocating a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
allocating a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
determining whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate;
using a second frame type to encode said non-stationary voiced speech frame if said encoding rate is said high encoding rate;
using a fourth frame type to encode said non-stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said second frame type allocates 27 bits for filter coefficient indicators, 1 bit for a type indicator, 26 bits for said adaptive codebook index, 88 bits for said fixed codebook index, and 28 bits for an adaptive codebook gain and a fixed codebook gain.
Dependent claim: 51
52. A speech encoding method comprising:
receiving a speech frame for encoding;
classifying said speech frame as a voiced speech frame if said speech frame includes a voiced speech component;
designating said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, designating said voiced speech frame as a non-stationary voiced speech frame; and
allocating a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
allocating a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
determining whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate;
using a second frame type to encode said non-stationary voiced speech frame if said encoding rate is said high encoding rate;
using a fourth frame type to encode said non-stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said fourth frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 14 bits for said adaptive codebook index, 30 bits for said fixed codebook index, and 14 bits for an adaptive codebook gain and a fixed codebook gain.
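Claims 33, 49, 50, and 52 together recite four frame types, indexed by stationarity and by encoding rate. The sketch below collects those allocations in one table and checks the two relations the claims state (fewer adaptive codebook index bits and more fixed codebook index bits for stationary voiced frames), plus the observation that the two frame types at each rate add up to the same total. The `Allocation` type, its field names, and the merging of the two gain fields into one number are assumptions made for this illustration.

```python
from typing import NamedTuple


class Allocation(NamedTuple):
    """Bits per field; the two codebook gains are merged into one field (an assumption)."""
    filter_coeff: int
    type_indicator: int
    adaptive_index: int
    fixed_index: int
    gains: int

    @property
    def total(self) -> int:
        return sum(self)


# (stationarity, rate) -> (frame type name, allocation), using the numbers recited in the claims.
FRAME_TYPES = {
    ("stationary", "high"): ("first", Allocation(25, 1, 8, 120, 6 + 10)),    # claim 33
    ("non-stationary", "high"): ("second", Allocation(27, 1, 26, 88, 28)),   # claim 50
    ("stationary", "low"): ("third", Allocation(21, 1, 7, 39, 4 + 8)),       # claim 49
    ("non-stationary", "low"): ("fourth", Allocation(21, 1, 14, 30, 14)),    # claim 52
}

for rate in ("high", "low"):
    _, stationary = FRAME_TYPES[("stationary", rate)]
    _, non_stationary = FRAME_TYPES[("non-stationary", rate)]
    assert stationary.adaptive_index < non_stationary.adaptive_index  # lesser number of bits
    assert stationary.fixed_index > non_stationary.fixed_index        # greater number of bits
    assert stationary.total == non_stationary.total                   # same frame size per rate
    print(rate, stationary.total)
```

With the recited numbers, both high-rate frame types total 170 bits and both low-rate frame types total 80 bits, so the bits saved on the adaptive codebook index for stationary frames are spent mostly on the fixed codebook index.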
53. A speech encoding system comprising:
a receiver configured to receive a speech frame for encoding;
a classifier configured to classify said speech frame as a voiced speech frame if said speech frame includes a voiced speech component, said classifier further configured to designate said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, said classifier designates said voiced speech frame as a non-stationary voiced speech frame; and
an encoder configured to allocate a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to allocate a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to:
determine whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate, use a first frame type to encode said stationary voiced speech frame if said encoding rate is said high encoding rate, and use a third frame type to encode said stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said third frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 7 bits for said adaptive codebook index, 39 bits for said fixed codebook index, 4 bits for an adaptive codebook gain, and 8 bits for a fixed codebook gain.
54. A speech encoding system comprising:
a receiver configured to receive a speech frame for encoding;
a classifier configured to classify said speech frame as a voiced speech frame if said speech frame includes a voiced speech component, said classifier further configured to designate said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, said classifier designates said voiced speech frame as a non-stationary voiced speech frame; and
an encoder configured to allocate a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to allocate a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to:
determine whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate, use a second frame type to encode said non-stationary voiced speech frame if said encoding rate is said high encoding rate, and use a fourth frame type to encode said non-stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said second frame type allocates 27 bits for filter coefficient indicators, 1 bit for a type indicator, 26 bits for said adaptive codebook index, 88 bits for said fixed codebook index, and 28 bits for an adaptive codebook gain and a fixed codebook gain.
Dependent claim: 55
56. A speech encoding system comprising:
a receiver configured to receive a speech frame for encoding;
a classifier configured to classify said speech frame as a voiced speech frame if said speech frame includes a voiced speech component, said classifier further configured to designate said voiced speech frame as a stationary voiced speech frame if said voiced speech frame is generally stationary, otherwise, said classifier designates said voiced speech frame as a non-stationary voiced speech frame; and
an encoder configured to allocate a lesser number of bits for an adaptive codebook index of said stationary voiced speech frame than for an adaptive codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to allocate a greater number of bits for a fixed codebook index of said stationary voiced speech frame than for a fixed codebook index of said non-stationary voiced speech frame;
wherein said encoder is further configured to:
determine whether an encoding rate for encoding said speech frame is a high encoding rate or a low encoding rate, use a second frame type to encode said non-stationary voiced speech frame if said encoding rate is said high encoding rate, and use a fourth frame type to encode said non-stationary voiced speech frame if said encoding rate is said low encoding rate;
wherein said fourth frame type allocates 21 bits for filter coefficient indicators, 1 bit for a type indicator, 14 bits for said adaptive codebook index, 30 bits for said fixed codebook index, and 14 bits for an adaptive codebook gain and a fixed codebook gain.
Specification