Speech encoder using voice activity detection in coding noise
Abstract
A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters are generated for higher quality decoding and reproduction. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. The speech coder distinguishes various voice signals as a function of their voice content. For example, a Voice Activity Detection (VAD) algorithm selects an appropriate coding scheme depending on whether the speech signal comprises active or inactive speech. The encoder may consider varying characteristics of the speech signal including sharpness, a delay correlation, a zero-crossing rate, and a residual energy. In another embodiment of the present invention, code excited linear prediction is used for voice active signals whereas random excitation is used for voice inactive signals; the energy level and spectral content of the voice inactive signal may also be used for noise coding.
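The classification step described in the abstract can be illustrated with a minimal sketch. It uses only two of the listed characteristics (zero-crossing rate and frame energy, the latter standing in for residual energy) and fixed thresholds. The function names, the threshold values, and the scheme labels `"celp"` and `"noise"` are hypothetical and chosen for illustration; they are not taken from the specification, and a practical VAD would combine more features and adapt its thresholds.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def classify_frame(frame, energy_threshold=0.01, zcr_threshold=0.3):
    """Toy voice activity decision.

    A frame counts as 'active' when it is energetic and its
    zero-crossing rate is low (less noise-like). Both thresholds
    are hypothetical placeholders, not values from the patent.
    """
    energetic = frame_energy(frame) > energy_threshold
    voiced_like = zero_crossing_rate(frame) < zcr_threshold
    return "active" if energetic and voiced_like else "inactive"


def select_scheme(frame):
    """Map the VAD decision to a coding scheme label (labels are illustrative)."""
    return "celp" if classify_frame(frame) == "active" else "noise"
```

A caller would split the input into frames (for example 160 samples at 8 kHz, again an assumption) and call `select_scheme` once per frame to decide which encoder to run.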
20 Claims
1. A speech encoding system using an analysis by synthesis approach on a speech signal having varying characteristics, the speech encoding system comprising:
an encoder processing circuit that selectively applies a first or a second encoding scheme upon identification of varying characteristics of the speech signal;
where the varying characteristics are utilized to classify the speech signal as having one of active voice content and inactive voice content;
the first encoding scheme utilizes a first analysis-by-synthesis speech coding approach on a speech signal classified as active voice content; and
the second encoding scheme utilizes a second analysis-by-synthesis speech coding approach on a speech signal classified as inactive voice content, the inactive voice content comprising background noise.
Dependent claims: 2, 3, 4, 5, 11, 12, 13, 14
6. A speech encoding system for processing a speech signal having varying characteristics, the speech encoding system comprising:
an encoder processing circuit that selectively applies a first or a second analysis-by-synthesis encoding scheme based upon at least one of the varying characteristics of the speech signal;
the encoder processing circuit applies the first analysis-by-synthesis encoding scheme following identification of an active voice frame of the speech signal; and
the encoder processing circuit applies the second analysis-by-synthesis encoding scheme following identification of an inactive voice frame of the speech signal, the inactive voice frame comprising background noise.
Dependent claims: 7, 8, 9, 10, 15
16. A method of encoding a speech signal comprising:
classifying the speech signal as having one of active voice content and inactive voice content, the inactive voice content comprising background noise;
applying a first encoding scheme comprising analysis-by-synthesis when the speech signal is classified as having active voice content; and
applying a second encoding scheme comprising analysis-by-synthesis when the speech signal is classified as having inactive voice content.
Dependent claims: 17, 18, 19, 20
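The two-branch method of claim 16 can be sketched as a dispatch between two analysis-by-synthesis searches. In this sketch, "analysis-by-synthesis" means that each candidate excitation is passed through a synthesis filter and the candidate whose output is closest to the target is kept. The one-pole filter, the randomly generated codebooks, their sizes, and all names below are assumptions made for illustration; the claim does not specify any of them.

```python
import random


def synthesize(excitation, a=0.5):
    """One-pole filter y[n] = x[n] + a*y[n-1]; a stand-in for LPC synthesis."""
    out, prev = [], 0.0
    for x in excitation:
        prev = x + a * prev
        out.append(prev)
    return out


def analysis_by_synthesis(target, codebook):
    """Return (index, gain, error) for the best-matching codebook entry.

    Each entry is synthesized, scaled by its least-squares optimal gain,
    and compared with the target by squared error.
    """
    best = None
    for index, vector in enumerate(codebook):
        synth = synthesize(vector)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue  # an all-zero candidate cannot match anything
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


def random_codebook(size, length, seed):
    """Seeded Gaussian codebook so encoder and decoder could regenerate it."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(size)]


# Hypothetical sizes: a larger codebook for active voice, a smaller one for noise.
ACTIVE_CODEBOOK = random_codebook(64, 40, seed=1)
INACTIVE_CODEBOOK = random_codebook(8, 40, seed=2)


def encode_frame(target, is_active):
    """Apply the first scheme to active frames and the second to inactive ones."""
    codebook = ACTIVE_CODEBOOK if is_active else INACTIVE_CODEBOOK
    index, gain, error = analysis_by_synthesis(target, codebook)
    scheme = "first" if is_active else "second"
    return {"scheme": scheme, "index": index, "gain": gain, "error": error}
```

Giving the inactive branch a smaller search space is one way such a design could spend fewer bits on background noise; whether the patented encoder does this is not stated in the claims shown here.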
Specification