Voice activity detection process and means for implementing said process
First Claim
1. In a system wherein at least one voice signal provided by a source via an input channel is coded to derive therefrom blocks of samples x(n) of predetermined duration, and short term power spectrum information, a voice activity detection process for discriminating active voice blocks from non active voice blocks, said process including, for each block of samples, the following steps:
(a) Setting an amplitude threshold VADTH;
(b) Processing the block of x(n) values to derive therefrom a signal energy representative information XM;
(c) First comparing XM to VADTH and adjusting said threshold accordingly;
(d) Second comparing XM to k·VADTH, where k is a predetermined numerical value and VADTH is the adjusted threshold, to derive therefrom a channel activity indication when XM is larger than k·VADTH, or an ambiguity indication otherwise, whereby a hangover timer is set upon activity detection or ambiguity resolution operations are to be performed upon ambiguity detection, which ambiguity resolution includes:
decreasing and testing said timer contents whereby a positive timer contents is indicative of an active voice block and a negative timer contents is still indicative of an ambiguity situation;
computing short term power spectrum information variation between the currently processed block and at least one previously processed block; and
comparing said short term power spectrum variation with a preset reference level, whereby the currently processed ambiguous block is considered inactive or active based on said comparison indication.
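The per-block steps recited in the claim can be sketched roughly as follows. This is an illustrative reading only, not the patented implementation. The claim does not fix how XM is computed, how VADTH is adjusted, how spectral variation is measured, or which way the final comparison goes, so the following are all assumptions made for the sketch: XM is taken as the peak absolute amplitude of the block, the threshold drops at once to a quieter block and creeps up by 1% otherwise, variation is the sum of absolute differences between consecutive spectral parameter sets, and variation above the reference level means "active". The names `VadState`, `adapt_threshold`, `spectral_variation` and `classify_block`, and all numeric defaults, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

VADTH_FLOOR = 1.0  # assumed lower bound so the threshold never collapses to zero


@dataclass
class VadState:
    vadth: float = 100.0        # step (a): amplitude threshold (assumed initial value)
    k: float = 2.0              # predetermined factor k (assumed value)
    hangover_blocks: int = 5    # hangover timer reload value (assumed)
    spectral_ref: float = 0.5   # preset reference level (assumed)
    timer: int = 0
    prev_spectrum: Optional[Sequence[float]] = None


def adapt_threshold(vadth: float, xm: float) -> float:
    """Step (c), assumed rule: follow quieter blocks down at once, creep up otherwise."""
    if xm < vadth:
        return max(xm, VADTH_FLOOR)
    return vadth * 1.01


def spectral_variation(cur: Sequence[float], prev: Optional[Sequence[float]]) -> float:
    """Assumed measure: sum of absolute differences between parameter sets."""
    if prev is None:
        return 0.0
    return sum(abs(a - b) for a, b in zip(cur, prev))


def classify_block(state: VadState, samples: Sequence[float],
                   spectrum: Sequence[float]) -> bool:
    """Return True if the block is judged active voice, False otherwise."""
    xm = max((abs(s) for s in samples), default=0.0)   # step (b)
    state.vadth = adapt_threshold(state.vadth, xm)     # step (c)
    if xm > state.k * state.vadth:                     # step (d): clear activity
        state.timer = state.hangover_blocks
        active = True
    else:                                              # ambiguity resolution
        state.timer -= 1
        if state.timer > 0:
            active = True                              # still inside the hangover
        else:
            variation = spectral_variation(spectrum, state.prev_spectrum)
            active = variation > state.spectral_ref    # assumed direction
    state.prev_spectrum = list(spectrum)
    return active
```

In this sketch a loud block reloads the hangover timer, so the quiet blocks that immediately follow are still reported as active; once the timer runs out, only a sufficiently large change in the spectral parameters keeps a low-energy block classified as active.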
Abstract
Speech signal presence is detected in a VAD (Voice Activity Detector) in two steps: (1) signal energy above a threshold indicates presence, while energy below the threshold indicates ambiguity; (2) ambiguity is resolved by testing the rate of change of spectral parameters.
8 Claims
Specification