Voice-activity detection using energy ratios and periodicity
Abstract
A voice activity detector (100) filters (204) out noise energy and then computes a high-frequency (2400 Hz to 4000 Hz) versus low-frequency (100 Hz to 2400 Hz) signal energy ratio (224), total voiceband (100 Hz to 4000 Hz) signal energy (214), and signal periodicity (208) on successive frames of signal samples. Signal periodicity is determined by estimating the pitch period (206) of the signal, determining a gain value of the signal over the pitch period as a function of the estimated pitch period, and estimating a periodicity of the signal over the pitch period as a function of the estimated pitch period and the gain value. Voice is detected (230–232) in a segment if either (a) the difference between the average high-frequency versus low-frequency signal energy ratio and the present segment's high-frequency versus low-frequency energy ratio either exceeds (310) a high threshold value or is exceeded (312) by a low threshold value, or (b) the average periodicity of the signal is lower (306) than a low threshold value, or (c) the difference between the average total signal energy and the present segment's total energy exceeds (304) a threshold value and the average periodicity of the signal is lower (304) than a high threshold value, or (d) the average total signal energy exceeds (412) a minimum average total signal energy by a threshold value and voice has been detected (410) in the preceding segment.
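The four detection conditions (a) to (d) in the abstract can be read as a single boolean rule. The following Python sketch is illustrative only: the function name, the `Thresholds` container, and every numeric value are hypothetical, since the abstract gives no threshold values, and each difference is assumed to be taken as average minus present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical tuning values; the abstract does not state any numbers."""
    ratio_high: float = 0.5        # (a) high threshold on the ratio difference
    ratio_low: float = -0.5        # (a) low threshold on the ratio difference
    periodicity_low: float = 0.2   # (b) low threshold on average periodicity
    periodicity_high: float = 0.6  # (c) high threshold on average periodicity
    energy_diff: float = 3.0       # (c) threshold on the total-energy difference
    energy_margin: float = 6.0     # (d) margin above the minimum average energy


def voice_detected(avg_ratio, ratio, avg_periodicity, avg_energy, energy,
                   min_avg_energy, prev_voice, t=Thresholds()):
    """Return True if any of conditions (a) to (d) from the abstract holds."""
    # Assumption: differences are computed as (average - present).
    ratio_diff = avg_ratio - ratio

    # (a) the difference exceeds the high threshold or is exceeded by the low one
    cond_a = ratio_diff > t.ratio_high or ratio_diff < t.ratio_low
    # (b) the average periodicity is lower than the low threshold
    cond_b = avg_periodicity < t.periodicity_low
    # (c) energy difference over threshold AND periodicity under the high threshold
    cond_c = ((avg_energy - energy) > t.energy_diff
              and avg_periodicity < t.periodicity_high)
    # (d) average energy well above its minimum AND voice in the preceding segment
    cond_d = (avg_energy - min_avg_energy) > t.energy_margin and prev_voice

    return cond_a or cond_b or cond_c or cond_d
```

With these placeholder thresholds, a frame whose ratio, periodicity, and energy all sit at their averages is classified as non-voice, while a ratio difference of 0.8 in either direction triggers condition (a).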
45 Claims
1. A method of voice activity detection comprising:
receiving a communications signal comprising multiple frequencies;
processing the signals to determine a difference between (a) an average ratio of energy above a first threshold frequency in the signal and energy below the first threshold frequency in the signal and (b) a present ratio of energy above the first threshold frequency in the signal and energy below the first threshold frequency in the signal; and
in response to the difference being exceeded by a first threshold value, indicating that the signal includes a voice signal; and
in response to the difference exceeding a second threshold value greater than the first threshold value, indicating that the signal includes a voice signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
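As an illustration of the comparison recited in claim 1, the sketch below computes a high-band to low-band energy ratio for one frame and applies the two-threshold test to the difference between an average ratio and the present ratio. It is a minimal sketch under stated assumptions, not the patented implementation: the 8000 Hz sample rate, the naive DFT, the function names, and the threshold values in the usage note are all hypothetical. Only the 2400 Hz split frequency comes from the abstract, and the abstract's 100 Hz lower band edge is not enforced beyond skipping the DC bin.

```python
import cmath
import math


def band_energy_ratio(frame, sample_rate=8000, split_hz=2400.0):
    """Energy above split_hz divided by energy at or below it (naive DFT)."""
    n = len(frame)
    high = 0.0
    low = 0.0
    # Skip the DC bin (k = 0) and stop at the Nyquist bin (k = n // 2).
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power = abs(coeff) ** 2
        if freq > split_hz:
            high += power
        else:
            low += power
    # Guard against division by zero for a frame with no low-band energy.
    return high / low if low > 0.0 else float("inf")


def ratio_test(avg_ratio, present_ratio, low_thr, high_thr):
    """Indicate voice when (avg - present) is below low_thr or above high_thr.

    Assumes low_thr < high_thr, matching the claim's "second threshold
    value greater than the first threshold value".
    """
    diff = avg_ratio - present_ratio
    return diff < low_thr or diff > high_thr
```

For example, with hypothetical thresholds of -0.5 and 0.5, `ratio_test(1.0, 2.0, -0.5, 0.5)` indicates voice because the difference of -1.0 is exceeded by the first threshold, and `ratio_test(1.0, 0.2, -0.5, 0.5)` indicates voice because the difference of 0.8 exceeds the second.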
16. An apparatus for detecting voice activity comprising:
means for determining an average ratio of energy above a first threshold frequency in a signal comprising multiple frequencies and energy below the first threshold frequency in the signal;
means for determining a present ratio of energy above the first threshold frequency in the signal and energy below the first threshold frequency in the signal;
means for determining a difference between the average ratio and the present ratio; and
means cooperative with the means for determining a difference and responsive to the difference being exceeded by a first threshold value, for indicating that the signal includes a voice signal, and further responsive to the difference exceeding a second threshold value greater than the first threshold value, for indicating that the signal includes a voice signal.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
31. A computer-readable medium containing executable instructions which, when executed in a computer, cause the computer to perform the steps of:
determining a difference between (a) an average ratio of energy above a first threshold frequency in a signal comprising multiple frequencies and energy below the first threshold frequency in the signal and (b) a present ratio of energy above the first threshold frequency in the signal and energy below the first threshold frequency in the signal; and
in response to the difference being exceeded by a first threshold value, indicating that the signal includes a voice signal; and
in response to the difference exceeding a second threshold value greater than the first threshold value, indicating that the signal includes a voice signal.
Dependent claims: 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45
Specification