Voice activity detector for audio signals
First Claim
1. A method for determining voice activity in an audio signal, the method comprising:
receiving a frame of an input audio signal, the input audio signal having a sample rate;
dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband;
filtering the lowest subband with a linear filter to reduce an energy of the lowest subband;
estimating a noise level for at least some of the plurality of subbands;
calculating a signal to noise ratio value for at least some of the plurality of subbands; and
determining a speech activity level based at least in part on an average of the calculated signal to noise ratio values and an average of an energy of at least some of the plurality of subbands,
wherein the method is performed with one or more computing devices.
Abstract
According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.
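The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything beyond the recited steps is an assumption made for illustration: FFT masking stands in for a filterbank, the subbands are 1 kHz wide so that their count follows the sample rate, the moving average has 4 taps, the noise level is tracked by exponential smoothing during inactive frames, the energy weights decrease linearly with subband index, and the final decision compares both averages against fixed thresholds. The names `split_subbands`, `moving_average`, and `SubbandVad` are hypothetical.

```python
import numpy as np


def split_subbands(frame, sample_rate, band_width_hz=1000.0):
    """Split a frame into equal-width subbands; the count follows the sample rate.

    Assumption: FFT masking stands in for a real filterbank.
    """
    nyquist = sample_rate / 2.0
    n_bands = max(2, int(nyquist // band_width_hz))
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    edges = np.linspace(0.0, nyquist, n_bands + 1)
    bands = []
    for i in range(n_bands):
        # The highest subband also includes the Nyquist bin.
        upper = freqs <= edges[i + 1] if i == n_bands - 1 else freqs < edges[i + 1]
        mask = (freqs >= edges[i]) & upper
        bands.append(np.fft.irfft(spectrum * mask, n=len(frame)))
    return bands


def moving_average(x, taps=4):
    """Linear (FIR) moving average filter; the tap count is an assumption."""
    return np.convolve(x, np.ones(taps) / taps, mode="same")


class SubbandVad:
    """Hypothetical detector combining mean subband SNR and weighted energy."""

    def __init__(self, sample_rate, noise_smoothing=0.95,
                 snr_threshold_db=6.0, energy_threshold=1e-3):
        self.sample_rate = sample_rate
        self.noise_smoothing = noise_smoothing
        self.snr_threshold_db = snr_threshold_db
        self.energy_threshold = energy_threshold
        self.noise = None  # per-subband noise level estimate

    def process(self, frame):
        bands = split_subbands(np.asarray(frame, dtype=float), self.sample_rate)
        # Filter only the lowest subband to reduce its energy.
        bands[0] = moving_average(bands[0])
        energy = np.array([np.mean(b ** 2) for b in bands]) + 1e-12
        if self.noise is None:
            # Assumption: the first frame contains noise only.
            self.noise = energy.copy()
        snr_db = 10.0 * np.log10(energy / self.noise)
        mean_snr_db = float(np.mean(snr_db))
        # Assumed weighting: lower subbands count more than higher ones.
        weights = np.linspace(1.0, 0.5, len(bands))
        weighted_energy = float(np.average(energy, weights=weights))
        active = (mean_snr_db > self.snr_threshold_db
                  and weighted_energy > self.energy_threshold)
        if not active:
            # Update the noise estimate only while no speech is detected.
            a = self.noise_smoothing
            self.noise = a * self.noise + (1.0 - a) * energy
        return active, mean_snr_db, weighted_energy
```

A caller would create one `SubbandVad` per stream and pass it consecutive frames, for example 160 samples (20 ms) at 8 kHz; `process` returns the activity decision together with the two averages it was based on. Returning a graded speech activity level instead of a yes/no decision would only require replacing the threshold comparison.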
117 Citations
18 Claims
1. A method for determining voice activity in an audio signal, the method comprising:
receiving a frame of an input audio signal, the input audio signal having a sample rate;
dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband;
filtering the lowest subband with a linear filter to reduce an energy of the lowest subband;
estimating a noise level for at least some of the plurality of subbands;
calculating a signal to noise ratio value for at least some of the plurality of subbands; and
determining a speech activity level based at least in part on an average of the calculated signal to noise ratio values and an average of an energy of at least some of the plurality of subbands,
wherein the method is performed with one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A voice activity detector, comprising:
an input interface that receives a frame of an input audio signal, the input audio signal having a sample rate;
one or more filterbanks that divide the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband;
a linear filter that filters the lowest subband to reduce an energy of the lowest subband;
a noise level estimator that estimates a noise level for at least some of the plurality of subbands;
a signal to noise ratio calculator for determining a signal to noise ratio value for at least some of the plurality of subbands; and
a speech activity level determinator that determines a speech activity level based on an average of the calculated signal to noise ratio values and an average of an energy of at least some of the plurality of subbands,
wherein the voice activity detector is implemented with one or more processors.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification