Voice activity detector
Abstract
The present invention is a device for and method of detecting voice activity by receiving a signal; computing the absolute value of the signal; squaring the absolute value; low pass filtering the squared result; computing the mean of the filtered signal; subtracting the mean from the filtered result; padding the mean-subtracted result with zeros so that its length is a power of two, if its length is not already a power of two; computing a Digital Fast Fourier Transform (DFFT) of the padded result; normalizing the DFFT result; computing a mean of the normalized result; computing a variance of the normalized result; computing a power ratio of the normalized result; and classifying the mean, variance, and power ratio as speech or non-speech based on how this feature vector compares to similarly constructed feature vectors of known speech and non-speech. The voice activity detector includes an absolute value squarer; a low pass filter; a mean subtractor; a zero padder; a Digital Fast Fourier Transformer; a normalizer; and a classifier.
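The processing chain summarized in the abstract can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the patented implementation: the function names, the moving-average low pass filter, the peak normalization, and the band used for the power ratio (`band_hz`) are all hypothetical choices, since the text above does not define them. The 60 Hz default cutoff follows the range recited in claim 3.

```python
import numpy as np


def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def envelope_features(signal, sample_rate, cutoff_hz=60.0, band_hz=(2.0, 10.0)):
    """Return (mean, variance, power_ratio) for one non-empty frame of samples."""
    x = np.asarray(signal, dtype=float)

    # Absolute value, then square.
    energy = np.abs(x) ** 2

    # Low pass filter. Assumption: a moving average whose first spectral
    # null falls near cutoff_hz; the source does not name a filter design.
    win = max(1, min(len(x), int(round(sample_rate / cutoff_hz))))
    envelope = np.convolve(energy, np.ones(win) / win, mode="same")

    # Subtract the mean of the filtered result.
    centered = envelope - envelope.mean()

    # Zero-pad to a power-of-two length (no-op if already a power of two).
    n_fft = next_power_of_two(len(centered))
    padded = np.zeros(n_fft)
    padded[: len(centered)] = centered

    # FFT magnitude, then normalize. Assumption: divide by the peak magnitude.
    spectrum = np.abs(np.fft.rfft(padded))
    peak = spectrum.max()
    normalized = spectrum / peak if peak > 0 else spectrum

    # Power ratio. Assumption: power inside band_hz divided by total power.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    power = normalized ** 2
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    total = power.sum()
    ratio = float(power[in_band].sum() / total) if total > 0 else 0.0

    return float(normalized.mean()), float(normalized.var()), ratio
```

For a 0.5 second frame sampled at 8 kHz (4000 samples), the frame is padded to 4096 samples before the transform. The three returned values form the feature vector that the classifier compares against known speech and non-speech examples.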
12 Claims
-
1. A voice activity detector, comprising:
-
a) an absolute value squarer, having an input for receiving a signal, and having an output;
b) a low pass filter, having an input connected to the output of said absolute value squarer, and having an output;
c) a mean subtractor, having an input connected to the output of said low pass filter, and having an output;
d) a zero padder, having an input connected to the output of said mean subtractor, and having an output;
e) a Digital Fast Fourier Transformer, having an input connected to the output of said zero padder, and having an output;
f) a normalizer, having an input connected to the output of said Digital Fast Fourier Transformer, and having an output; and
g) a classifier, having an input connected to the output of said normalizer, and having an output.
-
-
2. A voice activity detector, comprising:
-
a) an absolute value squarer, having an input for receiving a signal, and having an output;
b) a low pass filter, having an input connected to the output of said absolute value squarer, and having an output;
c) a threshold-crossing detector, having a user-definable threshold, having an input connected to the output of said low pass filter, having a first output, and having a second output;
d) a mean subtractor, having an input connected to the first output of said threshold-crossing detector, and having an output;
e) a zero padder, having an input connected to the output of said mean subtractor, and having an output;
f) a Digital Fast Fourier Transformer, having an input connected to the output of said zero padder, and having an output;
g) a normalizer, having an input connected to the output of said Digital Fast Fourier Transformer, and having an output;
h) a classifier, having an input connected to the output of said normalizer, and having an output; and
i) decision logic, having a first input connected to the second output of said threshold-crossing detector, having a second input connected to the output of said classifier, and having an output.
-
-
3. A method of detecting voice activity, comprising the steps of:
-
a) receiving a signal;
b) computing the absolute value of the signal;
c) squaring the result of the last step;
d) filtering the result of the last step to pass only low frequency components in the range of 0 to 60 Hz;
e) computing the mean of the last step;
f) subtracting the mean computed in the last step from the result of step (d);
g) padding the result of the last step with zeros so that its length becomes the next highest power of two, if its length is not already a power of two;
e) computing a Digital Fast Fourier Transform of the result of the last step;
f) normalizing the result of the last step;
g) computing a mean of the result of the last step;
h) computing a variance of the result of step (f);
i) computing a power ratio of the result of step (f);
j) classifying the results of step (g), step (h), and step (i) as a type of known speech and known non-speech to which the results of step (g), step (h), and step (i) most closely compares, where the known speech and the known non-speech are each identified by a mean, a variance and a power ratio.
a) retaining a number of consecutive 0.5 second frames; and
b) using the number of consecutive 0.5 second frames as votes to determine whether the 0.1 second interval common to the number of consecutive 0.5 second frames is speech or non-speech.
-
-
6. The method of claim 5, wherein said step of retaining a number of consecutive 0.5 second frames is comprised of the step of retaining five consecutive 0.5 second frames.
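The voting scheme in the preceding claims can be illustrated with a short sketch. It assumes frames advance in 0.1 second hops, so that five consecutive 0.5 second frames overlap in exactly one 0.1 second interval, and it assumes a simple majority rule; the claims say only that the frames are used as votes. The class name `FrameVoter` is hypothetical.

```python
from collections import deque


class FrameVoter:
    """Keep the most recent per-frame decisions and vote on their common interval."""

    def __init__(self, num_frames=5):
        # Oldest decisions fall off automatically once num_frames are stored.
        self.votes = deque(maxlen=num_frames)

    def push(self, is_speech):
        """Add one frame decision.

        Returns None until num_frames decisions are available, then True
        (speech) or False (non-speech) by majority vote. Assumption: a
        strict majority is required for a speech decision.
        """
        self.votes.append(bool(is_speech))
        if len(self.votes) < self.votes.maxlen:
            return None
        return 2 * sum(self.votes) > len(self.votes)
```

With five frames, at least three speech votes are needed before the shared 0.1 second interval is labeled speech, which suppresses isolated misclassifications of single frames.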
-
7. The method of claim 6, wherein said step of classifying the results of step (g), step (h), and step (i) is comprised of performing a Quadratic Discriminant Analysis.
-
8. The method of claim 7, further including counting the number of times the result of filtering crosses a user-definable threshold.
-
9. The method of claim 8, wherein said step of counting the number of threshold crossings is comprised of the step of counting the number of times the result of filtering crosses a user-definable threshold, where the threshold is defined as 0.25 times the mean of an AM envelope of the signal.
-
10. The method of claim 3, wherein said step of classifying the results of step (g), step (h), and step (i) is comprised of performing a Quadratic Discriminant Analysis.
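Quadratic Discriminant Analysis models each class with its own Gaussian mean and covariance and assigns a feature vector to the class with the highest log-posterior score. The following is a minimal NumPy sketch of that standard technique applied to a three-element (mean, variance, power ratio) feature vector. The function names are hypothetical, and the use of empirical class priors is an assumption; the claims do not describe how the classifier is trained.

```python
import numpy as np


def fit_qda(features, labels):
    """Estimate per-class mean, inverse covariance, log-determinant, and log-prior."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    model = {}
    for c in np.unique(y):
        Xc = X[y == c]
        cov = np.cov(Xc, rowvar=False)          # class-specific covariance
        _, logdet = np.linalg.slogdet(cov)
        log_prior = np.log(len(Xc) / len(X))    # assumption: empirical priors
        model[c.item()] = (Xc.mean(axis=0), np.linalg.inv(cov), logdet, log_prior)
    return model


def classify_qda(model, x):
    """Return the label whose quadratic discriminant score is largest."""
    x = np.asarray(x, dtype=float)

    def score(params):
        mean, inv_cov, logdet, log_prior = params
        d = x - mean
        return -0.5 * (logdet + d @ inv_cov @ d) + log_prior

    return max(model, key=lambda label: score(model[label]))
```

Each class needs more training vectors than feature dimensions so that its covariance matrix is invertible. Because each class keeps its own covariance, the decision boundary between speech and non-speech is quadratic rather than linear.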
-
11. The method of claim 3, further including counting the number of times the result of filtering crosses a user-definable threshold.
-
12. The method of claim 11, wherein said step of counting the number of threshold crossings is comprised of the step of counting the number of times the result of filtering crosses a user-definable threshold, where the threshold is defined as 0.25 times the mean of an AM envelope of the signal.
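As an illustration of the threshold-crossing count, the sketch below sets the threshold to 0.25 times the mean of the envelope and counts how often consecutive samples fall on opposite sides of it. It assumes the low pass filtered signal passed in serves as the AM envelope; the function name and the `scale` parameter are hypothetical.

```python
def count_threshold_crossings(envelope, scale=0.25):
    """Count crossings of the level scale * mean(envelope)."""
    values = list(envelope)
    if not values:
        return 0
    threshold = scale * sum(values) / len(values)
    # True where the envelope is above the threshold.
    above = [v > threshold for v in values]
    # A crossing occurs wherever neighboring samples disagree.
    return sum(1 for a, b in zip(above, above[1:]) if a != b)
```

For the envelope `[0, 4, 0, 4, 0, 4, 0, 4]` the mean is 2 and the threshold is 0.5, so every adjacent pair of samples straddles the threshold and the function counts 7 crossings.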