
Voice activity decision base on zero crossing rate and spectral sub-band energy

  • US 8,296,133 B2
  • Filed: 11/30/2011
  • Issued: 10/23/2012
  • Est. Priority Date: 10/15/2009
  • Status: Active Grant
First Claim

1. A voice activity detection method, comprising:

  • obtaining a time domain parameter and a frequency domain parameter from a current audio frame to be detected;

    obtaining a first distance between the time domain parameter and a long-term sliding mean of the time domain parameter in a history background noise frame;

    obtaining a second distance between the frequency domain parameter and a long-term sliding mean of the frequency domain parameter in the history background noise frame; and

    judging whether the current audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance, and a set of decision inequalities based on the first distance and the second distance,

    wherein at least one coefficient in the set of decision inequalities is a variable determined according to a voice activity detection operation mode or features of an input signal,

    wherein the frequency domain parameter indicates spectral sub-band energy, and wherein the second distance between the frequency domain parameter and the long-term sliding mean of the frequency domain parameter in the history background noise frame is a signal-to-noise ratio of the audio frame,

    wherein obtaining the signal-to-noise ratio of the audio frame comprises:

    obtaining a signal-to-noise ratio of each sub-band according to a ratio of the spectral sub-band energy to the long-term sliding mean of the spectral sub-band energy in the history background noise frame;

    performing linear processing or nonlinear processing on the signal-to-noise ratio of each sub-band; and

    summing the signal-to-noise ratio of each sub-band after the processing to obtain the signal-to-noise ratio of the audio frame, wherein performing the nonlinear processing on the signal-to-noise ratio of each sub-band comprises determining the signal-to-noise ratio of each sub-band after the nonlinear processing according to
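
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation; it only mirrors the claim's structure under several stated assumptions. The time domain parameter is taken to be the zero crossing rate (as the title suggests), the sub-band energies come from a naive DFT split into equal-width bands, the per-band signal-to-noise ratio is expressed in dB, and the long-term sliding mean is an exponential average with a hypothetical factor `alpha`. The nonlinear mapping is cut off in this listing, so the sketch uses the linear option (an identity `process` function). The decision inequalities, the thresholds, the mode names, and the class name `VadSketch` are all hypothetical.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def subband_energies(frame, n_bands=4):
    """Power spectrum from a naive DFT, summed over equal-width bands."""
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        power.append(re * re + im * im)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]


def frame_snr(energies, noise_means, process=lambda snr_db: snr_db, eps=1e-12):
    """Sum of processed per-band SNRs (dB scale is an assumption)."""
    total = 0.0
    for energy, mean in zip(energies, noise_means):
        snr_db = 10.0 * math.log10((energy + eps) / (mean + eps))
        total += process(snr_db)  # identity = linear processing
    return total


class VadSketch:
    """Hypothetical detector mirroring the claim's structure."""

    def __init__(self, n_bands=4, alpha=0.9, mode="normal"):
        self.n_bands = n_bands
        self.alpha = alpha  # assumed smoothing factor for the sliding means
        # Coefficient that varies with the operation mode (values are made up).
        self.snr_threshold = {"normal": 12.0, "aggressive": 20.0}[mode]
        self.zcr_mean = None
        self.energy_means = None

    def _update_noise(self, zcr, energies):
        """Update the sliding means using a frame judged to be noise."""
        if self.zcr_mean is None:
            self.zcr_mean = zcr
            self.energy_means = list(energies)
            return
        a = self.alpha
        self.zcr_mean = a * self.zcr_mean + (1 - a) * zcr
        self.energy_means = [a * m + (1 - a) * e
                             for m, e in zip(self.energy_means, energies)]

    def is_voice(self, frame):
        zcr = zero_crossing_rate(frame)
        energies = subband_energies(frame, self.n_bands)
        if self.zcr_mean is None:
            # Assumption: the very first frame is background noise.
            self._update_noise(zcr, energies)
            return False
        first_distance = abs(zcr - self.zcr_mean)
        second_distance = frame_snr(energies, self.energy_means)
        # Hypothetical decision inequalities over the two distances.
        voice = (second_distance > self.snr_threshold
                 or (first_distance > 0.2
                     and second_distance > 0.5 * self.snr_threshold))
        if not voice:
            self._update_noise(zcr, energies)
        return voice
```

A caller would create one `VadSketch` per audio stream and pass successive frames (lists of samples) to `is_voice`. Only frames judged to be noise update the sliding means, so the background estimate is not contaminated by speech.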
