Voice activity detector for audio signals

US 10,418,052 B2
Filed: 10/12/2017
Issued: 09/17/2019
Est. Priority Date: 02/26/2007
Status: Active Grant

Abstract

According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.
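The steps listed in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the Haar two-band split applied in sequence, the length-2 moving average, the minimum-following noise tracker, the decibel SNR, the uniform energy weights, the mapping from average SNR to an activity level, and every name and threshold (`SubbandVad`, `snr_full_db`, `energy_floor`, `make_frame`) are hypothetical choices made here because the abstract does not specify them.

```python
import math


def haar_split(x):
    """Two-band Haar filter bank: return half-rate (low, high) bands."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high


def split_subbands(frame, levels=3):
    """Apply the two-band split repeatedly to the low branch.

    Returns levels + 1 bands ordered from lowest to highest frequency.
    """
    bands = []
    low = list(frame)
    for _ in range(levels):
        low, high = haar_split(low)
        bands.insert(0, high)
    bands.insert(0, low)
    return bands


def moving_average(x, length=2):
    """Causal moving average; samples before the frame count as zero."""
    return [sum(x[max(0, i - length + 1):i + 1]) / length for i in range(len(x))]


def band_energy(x):
    """Mean-square energy, floored to keep the later logarithm finite."""
    return max(sum(v * v for v in x) / len(x), 1e-12)


def track_noise(noise, energy, rise=1.05):
    """Assumed tracker: follow drops at once, rise by at most 5% per frame."""
    return energy if energy < noise else min(energy, noise * rise)


class SubbandVad:
    def __init__(self, levels=3, snr_full_db=20.0, energy_floor=1e-6):
        self.levels = levels
        self.snr_full_db = snr_full_db    # average SNR mapped to level 1.0
        self.energy_floor = energy_floor  # below this, report no activity
        self.noise = None                 # per-subband noise estimates

    def process(self, frame):
        """Return a speech activity level in [0, 1] for one frame."""
        bands = split_subbands(frame, self.levels)
        bands[0] = moving_average(bands[0])  # reduce lowest-subband energy
        energies = [band_energy(b) for b in bands]
        if self.noise is None:
            self.noise = list(energies)  # assume the first frame is noise only
        else:
            self.noise = [track_noise(n, e) for n, e in zip(self.noise, energies)]
        snrs_db = [10.0 * math.log10(e / n) for e, n in zip(energies, self.noise)]
        avg_snr_db = sum(snrs_db) / len(snrs_db)
        weights = [1.0] * len(energies)  # hypothetical uniform weights
        avg_energy = sum(w * e for w, e in zip(weights, energies)) / sum(weights)
        if avg_energy < self.energy_floor:
            return 0.0
        return min(1.0, max(0.0, avg_snr_db / self.snr_full_db))


def make_frame(phase, tone_amplitude=0.0, n=64):
    """Synthetic frame: low-level background plus an optional louder tone."""
    return [0.01 * math.sin(1.7 * i + phase)
            + tone_amplitude * math.sin(2.0 * math.pi * 0.05 * i)
            for i in range(n)]


vad = SubbandVad()
quiet_levels = [vad.process(make_frame(phase=0.3 * k)) for k in range(5)]
loud_level = vad.process(make_frame(phase=1.5, tone_amplitude=0.5))
print(quiet_levels[-1], loud_level)
```

In this sketch the first frame seeds the noise estimates, so a real detector would need a more careful start-up rule. The synthetic tone stands in for speech only to show that a frame well above the tracked background produces a higher activity level than the background frames.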

4 Claims

1. A method for determining voice activity in an audio signal, the method comprising:
- receiving a frame of an input audio signal, the input audio signal having a sample rate;

  splitting the audio signal into a plurality of subbands by way of a sequence of filter banks, the plurality of subbands including at least a lowest subband and a highest subband;

  filtering the lowest subband with a linear filter to reduce an energy of the lowest subband;

  estimating a noise level for at least some of the plurality of subbands such that in each subband, a noise level estimator tracks the background noise level and a Signal-to-Noise Ratio (SNR) value;

  calculating a signal to noise ratio value for at least some of the plurality of subbands; and

  determining a speech activity level based at least in part on an average of the calculated signal to noise ratio values and an average of an energy of at least some of the plurality of subbands,

  wherein the method is performed with one or more computing devices.
- 2. The method of claim 1 further comprising smoothing the calculated signal to noise ratio values over time to create temporally smoothed subband signal to noise values.
- 3. The method of claim 1 further comprising determining a weighted average of the calculated signal to noise ratio values as a spectral tilt of the frame.
- 4. The method of claim 1, wherein the SNR value is computed as a logarithm of the ratio of energy-to-noise level.
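
The refinements in claims 2 to 4 can be illustrated with three small helpers. This is a hedged sketch only: the claims do not name a smoothing method, a logarithm base, or a weighting scheme, so the one-pole recursive average, the decibel scaling, the linearly rising default weights, and all function and parameter names below are assumptions introduced for illustration.

```python
import math


def log_snr_db(energy, noise_level):
    """Claim 4: SNR as a logarithm of the energy-to-noise ratio (dB assumed)."""
    return 10.0 * math.log10(energy / noise_level)


def smooth_snrs(previous, current, alpha=0.9):
    """Claim 2: temporal smoothing, here a one-pole recursive average.

    alpha close to 1 gives heavy smoothing; alpha is a hypothetical parameter.
    """
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous, current)]


def spectral_tilt(snrs, weights=None):
    """Claim 3: weighted average of subband SNRs, ordered low to high band."""
    if weights is None:
        # Hypothetical weights that rise linearly with the band index.
        weights = [float(i + 1) for i in range(len(snrs))]
    return sum(w * s for w, s in zip(weights, snrs)) / sum(weights)


# Example with made-up per-subband energies and noise levels.
energies = [4.0, 2.0, 1.0, 100.0]
noise = [1.0, 1.0, 1.0, 1.0]
snrs = [log_snr_db(e, n) for e, n in zip(energies, noise)]
smoothed = smooth_snrs([0.0] * 4, snrs, alpha=0.5)
tilt = spectral_tilt(smoothed)
print(snrs, smoothed, tilt)
```

With the rising weights assumed here, a larger tilt value means the SNR is concentrated in the upper subbands; a different weighting would change that interpretation.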

Current Assignee: Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee: Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors: Muesch, Hannes
Primary Examiner(s): Godbold, Douglas
Application Number: US15/730,908
Publication Number: US 20180033453A1
Time in Patent Office: 705 Days
CPC Class Codes

G10L 19/012   Comfort noise or silence co...

G10L 19/018   Audio watermarking, i.e. em...

G10L 2025/932   Decision in previous or fol...

G10L 2025/937   Signal energy in various fr...

G10L 21/02   Speech enhancement, e.g. no...

G10L 21/0364   for improving intelligibility

G10L 25/78   Detection of presence or ab...

G10L 25/93   Discriminating between voic...
