Voice activity detector for audio signals

US 10,586,557 B2
Filed: 07/19/2019
Issued: 03/10/2020
Est. Priority Date: 02/26/2007
Status: Active Grant

First Claim

1. A method for determining voice activity in an audio signal, the method comprising:

receiving a frame of an input audio signal, the input audio signal having a sample rate;

splitting the audio signal into a plurality of subbands, the plurality of subbands including at least a lowest subband and a highest subband;

filtering the lowest subband to reduce an energy of the lowest subband;

estimating a noise level for at least some of the plurality of subbands;

computing a signal-to-noise ratio for at least some of the plurality of subbands; and

determining a speech activity level based at least in part on the computed signal-to-noise ratios and an average of an energy of at least some of the plurality of subbands, wherein the method is performed in an audio encoder with one or more processors.

Abstract

According to one aspect, a method for determining voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate, and splitting the audio signal into a plurality of subbands, the plurality of subbands including at least a lowest subband and a highest subband. The method further comprises filtering the lowest subband to reduce an energy of the lowest subband, estimating a noise level for at least some of the plurality of subbands, and computing a signal-to-noise ratio for at least some of the plurality of subbands. The method also includes determining a speech activity level based at least in part on the computed signal-to-noise ratios and an average of an energy of at least some of the plurality of subbands.
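
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the equal-width DFT band split, the fixed `low_band_gain` standing in for the low-band filter, the `update_noise` tracker, the logistic mapping, and the `energy_floor_db` gate are all assumptions chosen for illustration, and every function and parameter name is hypothetical.

```python
import cmath
import math


def subband_energies(frame, num_bands):
    """Split one frame into equal-width DFT subbands; return each band's energy."""
    n = len(frame)
    half = n // 2
    # Naive DFT power for bins 0..half-1 (adequate for a short demo frame).
    power = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        power.append(abs(acc) ** 2 / n)
    width = half // num_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(num_bands)]


def update_noise(noise, energies, rise=1.05):
    """Assumed tracker: follow energy drops at once, let the estimate rise slowly."""
    return [min(e, nz * rise) for nz, e in zip(noise, energies)]


def speech_activity(frame, noise, num_bands=4, low_band_gain=0.25,
                    snr_midpoint_db=6.0, energy_floor_db=-40.0, eps=1e-12):
    """Return (activity level in [0, 1], updated per-band noise estimate)."""
    energies = subband_energies(frame, num_bands)
    # Stand-in for filtering the lowest subband: attenuate its energy.
    energies[0] *= low_band_gain
    noise = update_noise(noise, energies)
    # Per-band SNR in dB.
    snr_db = [10.0 * math.log10((e + eps) / (nz + eps))
              for e, nz in zip(energies, noise)]
    mean_snr_db = sum(snr_db) / num_bands
    # Average subband energy, used here only as a loudness gate (an assumption).
    mean_energy_db = 10.0 * math.log10(sum(energies) / num_bands + eps)
    if mean_energy_db < energy_floor_db:
        return 0.0, noise  # too quiet overall to count as speech
    level = 1.0 / (1.0 + math.exp(-(mean_snr_db - snr_midpoint_db) / 2.0))
    return level, noise


# Demo: a low-level background frame, then the same background plus a tone.
n = 64
quiet = [0.001 * math.sin(1.7 * t * t) for t in range(n)]
loud = [q + math.sin(2 * math.pi * 10 * t / n) for t, q in enumerate(quiet)]
noise = subband_energies(quiet, 4)  # seed the noise estimate from background
quiet_level, noise = speech_activity(quiet, noise)
loud_level, noise = speech_activity(loud, noise)
```

In this sketch the noise estimate is seeded from a background-only frame. The tracker multiplies the previous estimate, so an all-zero seed would never recover.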

115 Citations

6 Claims

1. A method for determining voice activity in an audio signal, the method comprising:
- receiving a frame of an input audio signal, the input audio signal having a sample rate;
- splitting the audio signal into a plurality of subbands, the plurality of subbands including at least a lowest subband and a highest subband;
- filtering the lowest subband to reduce an energy of the lowest subband;
- estimating a noise level for at least some of the plurality of subbands;
- computing a signal-to-noise ratio for at least some of the plurality of subbands; and
- determining a speech activity level based at least in part on the computed signal-to-noise ratios and an average of an energy of at least some of the plurality of subbands, wherein the method is performed in an audio encoder with one or more processors.

2. The method of claim 1 further comprising smoothing the computed signal-to-noise ratios over time to create temporally smoothed subband signal-to-noise ratios.

3. The method of claim 1 further comprising determining a weighted average of the computed signal-to-noise ratios as a spectral tilt of the frame.

4. The method of claim 1, wherein the signal-to-noise ratio is computed as a logarithm of a ratio of an energy-to-noise level.

5. An audio processing apparatus for decoding an encoded audio signal, wherein the audio processing apparatus comprises a demultiplexer for unpacking the encoded audio signal and an audio decoder for decoding the encoded audio signal, wherein the encoded audio signal was generated using at least in part the method of claim 1.

6. A non-transitory computer readable medium comprising instructions that when executed by a processor of an audio processing device cause the audio processing device to perform the method of claim 1.
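
The refinements in dependent claims 2 to 4 (temporal smoothing, a weighted SNR average used as spectral tilt, and a logarithmic SNR) can be sketched as three small helpers. The claims do not give a smoothing constant, weights, or a logarithm base, so the one-pole smoother with `alpha`, the caller-supplied positive `weights`, and the decibel scaling below are assumptions, and the function names are hypothetical.

```python
import math


def log_snr_db(energy, noise, eps=1e-12):
    """SNR as a logarithm of the energy-to-noise ratio (decibels assumed here)."""
    return 10.0 * math.log10((energy + eps) / (noise + eps))


def smooth_snrs(previous, current, alpha=0.9):
    """One-pole recursive smoothing of per-band SNRs across frames."""
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous, current)]


def weighted_snr_average(snrs, weights):
    """Weighted average of per-band SNRs, usable as a spectral-tilt measure.

    Assumes positive weights whose sum is non-zero.
    """
    return sum(w * s for w, s in zip(weights, snrs)) / sum(weights)
```

With weights that grow with band index, for example `[1, 2, 3, 4]`, a frame whose SNR is concentrated in the highest band yields a larger value than one whose SNR is concentrated in the lowest band, which is the sense in which the average tracks tilt in this sketch.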

Current Assignee: Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee: Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors: Muesch, Hannes
Primary Examiner(s): Godbold, Douglas
Application Number: US16/516,634
Publication Number: US 20190341069A1
Time in Patent Office: 235 Days
CPC Class Codes

G10L 19/012   Comfort noise or silence co...

G10L 19/018   Audio watermarking, i.e. em...

G10L 2025/932   Decision in previous or fol...

G10L 2025/937   Signal energy in various fr...

G10L 21/02   Speech enhancement, e.g. no...

G10L 21/0364   for improving intelligibility

G10L 25/78   Detection of presence or ab...

G10L 25/93   Discriminating between voic...
