Optimization of network microphone devices using noise classification

US 10,602,268 B1
Filed: 12/20/2018
Issued: 03/24/2020
Est. Priority Date: 12/20/2018
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Systems and methods for optimizing network microphone devices using noise classification are disclosed herein. In one example, individual microphones of a network microphone device (NMD) detect sound. The sound data is analyzed to detect a trigger event such as a wake word. Metadata associated with the sound data is captured in a lookback buffer of the NMD. After detecting the trigger event, the metadata is analyzed to classify noise in the sound data. Based on the classified noise, at least one performance parameter of the NMD is modified.
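The lookback-buffer step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `MetadataFrame` fields, the `LookbackBuffer` class, the thresholds, and the noise labels are all hypothetical names and values chosen for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MetadataFrame:
    """Hypothetical per-frame metadata (assumed fields, not from the patent)."""
    level_db: float        # coarse signal level
    low_band_ratio: float  # share of energy below an assumed cutoff, 0..1


class LookbackBuffer:
    """Fixed-size ring buffer that keeps only the most recent metadata frames."""

    def __init__(self, max_frames: int) -> None:
        self._frames = deque(maxlen=max_frames)

    def push(self, frame: MetadataFrame) -> None:
        # When the deque is full, the oldest frame is dropped automatically.
        self._frames.append(frame)

    def snapshot(self) -> list:
        return list(self._frames)


def classify_noise(frames: list) -> str:
    """Toy rule-based classifier; thresholds and labels are illustrative assumptions."""
    if not frames:
        return "unknown"
    mean_level = sum(f.level_db for f in frames) / len(frames)
    mean_low = sum(f.low_band_ratio for f in frames) / len(frames)
    if mean_level < -50.0:
        return "quiet"
    return "low_frequency_hum" if mean_low > 0.6 else "broadband"


def on_trigger(buffer: LookbackBuffer) -> str:
    """Called after a trigger event: classify the metadata captured before it."""
    return classify_noise(buffer.snapshot())
```

A bounded `deque` is used here because it retains a fixed window of recent frames without any explicit eviction logic, which matches the idea of looking back at what preceded the trigger.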

18 Claims

1. A method comprising:
- detecting sound via one or more microphones of a network microphone device (NMD);
- capturing sound data in a first buffer of the NMD based on the detected sound;
- analyzing, via the NMD, the sound data to detect a trigger event;
- transmitting, via the NMD, the sound data to a first one or more remote computing devices associated with a voice assistant service (VAS);
- capturing metadata associated with the sound data in at least a second buffer of the NMD, wherein the sound data is not derivable from the metadata;
- transmitting, via the NMD, the metadata absent the sound data to a second one or more remote computing devices associated with a remote evaluator, the remote evaluator being distinct from the VAS;
- after detecting the trigger event, analyzing the metadata to classify noise in the sound data; and
- based on the classified noise, modifying at least one performance parameter of the NMD.

2. The method of claim 1, wherein analyzing the metadata to classify noise in the sound data comprises comparing the metadata to reference metadata associated with known noise events.

3. The method of claim 2, wherein the metadata comprises a frequency response spectrum, and wherein comparing the metadata to reference metadata comprises projecting the frequency response spectrum onto an eigenspace corresponding to aggregated frequency response spectra from a population of NMDs.

4. The method of claim 1, wherein modifying the at least one performance parameter of the NMD comprises at least one of:
- adjusting a wake-word-detection sensitivity parameter of the NMD;
- adjusting a playback volume of a playback device associated with the NMD; or
- modifying a noise-cancellation algorithm of the NMD.

5. The method of claim 1, further comprising transmitting, via the NMD, data corresponding to the classified noise to one or more remote computing devices over a wide area network.

6. The method of claim 1, wherein the metadata comprises at least one of:
- microphone frequency response data;
- microphone spectral data;
- acoustic echo cancellation (AEC) data;
- echo return loss enhancement (ERLE) data;
- arbitration data;
- signal level data; or
- direction detection data.
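The comparison described in claims 2 and 3 (projecting a frequency response spectrum onto an eigenspace built from a population of spectra, then comparing against reference metadata) can be illustrated with principal component analysis. The sketch below is one plausible reading, not the method the patent specifies: the function names, the use of SVD, and the nearest-neighbor comparison are assumptions.

```python
import numpy as np


def fit_eigenspace(population_spectra: np.ndarray, n_components: int):
    """Return (mean, basis) for spectra stacked as rows.

    The rows of `basis` are the top principal directions of the population.
    """
    mean = population_spectra.mean(axis=0)
    centered = population_spectra - mean
    # Rows of vt are orthonormal eigenvectors of the sample covariance,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(spectrum: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Coordinates of one spectrum in the eigenspace."""
    return basis @ (spectrum - mean)


def classify(spectrum, mean, basis, references: dict) -> str:
    """Pick the reference label whose projection is nearest (assumed comparison rule)."""
    coords = project(spectrum, mean, basis)
    return min(
        references,
        key=lambda label: np.linalg.norm(coords - project(references[label], mean, basis)),
    )
```

Working in the reduced eigenspace means each comparison involves only a handful of coordinates rather than the full spectrum, which is one reason such a projection might be preferred on a constrained device.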

7. A network microphone device (NMD), comprising:
- one or more processors;
- one or more microphones; and
- a tangible, non-transitory, computer-readable medium storing instructions executable by the one or more processors to cause the NMD to perform operations comprising:
  - detecting sound via the one or more microphones;
  - capturing sound data in a first buffer of the NMD based on the detected sound;
  - analyzing, via the NMD, the sound data to detect a trigger event;
  - transmitting, via the NMD, the sound data to a first one or more remote computing devices associated with a voice assistant service (VAS);
  - capturing metadata associated with the sound data in at least a second buffer of the NMD, wherein the sound data is not derivable from the metadata;
  - transmitting, via the NMD, the metadata absent the sound data to a second one or more remote computing devices associated with a remote evaluator, the remote evaluator being distinct from the VAS;
  - after detecting the trigger event, analyzing the metadata to classify noise in the sound data; and
  - based on the classified noise, modifying at least one performance parameter of the NMD.

8. The NMD of claim 7, wherein analyzing the metadata to classify noise in the sound data comprises comparing the metadata to reference metadata associated with known noise events.

9. The NMD of claim 8, wherein the metadata comprises a frequency response spectrum, and wherein comparing the metadata to reference metadata comprises projecting the frequency response spectrum onto an eigenspace corresponding to aggregated frequency response spectra from a population of NMDs.

10. The NMD of claim 7, wherein modifying the at least one performance parameter of the NMD comprises at least one of:
- adjusting a wake-word-detection sensitivity parameter of the NMD;
- adjusting a playback volume of a playback device associated with the NMD; or
- modifying a noise-cancellation algorithm of the NMD.

11. The NMD of claim 7, wherein the operations further comprise transmitting, via the NMD, data corresponding to the classified noise to one or more remote computing devices over a wide area network.

12. The NMD of claim 7, wherein the metadata comprises at least one of:
- microphone frequency response data;
- microphone spectral data;
- acoustic echo cancellation (AEC) data;
- echo return loss enhancement (ERLE) data;
- arbitration data;
- signal level data; or
- direction detection data.
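The independent claims keep sound data and metadata in separate buffers and send them to separate destinations, with the metadata carrying no recoverable audio. One way to picture that separation is sketched below. It is an assumption-laden illustration: the band-averaging scheme, the payload dictionaries, the destination labels, and the choice to send audio only when `triggered` is true are all hypothetical, and no real network transport is shown.

```python
import numpy as np


def summarize_frames(frames: np.ndarray, n_bands: int = 8) -> list:
    """Reduce audio frames (n_frames x frame_len) to a few averaged band levels in dB.

    Phase is discarded and power is averaged over both time and coarse frequency
    bands, so the summary is not meant to allow the waveform to be recovered.
    """
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # per-frame power spectrum
    mean_power = power.mean(axis=0)                   # average over time
    bands = np.array_split(mean_power, n_bands)       # coarse frequency bands
    return [float(10.0 * np.log10(band.mean() + 1e-12)) for band in bands]


def build_payloads(frames: np.ndarray, triggered: bool):
    """Build one payload per destination; the metadata payload never holds audio."""
    metadata_payload = {
        "destination": "remote_evaluator",  # hypothetical label
        "band_levels_db": summarize_frames(frames),
    }
    vas_payload = None
    if triggered:
        vas_payload = {
            "destination": "vas",  # hypothetical label
            "audio": frames.astype(np.float32).tobytes(),
        }
    return vas_payload, metadata_payload
```

Building the two payloads in one place makes the separation easy to audit: the only field that ever contains samples is `audio`, and it exists only in the payload bound for the voice assistant service.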

13. Tangible, non-transitory, computer-readable medium storing instructions executable by one or more processors to cause a network microphone device (NMD) to perform operations comprising:
- detecting sound via one or more microphones of the NMD;
- capturing sound data in a first buffer of the NMD based on the detected sound;
- analyzing, via the NMD, the sound data to detect a trigger event;
- transmitting, via the NMD, the sound data to a first one or more remote computing devices associated with a voice assistant service (VAS);
- capturing metadata associated with the sound data in at least a second buffer of the NMD, wherein the sound data is not derivable from the metadata;
- transmitting, via the NMD, the metadata absent the sound data to a second one or more remote computing devices associated with a remote evaluator, the remote evaluator being distinct from the VAS;
- after detecting the trigger event, analyzing the metadata to classify noise in the sound data; and
- based on the classified noise, modifying at least one performance parameter of the NMD.

14. The tangible, non-transitory, computer-readable medium of claim 13, wherein analyzing the metadata to classify noise in the sound data comprises comparing the metadata to reference metadata associated with known noise events.

15. The tangible, non-transitory, computer-readable medium of claim 14, wherein the metadata comprises a frequency response spectrum, and wherein comparing the metadata to reference metadata comprises projecting the frequency response spectrum onto an eigenspace corresponding to aggregated frequency response spectra from a population of NMDs.

16. The tangible, non-transitory, computer-readable medium of claim 13, wherein modifying the at least one performance parameter of the NMD comprises at least one of:
- adjusting a wake-word-detection sensitivity parameter of the NMD;
- adjusting a playback volume of a playback device associated with the NMD; or
- modifying a noise-cancellation algorithm of the NMD.

17. The tangible, non-transitory, computer-readable medium of claim 14, wherein the operations further comprise transmitting, via the NMD, data corresponding to the classified noise to one or more remote computing devices over a wide area network.

18. The tangible, non-transitory, computer-readable medium of claim 13, wherein the metadata comprises at least one of:
- microphone frequency response data;
- microphone spectral data;
- acoustic echo cancellation (AEC) data;
- echo return loss enhancement (ERLE) data;
- arbitration data;
- signal level data; or
- direction detection data.
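Claims 4, 10, and 16 name three kinds of performance parameter that may change once noise is classified: wake-word-detection sensitivity, playback volume, and the noise-cancellation algorithm. A table-driven policy is one simple way to express that mapping. Everything specific below is hypothetical: the noise labels, the parameter scales, the profile name, and the direction and size of each adjustment are assumptions for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NmdParams:
    wake_word_sensitivity: float  # assumed 0..1 scale
    playback_volume: int          # assumed 0..100 scale
    noise_cancellation: str       # name of an assumed algorithm profile


# Hypothetical policy: noise label -> parameter changes.
# Numbers are relative adjustments; strings replace the current value.
POLICY = {
    "fan": {"noise_cancellation": "stationary_noise_profile"},
    "running_water": {"wake_word_sensitivity": 0.1},
    "own_playback": {"playback_volume": -10},
}


def apply_policy(params: NmdParams, noise_label: str) -> NmdParams:
    """Return a new parameter set adjusted for the classified noise."""
    updated = {}
    for name, change in POLICY.get(noise_label, {}).items():
        current = getattr(params, name)
        if isinstance(change, str):
            updated[name] = change
        elif name == "playback_volume":
            updated[name] = max(0, min(100, current + change))
        else:
            updated[name] = max(0.0, min(1.0, current + change))
    # Unknown labels produce no changes, so the original values are kept.
    return replace(params, **updated)
```

For example, `apply_policy(NmdParams(0.5, 40, "default"), "own_playback")` lowers the playback volume to 30 and leaves the other two parameters unchanged.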

Current Assignee: Sonos, Inc.
Original Assignee: Sonos, Inc.
Inventors: Soto, Kurt Thomas
Primary Examiner(s): Mooney, James K
Application Number: US16/227,308
Time in Patent Office: 460 Days
Field of Search: None
CPC Class Codes

G10K 11/178   by electro-acoustically reg...

G10L 15/08   Speech classification or se...

G10L 2015/088   Word spotting

G10L 2021/02166   Microphone arrays; Beamforming

G10L 21/0208   Noise filtering

G10L 25/18   the extracted parameters be...

G10L 25/27   characterised by the analys...

G10L 25/51   for comparison or discrimin...

G10L 25/84   for discriminating voice fr...

H04R 1/406   microphones

H04R 2227/001   Adaptation of signal proces...

H04R 2227/005   Audio distribution systems ...

H04R 27/00   Public address systems circ...

H04R 29/005   Microphone arrays

H04R 3/005   for combining the signals o...
