Playback device supporting concurrent voice assistant services

US 10,565,999 B2
Filed: 06/11/2019
Issued: 02/18/2020
Est. Priority Date: 08/05/2016
Status: Active Grant

Abstract

Disclosed herein are example techniques to support multiple voice assistant services. An example implementation may involve a playback device continuously capturing, via at least one microphone, audio into one or more buffers and analyzing the captured audio using a first wake-word detection algorithm and a second wake-word detection algorithm. When either the first wake-word detection algorithm or the second wake-word detection algorithm detects, in the captured audio, a wake word corresponding to a particular voice assistant service of (a) a first voice assistant service or (b) a second voice assistant service, the playback device transmits the captured audio to one or more servers associated with the particular voice assistant service. After transmitting the captured audio, the playback device receives, via a network interface, at least one instruction based on the captured audio and performs one or more actions based on the at least one instruction.
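
The routing step described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: `make_detector` matches text tokens rather than scoring real audio, and `fake_send` stands in for the network transmission to a service's servers. The sketch only shows the idea of running two detectors over the same captured audio and forwarding it to whichever service's wake word was found.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Detector = Callable[[Sequence[str]], bool]

def make_detector(wake_word: str) -> Detector:
    """Hypothetical detector: looks for a token instead of scoring real audio."""
    def detect(frames: Sequence[str]) -> bool:
        return wake_word in frames
    return detect

def route_captured_audio(
    frames: Sequence[str],
    detectors: Dict[str, Detector],
    send: Callable[[str, List[str]], str],
) -> Optional[str]:
    """Run each detector over the same captured audio; forward on the first match."""
    for service, detect in detectors.items():
        if detect(frames):
            # Only the service whose wake word was detected receives the audio.
            return send(service, list(frames))
    return None  # no wake word detected: nothing is transmitted

sent: List[Tuple[str, List[str]]] = []

def fake_send(service: str, frames: List[str]) -> str:
    """Stand-in for the network round trip; records what would be transmitted."""
    sent.append((service, frames))
    return f"instruction-from-{service}"

detectors = {
    "service_a": make_detector("wake_a"),  # assumed wake words, for illustration
    "service_b": make_detector("wake_b"),
}
result_b = route_captured_audio(["wake_b", "play", "jazz"], detectors, fake_send)
result_none = route_captured_audio(["play", "jazz"], detectors, fake_send)
```

In this sketch the second call returns `None` and records nothing, since neither assumed wake word appears in the captured frames.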

20 Claims

1. A playback device comprising:
- one or more amplifiers configured to drive one or more speakers;
  
  at least one microphone;
  
  a network interface;
  
  one or more processors; and
  
  data storage having stored therein instructions executable by the one or more processors to cause the playback device to perform a method comprising:
  
  continuously capturing, via the at least one microphone, audio into one or more buffers;
  
  analyzing the captured audio using a first wake-word detection algorithm and a second wake-word detection algorithm, wherein the first wake-word detection algorithm corresponds to a first voice assistant service associated with a first wake word, and wherein the second wake-word detection algorithm corresponds to a second voice assistant service associated with a second wake word;
  
  when one of the first wake-word detection algorithm and the second wake-word detection algorithm detects, in the captured audio, a wake word corresponding to a particular voice assistant service of (a) the first voice assistant service or (b) the second voice assistant service, transmitting the captured audio to one or more servers associated with the particular voice assistant service for processing voice input in the captured audio;
  
  after transmitting the captured audio, receiving, via the network interface, at least one instruction, wherein the at least one instruction is based on the voice input in the captured audio; and
  
  outputting audio based on the at least one instruction via the one or more amplifiers configured to drive the one or more speakers.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The playback device of claim 1, wherein the at least one instruction includes an instruction to play back at least one audio track, and wherein outputting the audio comprises playing back the at least one audio track via the one or more amplifiers configured to drive the one or more speakers.
  - 3. The playback device of claim 2, wherein transmitting the captured audio to one or more servers associated with the particular voice assistant service comprises transmitting the captured audio to the first voice assistant service based on detection of the first wake word in the captured audio, and wherein the method further comprises:
    - further capturing, via the at least one microphone, audio into the one or more buffers;
      
      analyzing the further captured audio using the first wake-word detection algorithm and the second wake-word detection algorithm;
      
      detecting, in the further captured audio, the second wake word via the second wake-word detection algorithm;
      
      after detecting the second wake word, transmitting the further captured audio to one or more servers associated with the second voice assistant service;
      
      after transmitting the further captured audio, receiving from the second voice assistant service, via the network interface, at least one instruction based on the further captured audio; and
      
      after receiving the at least one instruction from the second voice assistant service, performing one or more actions based on the at least one instruction from the second voice assistant service.
  - 4. The playback device of claim 3, wherein performing the one or more actions includes modifying at least one playback setting of a media playback system, and wherein the media playback system comprises the playback device.
  - 5. The playback device of claim 3, wherein the further captured audio comprises a query, and wherein performing the one or more actions includes playing back audio corresponding to results of the query via the one or more amplifiers configured to drive the one or more speakers.
  - 6. The playback device of claim 1, wherein the method further comprises:
    - detecting the second wake word in the captured audio via the second wake-word detection algorithm;
      
      after detecting the second wake word, determining that the second voice assistant service is unavailable to process the captured audio; and
      
      in response to determining that the second voice assistant service is unavailable to process the captured audio, transmitting at least a portion of the captured audio to one or more remote servers associated with the first voice assistant service.
  - 7. The playback device of claim 6, wherein the method further comprises:
    - assigning the first voice assistant service as a default voice assistant service;
      
      further capturing, via the at least one microphone, audio into the one or more buffers;
      
      analyzing the further captured audio using the first wake-word detection algorithm and the second wake-word detection algorithm;
      
      detecting, in the further captured audio, the first wake word via the first wake-word detection algorithm;
      
      determining that the default voice assistant service is unavailable to process the further captured audio; and
      
      in response to determining that the default voice assistant service is unavailable to process the captured audio, foregoing transmission of the further captured audio to any voice assistant service.
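
The availability handling in claims 6 and 7 can be read as a small decision rule. The sketch below is one interpretation, not the patented implementation: the function name `choose_service`, the service labels, and the `available` set are all hypothetical, and it assumes the fallback target in claim 6 is the same service that claim 7 assigns as the default.

```python
from typing import Optional, Set

def choose_service(detected: str, default: str, available: Set[str]) -> Optional[str]:
    """Return the service that should receive the captured audio, or None to send nothing."""
    if detected in available:
        return detected
    # Claim 6 pattern: the detected (non-default) service is unavailable,
    # so the audio goes to the other service instead.
    if detected != default and default in available:
        return default
    # Claim 7 pattern: the default service was invoked but is unavailable,
    # so transmission is skipped entirely.
    return None

fallback = choose_service("second", default="first", available={"first"})
skipped = choose_service("first", default="first", available={"second"})
normal = choose_service("second", default="first", available={"first", "second"})
```

Returning `None` when neither service can take the audio is an added assumption; the claims only state the two cases commented above.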

8. A method to be performed by a playback device comprising a network interface, at least one microphone, and one or more amplifiers configured to drive one or more speakers, the method comprising:
- continuously capturing, via the at least one microphone, audio into one or more buffers;
  
  analyzing the captured audio using a first wake-word detection algorithm and a second wake-word detection algorithm, wherein the first wake-word detection algorithm corresponds to a first voice assistant service associated with a first wake word and wherein the second wake-word detection algorithm corresponds to a second voice assistant service associated with a second wake word;
  
  when one of the first wake-word detection algorithm and the second wake-word detection algorithm detects, in the captured audio, a wake word corresponding to a particular voice assistant service of (a) the first voice assistant service or (b) the second voice assistant service, transmitting the captured audio to one or more servers associated with the particular voice assistant service for processing voice input in the captured audio;
  
  after transmitting the captured audio, receiving, via the network interface, at least one instruction, wherein the at least one instruction is based on the voice input in the captured audio; and
  
  outputting audio based on the at least one instruction via the one or more amplifiers configured to drive the one or more speakers.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The method of claim 8, wherein the at least one instruction includes an instruction to play back at least one audio track, and wherein outputting the audio comprises playing back the at least one audio track via the one or more amplifiers configured to drive the one or more speakers.
  - 10. The method of claim 9, wherein transmitting the captured audio to one or more servers associated with the particular voice assistant service comprises transmitting the captured audio to the first voice assistant service based on detection of the first wake word in the captured audio, and wherein the method further comprises:
    - further capturing, via the at least one microphone, audio into the one or more buffers;
      
      analyzing the further captured audio using the first wake-word detection algorithm and the second wake-word detection algorithm;
      
      detecting, in the further captured audio, the second wake word via the second wake-word detection algorithm;
      
      after detecting the second wake word, transmitting the further captured audio to one or more servers associated with the second voice assistant service;
      
      after transmitting the further captured audio, receiving from the second voice assistant service, via the network interface, at least one instruction based on the further captured audio; and
      
      after receiving the at least one instruction from the second voice assistant service, performing one or more actions based on the at least one instruction from the second voice assistant service.
  - 11. The method of claim 10, wherein performing the one or more actions includes modifying at least one playback setting of a media playback system, and wherein the media playback system comprises the playback device.
  - 12. The method of claim 10, wherein the further captured audio comprises a query, and wherein performing the one or more actions includes playing back audio corresponding to results of the query via the one or more amplifiers configured to drive the one or more speakers.
  - 13. The method of claim 8, further comprising:
    - detecting the second wake word in the captured audio via the second wake-word detection algorithm;
      
      after detecting the second wake word, determining that the second voice assistant service is unavailable to process the captured audio; and
      
      in response to determining that the second voice assistant service is unavailable to process the captured audio, transmitting at least a portion of the captured audio to one or more remote servers associated with the first voice assistant service.
  - 14. The method of claim 13, further comprising:
    - assigning the first voice assistant service as a default voice assistant service;
      
      further capturing, via the at least one microphone, audio into the one or more buffers;
      
      analyzing the further captured audio using the first wake-word detection algorithm and the second wake-word detection algorithm;
      
      detecting, in the further captured audio, the first wake word via the first wake-word detection algorithm;
      
      determining that the default voice assistant service is unavailable to process the further captured audio; and
      
      in response to determining that the default voice assistant service is unavailable to process the captured audio, foregoing transmission of the further captured audio to any voice assistant service.
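
The method claims recite "continuously capturing ... audio into one or more buffers" without saying how the buffers are organized. One common way to hold a rolling window of recent audio is a fixed-size ring buffer; the sketch below assumes that design. The class name `CaptureBuffer` and the frame size are illustrative, not from the patent.

```python
from collections import deque

class CaptureBuffer:
    """Fixed-size ring buffer of audio frames (an assumed design, not from the claims)."""

    def __init__(self, max_frames: int) -> None:
        # With maxlen set, appending to a full deque discards the oldest frame.
        self._frames = deque(maxlen=max_frames)

    def capture(self, frame: bytes) -> None:
        """Store the newest frame from the microphone."""
        self._frames.append(frame)

    def snapshot(self) -> bytes:
        """Return the buffered audio, oldest frame first, for analysis or transmission."""
        return b"".join(self._frames)

buf = CaptureBuffer(max_frames=3)
for frame in (b"a", b"b", b"c", b"d"):
    buf.capture(frame)
window = buf.snapshot()
```

Here the buffer holds three frames, so after four captures the first frame has been dropped and `window` contains the last three.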

15. A non-transitory computer-readable medium having instructions stored thereon that are executable by one or more processors to cause a playback device to perform a method, the playback device comprising a network interface, at least one microphone, and one or more amplifiers configured to drive one or more speakers, the method comprising:
- continuously capturing, via the at least one microphone, audio into one or more buffers;
  
  analyzing the captured audio using a first wake-word detection algorithm and a second wake-word detection algorithm, wherein the first wake-word detection algorithm corresponds to a first voice assistant service associated with a first wake word, and wherein the second wake-word detection algorithm corresponds to a second voice assistant service associated with a second wake word;
  
  when one of the first wake-word detection algorithm and the second wake-word detection algorithm detects, in the captured audio, a wake word corresponding to a particular voice assistant service of (a) the first voice assistant service or (b) the second voice assistant service, transmitting the captured audio to one or more servers associated with the particular voice assistant service for processing voice input in the captured audio;
  
  after transmitting the captured audio, receiving, via the network interface, at least one instruction, wherein the at least one instruction is based on the voice input in the captured audio; and
  
  outputting audio based on the at least one instruction via the one or more amplifiers configured to drive the one or more speakers.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The non-transitory computer-readable medium of claim 15, wherein the at least one instruction includes an instruction to play back at least one audio track, and wherein outputting the audio comprises playing back the at least one audio track via the one or more amplifiers configured to drive the one or more speakers.
  - 17. The non-transitory computer-readable medium of claim 16, wherein transmitting the captured audio to one or more servers associated with the particular voice assistant service comprises transmitting the captured audio to the first voice assistant service based on detection of the first wake word in the captured audio, and wherein the method further comprises:
    - further capturing, via the at least one microphone, audio into the one or more buffers;
      
      analyzing the further captured audio using the first wake-word detection algorithm and the second wake-word detection algorithm;
      
      detecting, in the further captured audio, the second wake word via the second wake-word detection algorithm;
      
      after detecting the second wake word, transmitting the further captured audio to one or more servers associated with the second voice assistant service;
      
      after transmitting the further captured audio, receiving from the second voice assistant service, via the network interface, at least one instruction based on the further captured audio; and
      
      after receiving the at least one instruction from the second voice assistant service, performing one or more actions based on the at least one instruction from the second voice assistant service.
  - 18. The non-transitory computer-readable medium of claim 17, wherein performing the one or more actions includes modifying at least one playback setting of a media playback system, and wherein the media playback system comprises the playback device.
  - 19. The non-transitory computer-readable medium of claim 17, wherein the further captured audio comprises a query, and wherein performing the one or more actions includes playing back audio corresponding to results of the query via the one or more amplifiers configured to drive the one or more speakers.
  - 20. The non-transitory computer-readable medium of claim 15, wherein the method further comprises:
    - detecting the second wake word in the captured audio via the second wake-word detection algorithm;
      
      after detecting the second wake word, determining that the second voice assistant service is unavailable to process the captured audio; and
      
      in response to determining that the second voice assistant service is unavailable to process the captured audio, transmitting at least a portion of the captured audio to one or more remote servers associated with the first voice assistant service.
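
The dependent claims name three kinds of action that a received instruction may trigger: playing an audio track, modifying a playback setting, and playing back audio for the results of a query. The sketch below shows one way a device might dispatch on such instructions. The instruction format (a dict with a `type` key), the `Player` class, and the use of volume as the playback setting are all assumptions made for illustration; the patent does not define an instruction schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Player:
    """Toy playback state; 'volume' stands in for any playback setting."""
    volume: int = 30
    log: List[str] = field(default_factory=list)

def apply_instruction(player: Player, instruction: Dict[str, object]) -> None:
    """Perform the action named by a (hypothetical) instruction dict."""
    kind = instruction["type"]
    if kind == "play_track":
        player.log.append(f"playing {instruction['track']}")
    elif kind == "set_volume":
        player.volume = int(instruction["level"])
    elif kind == "speak_result":
        player.log.append(f"speaking {instruction['text']}")
    else:
        raise ValueError(f"unknown instruction type: {kind}")

player = Player()
for instruction in (
    {"type": "play_track", "track": "track-1"},
    {"type": "set_volume", "level": 45},
    {"type": "speak_result", "text": "72 degrees"},
):
    apply_instruction(player, instruction)
```

Rejecting unknown instruction types with an error is a design choice of this sketch; a real device might ignore them or report them back to the service.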

Current Assignee: Sonos, Inc.
Original Assignee: Sonos, Inc.
Inventors: Wilberding, Dayn
Primary Examiner(s): Leland, III, Edwin S
Application Number: US16/437,476
Publication Number: US 20190295556A1
Time in Patent Office: 252 Days
Field of Search: 704275
US Class Current:
CPC Class Codes:
G06F 3/167   Audio in a user interface, ...
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
G10L 17/02   Preprocessing operations, e...
G10L 17/22   Interactive procedures; Man...
G10L 2015/088   Word spotting
G10L 2015/223   Execution procedure of a sp...
H05B 47/165   following a pre-assigned pr...