Attribute-based audio channel arbitration

US 10,055,190 B2
Filed: 12/16/2013
Issued: 08/21/2018
Est. Priority Date: 12/16/2013
Status: Active Grant

Abstract

A speech-based system includes a local device in a user premises and a remote service that uses the local device to conduct speech dialogs with a user. The local device may also be directed to play audio such as music, audio books, etc. When designating audio for playing by the local device, the remote service may specify that the audio is either background audio or foreground audio. For background audio, the service indicates whether the background audio is mixable. For foreground audio, the service indicates an interrupt behavior. When the local device is playing background audio and receives foreground audio, the background audio is paused, attenuated, or not changed based on the indicated interrupt behavior of the foreground audio and whether the background audio has been designated as being mixable.
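
The arbitration rule described above can be sketched as a small decision function. This is only an illustrative sketch, not the patented implementation: the names `Interrupt`, `Action`, and `arbitrate` are hypothetical, and the `NONE` interrupt value is an assumption introduced here to model the "not changed" outcome mentioned in the abstract.

```python
from enum import Enum


class Interrupt(Enum):
    """Interrupt behavior attached to foreground audio (names are assumed)."""
    ATTENUATE = "attenuate"
    PAUSE = "pause"
    NONE = "none"  # assumed value modelling the "not changed" outcome


class Action(Enum):
    """What the local device does to the background audio."""
    ATTENUATE = "attenuate"
    PAUSE = "pause"
    UNCHANGED = "unchanged"


def arbitrate(mixable: bool, interrupt: Interrupt) -> Action:
    """Pick the treatment of background audio when foreground audio arrives."""
    # Non-mixable background audio is paused regardless of the interrupt
    # behavior, and a "pause" interrupt pauses even mixable audio.
    if not mixable or interrupt is Interrupt.PAUSE:
        return Action.PAUSE
    # Mixable background audio keeps playing at a lower volume.
    if interrupt is Interrupt.ATTENUATE:
        return Action.ATTENUATE
    return Action.UNCHANGED
```

For example, `arbitrate(True, Interrupt.ATTENUATE)` yields `Action.ATTENUATE`, while `arbitrate(False, Interrupt.ATTENUATE)` yields `Action.PAUSE`, which mirrors the two final limitations of claim 1.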

18 Claims

1. A device, comprising:
- one or more processors;
  
  a speaker;
  
  a microphone;
  
  a network communications interface configured to communicate with a remote, network-based speech command service; and
  
  non-transitory computer-readable media storing computer executable instructions that, when executed by the one or more processors, cause the one or more processors to perform acts comprising:
  
  receiving a first command from the network-based speech command service to play audio content;
  
  receiving, from the network-based speech command service, a mixing attribute specified by the network-based speech command service and indicating whether to mix the audio content with additional audio content, the mixing attribute comprising:
  
  first data specifying that the audio content is not to be played by the speaker at a same time as the additional audio content; or
  
  second data specifying that the audio content is to be played by the speaker at the same time as the additional audio content;
  
  playing the audio content on the speaker at a first volume;
  
  receiving a second command from the network-based speech command service to play a speech message, wherein the speech message corresponds to a type of the additional audio content;
  
  receiving an interrupt attribute from the network-based speech command service, wherein the interrupt attribute corresponds to the speech message, wherein the interrupt attribute is specified by the network-based speech command service and specifies whether the audio content is to be attenuated or paused while the speech message is played, the interrupt attribute comprising:
  
  third data specifying that the audio content is to be attenuated while the speech message is played; or
  
  fourth data specifying that the audio content is to be paused while the speech message is played;
  
  playing the speech message on the speaker;
  
  based at least in part on the mixing attribute comprising the second data and the interrupt attribute comprising the third data, lowering playback volume of the audio content to a second volume during playing of the speech message; and
  
  based at least in part on one or more of the mixing attribute comprising the first data or the interrupt attribute comprising the fourth data, pausing the playing of the audio content during playing of the speech message.
- Dependent claims (2, 3, 4, 5)
  - 2. The device of claim 1, further comprising resuming playing the audio content at the first volume after playing the speech message.
  - 3. The device of claim 1, wherein the device obtains the audio content from a remote source different than the network-based speech command service.
  - 4. The device of claim 1, wherein the second command to play the speech message was received in response to an utterance of a user, wherein the utterance of the user was transmitted to the network-based speech command service.
  - 5. The device of claim 1, wherein:
    - the first data comprises one or more first words which specify that the audio content is to be paused while the speaker plays the additional audio content; and
      
      the second data comprises one or more second words which specify that the audio content is to be played by the speaker at the same time as the additional audio content.

6. One or more non-transitory computer-readable media storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising:
- receiving a command from a network-based speech command service that specifies audio content;
  
  in response to receiving the command, playing the audio content at a first volume;
  
  receiving a speech message from the network-based speech command service;
  
  receiving, from the network-based speech command service, a first attribute associated with the audio content, specified by the network-based speech command service, and indicating whether to mix the audio content with the speech message, the first attribute comprising:
  
  first data specifying that the audio content is to be played while the speech message is played; or
  
  second data specifying that the audio content is to be paused while the speech message is played;
  
  receiving a second attribute associated with the speech message and specified by the network-based speech command service, the second attribute comprising:
  
  third data specifying that the audio content is to be attenuated while the speech message is played; or
  
  fourth data specifying that the audio content is to be paused while the speech message is played;
  
  playing the speech message;
  
  based at least in part on the first attribute comprising the first data and the second attribute comprising the third data, lowering playback volume of the audio content to a second volume while playing the speech message;
  
  based at least in part on the first attribute comprising the first data and the second attribute comprising the fourth data, pausing the playback of the audio content while playing the speech message; and
  
  based at least in part on the first attribute comprising the second data, pausing the playing of the audio content while playing the speech message.
- Dependent claims (7, 8, 9, 10, 11, 12, 13)
  - 7. The one or more non-transitory computer-readable media of claim 6, further comprising:
    - conducting a dialog with a user, wherein conducting the dialog comprises playing the speech message and receiving speech from the user;
      
      lowering playback volume of the audio content to the second volume while conducting the dialog based at least in part on determining that the second attribute comprises the third data; and
      
      pausing the playing of the audio content while conducting the dialog based at least in part on determining that the second attribute comprises the fourth data.
  - 8. The one or more non-transitory computer-readable media of claim 6, further comprising:
    - based at least in part on the first attribute comprising the first data and the second attribute comprising the third data, lowering the playback volume to the second volume while conducting a dialog with a user, wherein conducting the dialog comprises playing the speech message and receiving speech from the user; and
      
      resuming playing the audio content at the first volume after conducting the dialog.
  - 9. The one or more non-transitory computer-readable media of claim 6, wherein:
    - the third data further specifies that the audio content is to be attenuated when the speech message is a declaration; and
      
      the fourth data further specifies that the audio content is to be paused when the speech message is a question.
  - 10. The one or more non-transitory computer-readable media of claim 6, further comprising streaming the audio content from a network-based content provider.
  - 11. The one or more non-transitory computer-readable media of claim 6, further comprising receiving the audio content from a local wireless device.
  - 12. The one or more non-transitory computer-readable media of claim 6, further comprising lowering the playback volume of the audio content to the second volume in response to detecting speech from a user.
  - 13. The one or more non-transitory computer-readable media of claim 6, wherein:
    - the third data includes one or more words describing an attenuate value; and
      
      the fourth data includes one or more words describing a pause value.
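
Claims 9 and 13 together suggest that the service could pick the interrupt attribute from the kind of speech message and send it as a word. The sketch below illustrates that idea under stated assumptions: the function names, the `"speech"` and `"interrupt"` field names, and the `"question"`/`"declaration"` labels are all hypothetical and do not come from the patent.

```python
def interrupt_for(kind: str) -> str:
    """Return an interrupt attribute word for a speech message kind.

    Follows the pairing in claim 9: declarations attenuate the audio
    content, questions pause it. The string values are assumed.
    """
    if kind == "declaration":
        return "attenuate"
    if kind == "question":
        return "pause"
    raise ValueError(f"unknown speech message kind: {kind!r}")


def speech_command(text: str, kind: str) -> dict:
    """Build a hypothetical command payload carrying the interrupt attribute."""
    return {"speech": text, "interrupt": interrupt_for(kind)}
```

A call such as `speech_command("Which album?", "question")` would produce a payload whose `"interrupt"` field is `"pause"`.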

14. A computing device comprising:
- one or more processors;
  
  a speaker; and
  
  one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
  
  receiving, over a network and from a network-based speech command system, a first command to output first audio content;
  
  receiving, from the network-based speech command system, a mixing attribute specified by the network-based speech command system and indicating whether to mix the first audio content with additional audio content, the mixing attribute comprising:
  
  first data specifying that the first audio content is not to be played by the speaker at a same time as the additional audio content; or
  
  second data specifying that the first audio content is to be played by the speaker at the same time as the additional audio content;
  
  playing the first audio content on the speaker at a first volume;
  
  receiving a second command from the network-based speech command system to play second audio content, wherein the second audio content corresponds to a type of the additional audio content;
  
  receiving an interrupt attribute from the network-based speech command system, wherein the interrupt attribute corresponds to the second audio content, wherein the interrupt attribute is specified by the network-based speech command system and specifies whether the first audio content is to be attenuated or paused while the second audio content is played, the interrupt attribute comprising:
  
  third data specifying that the first audio content is to be attenuated while the second audio content is played; or
  
  fourth data specifying that the first audio content is to be paused while the second audio content is played;
  
  playing the second audio content on the speaker;
  
  based at least in part on the mixing attribute comprising the second data and the interrupt attribute comprising the third data, lowering playback volume of the first audio content to a second volume during playing of the second audio content; and
  
  based at least in part on one or more of the mixing attribute comprising the first data or the interrupt attribute comprising the fourth data, pausing the playing of the first audio content during playing of the second audio content.
- Dependent claims (15, 16, 17, 18)
  - 15. The computing device of claim 14, wherein the second audio comprises a speech message.
  - 16. The computing device of claim 14, wherein the second audio content comprises a speech message that forms part of a speech dialog with a user.
  - 17. The computing device of claim 14, wherein the second audio content comprises a speech message that forms part of a speech dialog with a user, and the operations further comprising resuming playing of the first audio content at the first volume after conducting the speech dialog.
  - 18. The computing device of claim 14, the operations further comprising streaming the first audio content from a network-based content provider.
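
The "first volume" and "second volume" behavior recited in claims 14 and 17 can be pictured as a small piece of player state. The following is a minimal sketch under assumptions: the class `BackgroundPlayer`, its method names, the string attribute values, and the default volume levels are hypothetical, not taken from the patent.

```python
class BackgroundPlayer:
    """Tracks volume and play state of the first audio content (illustrative)."""

    def __init__(self, first_volume: float = 1.0, second_volume: float = 0.3):
        self.first_volume = first_volume    # normal playback level
        self.second_volume = second_volume  # lowered level while attenuated
        self.volume = first_volume
        self.playing = True

    def begin_second_audio(self, mixable: bool, interrupt: str) -> None:
        """Apply the mixing and interrupt attributes when second audio starts."""
        if not mixable or interrupt == "pause":
            self.playing = False
        elif interrupt == "attenuate":
            self.volume = self.second_volume

    def end_second_audio(self) -> None:
        """Resume the first audio content at the first volume."""
        self.playing = True
        self.volume = self.first_volume
```

Calling `begin_second_audio(True, "attenuate")` lowers `volume` to the second volume while leaving `playing` set, and `end_second_audio()` restores the first volume once the second audio content or speech dialog is over.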

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Gundeti, Vikram Kumar; Torok, Fred; VanLund, Peter Spalding; Deramat, Frederic Johan Georges
Primary Examiner(s): Serrou, Abdelali
Application Number: US14/107,931
Publication Number: US 20150170665A1
Time in Patent Office: 1,709 Days
CPC Class Codes:
G06F 3/165 Management of the audio str...
G06F 3/167 Audio in a user interface, ...
