Controlling distributed audio outputs to enable voice output
First Claim
1. A computer-implemented method for controlling a speaker system, the method comprising:
associating, by at least one server device, an audio device with a first wireless speaker;
receiving, from the audio device by the at least one server device, input audio data corresponding to an utterance;
performing, by the at least one server device, speech processing on the input audio data to determine a first instruction;
generating, by the at least one server device, voice output audio data that includes synthesized speech corresponding to the first instruction;
determining that the first wireless speaker is associated with the audio device;
determining that the first wireless speaker is outputting first audio;
sending a second instruction to the network-connected device to cause the network-connected device to reduce a volume level of the first audio from a first level to a second level; and
sending, by the at least one server device, the voice output audio data to the audio device to cause the audio device to generate, while the first audio is outputting at the second level, second audio using a speaker.
Abstract
A system that is capable of controlling multiple entertainment systems and/or speakers using voice commands. The system receives voice commands and may determine speakers playing output audio in proximity to the voice commands. The system may generate voice output and send the voice output to the speakers, along with a command to reduce a volume of output audio while playing the voice output. For example, the system may receive a voice command from an input device associated with an output zone, may reduce output audio generated by speakers in the output zone and may play the voice output via the speakers. In addition, the system may send the command to the speakers while sending the voice output to another device for playback. For example, the system may reduce output audio generated by the speakers and play the voice output via the input device.
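The two playback variants described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the names `OutputZone`, `Speaker`, `Target`, `plan_voice_output`, the `set_volume` and `play_voice` command strings, and the ducked volume of 10 are all hypothetical and do not come from the patent. The sketch assumes the server already knows which speakers belong to the output zone of the device that captured the voice command.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Target(Enum):
    SPEAKERS = "speakers"          # voice output plays on the zone's speakers
    INPUT_DEVICE = "input_device"  # voice output plays on the device that heard the command


@dataclass
class Speaker:
    name: str
    playing: bool = False  # True if the speaker is currently outputting audio
    volume: int = 50       # current volume level (hypothetical 0-100 scale)


@dataclass
class OutputZone:
    input_device: str
    speakers: List[Speaker] = field(default_factory=list)


def plan_voice_output(zone: OutputZone, voice_audio: bytes, target: Target,
                      ducked_volume: int = 10) -> List[Tuple[str, str, object]]:
    """Return the ordered (recipient, command, payload) messages to send."""
    messages: List[Tuple[str, str, object]] = []
    # First, lower the volume of every speaker that is currently outputting audio.
    for sp in zone.speakers:
        if sp.playing:
            messages.append((sp.name, "set_volume", min(sp.volume, ducked_volume)))
    # Then deliver the synthesized speech to the chosen playback target.
    if target is Target.SPEAKERS:
        for sp in zone.speakers:
            messages.append((sp.name, "play_voice", voice_audio))
    else:
        messages.append((zone.input_device, "play_voice", voice_audio))
    return messages


zone = OutputZone(
    input_device="kitchen-mic",
    speakers=[Speaker("speaker-a", playing=True, volume=60), Speaker("speaker-b")],
)
via_input = plan_voice_output(zone, b"tts", Target.INPUT_DEVICE)
via_speakers = plan_voice_output(zone, b"tts", Target.SPEAKERS)
```

In both variants the volume-reduction messages are queued before the voice output, so the synthesized speech plays while the existing audio is at the lower level. Restoring the original volume afterwards is left out of this sketch.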
18 Claims
1. A computer-implemented method for controlling a speaker system, the method comprising:
associating, by at least one server device, an audio device with a first wireless speaker;
receiving, from the audio device by the at least one server device, input audio data corresponding to an utterance;
performing, by the at least one server device, speech processing on the input audio data to determine a first instruction;
generating, by the at least one server device, voice output audio data that includes synthesized speech corresponding to the first instruction;
determining that the first wireless speaker is associated with the audio device;
determining that the first wireless speaker is outputting first audio;
sending a second instruction to the network-connected device to cause the network-connected device to reduce a volume level of the first audio from a first level to a second level; and
sending, by the at least one server device, the voice output audio data to the audio device to cause the audio device to generate, while the first audio is outputting at the second level, second audio using a speaker.
Dependent claims: 2, 3, 4
5. A computer-implemented method comprising:
receiving, from an audio device by at least one server device, input audio data corresponding to an utterance;
performing, by the at least one server device, speech processing on the input audio data to determine a first instruction;
generating, by the at least one server device, voice output audio data that includes synthesized speech corresponding to the first instruction;
determining a first output device associated with the audio device, the first output device controllable by a network-connected device;
determining that the first output device is outputting first audio;
sending a second instruction to the network-connected device to cause the network-connected device to reduce a volume level of the first audio from a first level to a second level; and
sending, by the at least one server device, the voice output audio data to the audio device to cause the audio device to output, while the first audio is outputting at the second level, second audio generated from the voice output audio data.
Dependent claims: 6, 7, 8, 9, 10, 11
12. A system, comprising:
at least one processor;
a memory including instructions operable to be executed by the at least one processor to configure the system to:
receive, from an audio device, input audio data corresponding to an utterance;
perform speech processing on the input audio data to determine a first instruction;
generate voice output audio data that includes synthesized speech corresponding to the first instruction;
determine a first output device associated with the audio device, the first output device controllable by a network-connected device;
determine that the first output device is outputting first audio;
send a second instruction to the network-connected device to cause the network-connected device to reduce a volume level of the first audio from a first level to a second level; and
send the voice output audio data to the audio device to cause the audio device to output, while the first audio is outputting at the second level, second audio generated from the voice output audio data.
Dependent claims: 13, 14, 15, 16, 17, 18
Specification