Processing spoken commands to control distributed audio outputs
First Claim
1. A computer-implemented method comprising:
- receiving, from an input device, input data corresponding to an utterance;
- determining, using at least one server device, that the input device corresponds to a first location;
- determining, using the at least one server device, that an output system corresponds to the first location;
- determining that the output system is outputting audio;
- based at least in part on receiving the input data corresponding to the utterance and determining that the output system is outputting audio, sending, from the at least one server device to the output system, a first instruction to cause a decrease in volume of the audio;
- after sending the first instruction, determining that the utterance has concluded; and
- after determining the utterance has concluded, sending, to the output system, a second instruction indicating the utterance has concluded.
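The sequence recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `OutputSystem`, `Server`, `on_utterance_start`, `on_utterance_end`, and the instruction strings `DECREASE_VOLUME` and `UTTERANCE_CONCLUDED` are all hypothetical, and instructions are recorded in a list rather than sent over a network.

```python
from dataclasses import dataclass, field


@dataclass
class OutputSystem:
    """Hypothetical stand-in for a speaker or entertainment system."""
    name: str
    location: str
    playing: bool = False
    received: list = field(default_factory=list)

    def send(self, instruction: str) -> None:
        # Record the instruction instead of transmitting it.
        self.received.append(instruction)


class Server:
    """Hypothetical server that maps input devices and outputs to locations."""

    def __init__(self, device_locations, outputs):
        self.device_locations = device_locations  # input device id -> location
        self.outputs = outputs                    # all known output systems
        self.ducked = {}                          # input device id -> outputs sent the first instruction

    def on_utterance_start(self, device_id):
        # Determine the location of the input device.
        location = self.device_locations[device_id]
        # Find output systems at the same location that are outputting audio.
        targets = [o for o in self.outputs if o.location == location and o.playing]
        for output in targets:
            output.send("DECREASE_VOLUME")  # first instruction
        self.ducked[device_id] = targets
        return targets

    def on_utterance_end(self, device_id):
        # Notify only the outputs that received the first instruction.
        for output in self.ducked.pop(device_id, []):
            output.send("UTTERANCE_CONCLUDED")  # second instruction
```

In this sketch, an output system that is idle, or that sits in a different location from the input device, receives neither instruction. What the output system does on receiving the second instruction (for example, restoring the earlier volume) is left open here, as claim 1 only recites that the instruction indicates the utterance has concluded.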
Abstract
A system capable of controlling multiple entertainment systems and/or speakers using voice commands is described. The system receives voice commands and may determine the audio sources and speakers indicated by the voice commands. The system may generate audio data from the audio sources and may send the audio data to the speakers using multiple interfaces. For example, the system may send the audio data directly to the speakers using a network address, may send the audio data to the speakers via a voice-enabled device, or may send the audio data to the speakers via a speaker controller. The system may generate output zones including multiple speakers and may associate input devices with speakers within the output zones. For example, the system may receive a voice command from an input device in an output zone and may reduce output audio generated by speakers in the output zone.
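The three delivery paths and the output zones mentioned in the abstract can be sketched as follows. This is an illustrative assumption only: the `Interface` enum, the `Speaker` type, the `route` and `play_in_zone` functions, and all endpoint values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class Interface(Enum):
    DIRECT = "direct"              # speaker reachable at its own network address
    VOICE_DEVICE = "voice_device"  # speaker reached via a voice-enabled device
    CONTROLLER = "controller"      # speaker reached via a speaker controller


@dataclass(frozen=True)
class Speaker:
    name: str
    interface: Interface
    endpoint: str  # network address, device id, or controller id (hypothetical)


def route(speaker: Speaker, audio_id: str) -> str:
    """Describe the hop used to deliver audio data to one speaker."""
    if speaker.interface is Interface.DIRECT:
        return f"send {audio_id} to address {speaker.endpoint}"
    if speaker.interface is Interface.VOICE_DEVICE:
        return f"send {audio_id} via voice-enabled device {speaker.endpoint} to {speaker.name}"
    return f"send {audio_id} via speaker controller {speaker.endpoint} to {speaker.name}"


def play_in_zone(zones: dict, zone_name: str, audio_id: str) -> list:
    """Route the same audio data to every speaker in an output zone."""
    return [route(speaker, audio_id) for speaker in zones[zone_name]]
```

Modeling an output zone as a named list of speakers keeps the choice of interface per speaker, so one zone can mix directly addressed speakers with speakers behind a voice-enabled device or a controller.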
39 Citations
20 Claims
1. A computer-implemented method comprising:
- receiving, from an input device, input data corresponding to an utterance;
- determining, using at least one server device, that the input device corresponds to a first location;
- determining, using the at least one server device, that an output system corresponds to the first location;
- determining that the output system is outputting audio;
- based at least in part on receiving the input data corresponding to the utterance and determining that the output system is outputting audio, sending, from the at least one server device to the output system, a first instruction to cause a decrease in volume of the audio;
- after sending the first instruction, determining that the utterance has concluded; and
- after determining the utterance has concluded, sending, to the output system, a second instruction indicating the utterance has concluded.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising:
- at least one processor; and
- at least one memory including instructions that, when executed by the at least one processor, cause the system to:
  - receive, from an input device, input data corresponding to an utterance;
  - determine that the input device corresponds to a first location;
  - determine that an output system corresponds to the first location;
  - determine that the output system is outputting audio;
  - based at least in part on receiving the input data corresponding to the utterance and determining that the output system is outputting audio, sending, to the output system, a first instruction to cause a decrease in volume of the audio;
  - after sending the first instruction, determine that the utterance has concluded; and
  - after determining the utterance has concluded, send, to the output system, a second instruction indicating the utterance has concluded.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification