Contextualization of voice inputs
First Claim
1. Tangible, non-transitory, computer-readable media having instructions encoded therein, wherein the instructions, when executed by one or more processors, cause a first networked microphone device (NMD) to perform a method comprising:
recording, via a microphone array of the first NMD, audio data indicating a voice command;
identifying, based on the recorded audio data, a first characteristic of the voice command, the first characteristic comprising a sound pressure level of the voice command as detected by the microphone array of the first NMD, wherein the first NMD is associated with a first zone of a media playback system, the first zone comprising a first playback device;
receiving, via a network interface of the first NMD from one or more second NMDs, contextual information indicating second characteristics of the voice command, the second characteristics comprising respective sound pressure levels of the voice command as detected by respective microphone arrays of the one or more second NMDs, wherein the one or more second NMDs are associated with one or more second zones of the media playback system, each second zone comprising a second playback device;
based on the sound pressure level of the voice command as detected by the microphone array of the first NMD being greater than the sound pressure levels of the voice command as detected by the respective microphone arrays of the one or more second NMDs, determining that the voice command was uttered in the first zone;
in response to determining that the voice command was uttered in the first zone associated with the first NMD, querying, via the network interface, one or more servers of a voice assistant service with the voice command;
receiving, via the network interface in response to the query, a playback command corresponding to the voice command; and
instructing the first playback device to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
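The decision logic recited in the claim above can be sketched in a few lines of Python. This is only an illustrative reading of the claim language, not the patentee's implementation: the `SplReport` structure, the function names, and the `query_vas` and `play` callbacks are all hypothetical stand-ins for the voice assistant service and the playback device.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class SplReport:
    """One NMD's measurement of the same voice command (hypothetical structure)."""
    nmd_id: str
    zone: str
    spl_db: float  # sound pressure level detected by that NMD's microphone array


def uttered_in_first_zone(first: SplReport, others: Sequence[SplReport]) -> bool:
    """True when the first NMD's SPL is strictly greater than every second NMD's.

    The claim assumes at least one second NMD; with an empty sequence this
    returns True, which is a choice made for this sketch only.
    """
    return all(first.spl_db > other.spl_db for other in others)


def handle_voice_command(
    first: SplReport,
    others: Sequence[SplReport],
    audio: bytes,
    query_vas: Callable[[bytes], str],   # hypothetical: sends audio, returns a playback command
    play: Callable[[str, str], None],    # hypothetical: (zone, playback command) -> None
) -> Optional[str]:
    """Query the voice assistant and start playback only if the first zone wins."""
    if not uttered_in_first_zone(first, others):
        return None  # another zone heard the command more loudly; do nothing here
    playback_command = query_vas(audio)
    play(first.zone, playback_command)
    return playback_command
```

For example, with the first NMD reporting 62.0 dB in a "Kitchen" zone and second NMDs reporting 48.5 dB and 41.0 dB, `handle_voice_command` queries the service and instructs playback in "Kitchen"; if any second NMD reports a higher level, it returns `None` without querying.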
Abstract
Disclosed herein are example techniques to provide contextual information corresponding to a voice command. An example implementation may involve receiving voice data indicating a voice command, receiving contextual information indicating a characteristic of the voice command, and determining a device operation corresponding to the voice command. Determining the device operation corresponding to the voice command may include identifying, among multiple zones of a media playback system, a zone that corresponds to the characteristic of the voice command, and determining that the voice command corresponds to one or more particular devices that are associated with the identified zone. The example implementation may further involve causing the one or more particular devices to perform the device operation.
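As a rough illustration of the zone identification and dispatch described in the abstract, the following Python sketch picks the zone that best matches the characteristic and sends the operation to that zone's devices. It assumes the characteristic is a per-zone sound pressure level (as in the claims); the function names, the dictionary layout, and the `perform` callback are hypothetical.

```python
from typing import Callable, List, Mapping, Sequence


def identify_zone(spl_by_zone: Mapping[str, float]) -> str:
    """Return the zone with the highest reported SPL.

    On a tie, max() keeps the first zone encountered; the source does not
    say how ties should be resolved, so this is a sketch-level choice.
    """
    if not spl_by_zone:
        raise ValueError("no zone reported the voice command")
    return max(spl_by_zone, key=lambda zone: spl_by_zone[zone])


def dispatch_operation(
    spl_by_zone: Mapping[str, float],
    devices_by_zone: Mapping[str, Sequence[str]],
    operation: str,
    perform: Callable[[str, str], None],  # hypothetical: (device, operation) -> None
) -> List[str]:
    """Cause every device in the identified zone to perform the operation."""
    zone = identify_zone(spl_by_zone)
    targets = list(devices_by_zone.get(zone, ()))
    for device in targets:
        perform(device, operation)
    return targets
```

Called with `{"Kitchen": 62.0, "Den": 48.5}` and a device map in which "Kitchen" holds two playback devices, `dispatch_operation` invokes `perform` once per kitchen device and returns their identifiers.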
20 Claims
1. Tangible, non-transitory, computer-readable media having instructions encoded therein, wherein the instructions, when executed by one or more processors, cause a first networked microphone device (NMD) to perform a method comprising:
recording, via a microphone array of the first NMD, audio data indicating a voice command;
identifying, based on the recorded audio data, a first characteristic of the voice command, the first characteristic comprising a sound pressure level of the voice command as detected by the microphone array of the first NMD, wherein the first NMD is associated with a first zone of a media playback system, the first zone comprising a first playback device;
receiving, via a network interface of the first NMD from one or more second NMDs, contextual information indicating second characteristics of the voice command, the second characteristics comprising respective sound pressure levels of the voice command as detected by respective microphone arrays of the one or more second NMDs, wherein the one or more second NMDs are associated with one or more second zones of the media playback system, each second zone comprising a second playback device;
based on the sound pressure level of the voice command as detected by the microphone array of the first NMD being greater than the sound pressure levels of the voice command as detected by the respective microphone arrays of the one or more second NMDs, determining that the voice command was uttered in the first zone;
in response to determining that the voice command was uttered in the first zone associated with the first NMD, querying, via the network interface, one or more servers of a voice assistant service with the voice command;
receiving, via the network interface in response to the query, a playback command corresponding to the voice command; and
instructing the first playback device to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
Dependent claims: 9, 10, 11, 12, 13, 14
2. A first networked microphone device (NMD), the first NMD comprising:
a microphone array;
a network interface;
one or more processors; and
computer-readable media having instructions encoded therein, wherein the instructions, when executed by the one or more processors, cause the first NMD to perform functions comprising:
recording, via the microphone array, audio data indicating a voice command;
identifying, based on the recorded audio data, a first characteristic of the voice command, the first characteristic comprising a sound pressure level of the voice command as detected by the microphone array of the first NMD, wherein the first NMD is associated with a first zone of a media playback system, the first zone comprising a first playback device;
receiving, via a network interface of the first NMD from one or more second NMDs, contextual information indicating second characteristics of the voice command, the second characteristics comprising respective sound pressure levels of the voice command as detected by respective microphone arrays of the one or more second NMDs, wherein the one or more second NMDs are associated with one or more second zones of the media playback system, each second zone comprising a second playback device;
based on the sound pressure level of the voice command as detected by the microphone array of the first NMD being greater than the sound pressure levels of the voice command as detected by the respective microphone arrays of the one or more second NMDs, determining that the voice command was uttered in the first zone;
in response to determining that the voice command was uttered in the first zone associated with the first NMD, querying one or more servers of a voice assistant service with the voice command;
receiving, via the network interface in response to the query, a playback command corresponding to the voice command; and
instructing the first playback device to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
Dependent claims: 15, 16, 17, 18, 19, 20
3. A method comprising:
recording, via a microphone array of a first networked microphone device (NMD), audio data indicating a voice command;
identifying, based on the recorded audio data, a first characteristic of the voice command, the first characteristic comprising a sound pressure level of the voice command as detected by the microphone array of the first NMD, wherein the first NMD is associated with a first zone of a media playback system, the first zone comprising a first playback device;
receiving, via a network interface of the first NMD from one or more second NMDs, contextual information indicating second characteristics of the voice command, the second characteristics comprising respective sound pressure levels of the voice command as detected by respective microphone arrays of the one or more second NMDs, wherein the one or more second NMDs are associated with one or more second zones of the media playback system, each second zone comprising a second playback device;
based on the sound pressure level of the voice command as detected by the microphone array of the first NMD being greater than the sound pressure levels of the voice command as detected by the respective microphone arrays of the one or more second NMDs, determining that the voice command was uttered in the first zone;
in response to determining that the voice command was uttered in the first zone associated with the first NMD, querying, via the network interface, one or more servers of a voice assistant service with the voice command;
receiving, via the network interface in response to the query, a playback command corresponding to the voice command; and
instructing the first playback device to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
Dependent claims: 4, 5, 6, 7, 8
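The independent claims each recite identifying a sound pressure level from the recorded audio data but do not say how it is computed. One conventional way, shown here purely as an assumption, is to take the root-mean-square pressure over all microphone channels and express it in decibels relative to 20 micropascals. The function name and the premise that samples are already calibrated in pascals are hypothetical; a real device would need a microphone calibration step that this sketch omits.

```python
import math
from typing import Sequence

P_REF = 20e-6  # standard reference pressure for SPL in air, in pascals


def spl_db(channels: Sequence[Sequence[float]]) -> float:
    """Estimate SPL in dB from per-channel samples assumed to be in pascals.

    All samples from all channels are pooled into a single RMS value, which
    is one simple way to summarize a microphone array for this sketch.
    """
    total = 0.0
    count = 0
    for channel in channels:
        for sample in channel:
            total += sample * sample
            count += 1
    if count == 0:
        raise ValueError("no audio samples provided")
    rms = math.sqrt(total / count)
    if rms == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(rms / P_REF)
```

A signal with an RMS pressure of 1 Pa comes out at roughly 94 dB under this definition, and halving the amplitude lowers the result by about 6 dB, so levels from different NMDs can be compared directly provided they share the same calibration.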
Specification