Home graph
First Claim
1. A system comprising one or more servers of a voice assistant service, wherein the one or more servers are configured to communicate with a network microphone device (NMD) of a media playback system comprising multiple devices connected via a local area network, wherein the NMD is configured to perform operations comprising:
recording, via a microphone array, audio into a buffer;
monitoring the recorded audio for wake-words; and
when a wake-word is detected in the recorded audio, sending, via a network interface to the voice assistant service, data representing an audio recording from the buffer of the NMD, the audio recording comprising a voice input following the detected wake-word within the buffer; and
wherein the one or more servers are configured to perform operations comprising:
storing a data structure comprising nodes in a hierarchy representing the media playback system, wherein the data structure comprises (i) a root node representing the media playback system as a Home of the hierarchy, (ii) one or more first nodes in a first level, the first nodes representing respective devices of the media playback system as Sets of the hierarchy, and (iii) one or more second nodes in a second level as parents to one or more respective child first nodes to represent Sets in respective Rooms of the hierarchy, wherein the nodes in the hierarchy are assigned respective names;
receiving, via a network interface of the one or more servers, data representing the audio recording;
processing the audio recording to determine one or more voice commands within the voice input, wherein processing the audio recording comprises:
determining, based on the data structure representing the media playback system, that one or more first voice commands within the voice input represent respective target variables indicating one or more particular nodes of the data structure, each target variable referencing a name of a respective node of the data structure; and
determining that one or more second voice commands within the voice input correspond to one or more playback commands; and
causing, via the network interface of the one or more servers, one or more particular playback devices to play back audio content according to the one or more playback commands, wherein the one or more particular playback devices include (a) all playback devices represented by the one or more particular nodes of the data structure and (b) all playback devices represented by child nodes of the one or more particular nodes of the data structure.
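The NMD-side operations recited above (recording into a buffer, monitoring for a wake-word, and forwarding the voice input that follows it) can be illustrated with a small sketch. This is not the patented implementation: text tokens stand in for audio frames, the wake-word token `hey_assistant` is hypothetical, wake-word detection is a plain equality check rather than an acoustic model, and the voice input is assumed to have a fixed length of `input_frames`. The class and parameter names are invented for illustration.

```python
from collections import deque
from typing import Callable, Deque, List

WAKE_WORD = "hey_assistant"  # hypothetical wake-word token (assumption)


class NetworkMicrophoneDevice:
    """Toy NMD: text tokens stand in for recorded audio frames."""

    def __init__(self, send: Callable[[List[str]], None],
                 buffer_size: int = 16, input_frames: int = 4) -> None:
        if input_frames > buffer_size:
            raise ValueError("voice input must fit inside the buffer")
        # Rolling buffer: the oldest frames fall out once it is full.
        self.buffer: Deque[str] = deque(maxlen=buffer_size)
        self.send = send                # stands in for the network interface
        self.input_frames = input_frames
        self._remaining = 0             # frames still to capture after a wake-word

    def record(self, frame: str) -> None:
        """Record one frame and monitor it for the wake-word."""
        self.buffer.append(frame)
        if self._remaining > 0:
            self._remaining -= 1
            if self._remaining == 0:
                # Forward the frames that followed the wake-word in the buffer.
                self.send(list(self.buffer)[-self.input_frames:])
        elif frame == WAKE_WORD:
            self._remaining = self.input_frames


def demo() -> List[List[str]]:
    """Feed a short stream through the toy NMD and collect what it sends."""
    sent: List[List[str]] = []
    nmd = NetworkMicrophoneDevice(sent.append, buffer_size=8, input_frames=3)
    for frame in ["noise", WAKE_WORD, "play", "jazz", "kitchen", "noise"]:
        nmd.record(frame)
    return sent
```

In this sketch only the frames after the wake-word leave the device. A real NMD would more plausibly end the capture on detected silence or a timeout than on a fixed frame count.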
2 Assignments
0 Petitions
Abstract
Example techniques involve a control hierarchy for a “smart” home having smart appliances and related devices, such as wireless illumination devices, home-automation devices (e.g., thermostats, door locks, etc.), and audio playback devices, among others. An example home includes various rooms in which smart devices might be located. Under the example control hierarchy described herein and referred to as “home graph,” a name of a room (e.g., “Kitchen”) may represent a smart device (or smart devices) within that room. In other words, from the perspective of a user, the smart devices within a room are that room. This hierarchy permits a user to refer to a smart device within a given room by way of the name of the room when controlling smart devices within the home using a voice user interface (VUI) or graphical user interface (GUI).
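A minimal sketch of such a hierarchy may help make the idea concrete. It models a Home containing Rooms containing Sets, and resolves a spoken room name to every device beneath that node. The node names, device identifiers, and the naive substring matching are all assumptions made for illustration. They are not taken from the source, and a real voice pipeline would use proper speech recognition and intent parsing.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    name: str                      # spoken name assigned to the node
    kind: str                      # "Home", "Room", or "Set"
    devices: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def devices_under(node: Node) -> List[str]:
    """Devices represented by the node itself and by all of its child nodes."""
    return [device for n in walk(node) for device in n.devices]


def target_devices(home: Node, utterance: str) -> List[str]:
    """Naive targeting: any node whose name appears in the utterance is a target."""
    text = utterance.lower()
    result: List[str] = []
    for node in walk(home):
        if node.name.lower() in text:
            for device in devices_under(node):
                if device not in result:  # avoid duplicates from nested matches
                    result.append(device)
    return result


def build_example_home() -> Node:
    """Hypothetical home with two rooms; all names are illustrative only."""
    home = Node("Home", "Home")
    kitchen = home.add(Node("Kitchen", "Room"))
    kitchen.add(Node("Counter Speaker", "Set", devices=["play-1"]))
    kitchen.add(Node("Ceiling Pair", "Set", devices=["amp-L", "amp-R"]))
    den = home.add(Node("Den", "Room"))
    den.add(Node("Soundbar", "Set", devices=["bar-1"]))
    return home
```

With this example home, `target_devices(build_example_home(), "play jazz in the kitchen")` returns the three Kitchen devices, while an utterance naming "home" reaches every device in the tree. Substring matching is fragile (a room called "Den" would also match "garden"), so it should be read only as a stand-in for real name resolution.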
439 Citations
20 Claims
1. A system comprising one or more servers of a voice assistant service, wherein the one or more servers are configured to communicate with a network microphone device (NMD) of a media playback system comprising multiple devices connected via a local area network, wherein the NMD is configured to perform operations comprising:
recording, via a microphone array, audio into a buffer;
monitoring the recorded audio for wake-words; and
when a wake-word is detected in the recorded audio, sending, via a network interface to the voice assistant service, data representing an audio recording from the buffer of the NMD, the audio recording comprising a voice input following the detected wake-word within the buffer; and
wherein the one or more servers are configured to perform operations comprising:
storing a data structure comprising nodes in a hierarchy representing the media playback system, wherein the data structure comprises (i) a root node representing the media playback system as a Home of the hierarchy, (ii) one or more first nodes in a first level, the first nodes representing respective devices of the media playback system as Sets of the hierarchy, and (iii) one or more second nodes in a second level as parents to one or more respective child first nodes to represent Sets in respective Rooms of the hierarchy, wherein the nodes in the hierarchy are assigned respective names;
receiving, via a network interface of the one or more servers, data representing the audio recording;
processing the audio recording to determine one or more voice commands within the voice input, wherein processing the audio recording comprises:
determining, based on the data structure representing the media playback system, that one or more first voice commands within the voice input represent respective target variables indicating one or more particular nodes of the data structure, each target variable referencing a name of a respective node of the data structure; and
determining that one or more second voice commands within the voice input correspond to one or more playback commands; and
causing, via the network interface of the one or more servers, one or more particular playback devices to play back audio content according to the one or more playback commands, wherein the one or more particular playback devices include (a) all playback devices represented by the one or more particular nodes of the data structure and (b) all playback devices represented by child nodes of the one or more particular nodes of the data structure.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method to be performed by a system comprising one or more servers of a voice assistant service, wherein the one or more servers are configured to communicate with a network microphone device (NMD) of a media playback system comprising multiple devices connected via a local area network, wherein the NMD is configured to perform operations comprising:
recording, via a microphone array, audio into a buffer;
monitoring the recorded audio for wake-words; and
when a wake-word is detected in the recorded audio, sending, via a network interface to the voice assistant service, data representing an audio recording from the buffer of the NMD, the audio recording comprising a voice input following the detected wake-word within the buffer; and
wherein the method comprises:
the one or more servers storing a data structure comprising nodes in a hierarchy representing the media playback system, wherein the data structure comprises (i) a root node representing the media playback system as a Home of the hierarchy, (ii) one or more first nodes in a first level, the first nodes representing respective devices of the media playback system as Sets of the hierarchy, and (iii) one or more second nodes in a second level as parents to one or more respective child first nodes to represent Sets in respective Rooms of the hierarchy, wherein the nodes in the hierarchy are assigned respective names;
the one or more servers receiving, via a network interface of the one or more servers, data representing the audio recording;
the one or more servers processing the audio recording to determine one or more voice commands within the voice input, wherein processing the audio recording comprises:
determining, based on the data structure representing the media playback system, that one or more first voice commands within the voice input represent respective target variables indicating one or more particular nodes of the data structure, each target variable referencing a name of a respective node of the data structure; and
determining that one or more second voice commands within the voice input correspond to one or more playback commands; and
the one or more servers causing, via the network interface of the one or more servers, one or more particular playback devices to play back audio content according to the one or more playback commands, wherein the one or more particular playback devices include (a) all playback devices represented by the one or more particular nodes of the data structure and (b) all playback devices represented by child nodes of the one or more particular nodes of the data structure.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A method to be performed by a system comprising one or more servers of a voice assistant service and a network microphone device (NMD) of a media playback system comprising multiple devices connected via a local area network, wherein the method comprises:
the NMD recording, via a microphone array, audio into a buffer;
the NMD monitoring the recorded audio for wake-words; and
when a wake-word is detected in the recorded audio, the NMD sending, via a network interface to the voice assistant service, data representing an audio recording from the buffer of the NMD, the audio recording comprising a voice input following the detected wake-word within the buffer; and
the one or more servers storing a data structure comprising nodes in a hierarchy representing the media playback system, wherein the data structure comprises (i) a root node representing the media playback system as a Home of the hierarchy, (ii) one or more first nodes in a first level, the first nodes representing respective devices of the media playback system as Sets of the hierarchy, and (iii) one or more second nodes in a second level as parents to one or more respective child first nodes to represent Sets in respective Rooms of the hierarchy, wherein the nodes in the hierarchy are assigned respective names;
the one or more servers receiving, via a network interface of the one or more servers, data representing the audio recording;
the one or more servers processing the audio recording to determine one or more voice commands within the voice input, wherein processing the audio recording comprises:
determining, based on the data structure representing the media playback system, that one or more first voice commands within the voice input represent respective target variables indicating one or more particular nodes of the data structure, each target variable referencing a name of a respective node of the data structure; and
determining that one or more second voice commands within the voice input correspond to one or more playback commands; and
the one or more servers causing, via the network interface of the one or more servers, one or more particular playback devices to play back audio content according to the one or more playback commands, wherein the one or more particular playback devices include (a) all playback devices represented by the one or more particular nodes of the data structure and (b) all playback devices represented by child nodes of the one or more particular nodes of the data structure.
Specification