Voice detection by multiple devices
Abstract
Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve one or more servers receiving, via a network interface, data representing multiple audio recordings of a voice input spoken by a given user, each audio recording recorded by a respective NMD of the multiple NMDs, wherein the voice input comprises a detected wake-word. Based on respective sound pressure levels of the multiple audio recordings of the voice input, the servers (i) select a particular NMD of the multiple NMDs and (ii) forego selection of other NMDs of the multiple NMDs. The servers send, via the network interface to the particular NMD, data representing a playback command that corresponds to a voice command in the voice input represented in the multiple audio recordings, wherein the data representing the playback command causes the particular NMD to play back audio content according to the playback command.
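The server-side selection described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract says only that selection is "based on" sound pressure levels, so picking the loudest recording is an assumption, as is using RMS level in dB relative to full scale as a stand-in for sound pressure level (a true SPL would need microphone calibration). The names `level_db`, `select_nmd`, and `build_playback_message`, and the message layout, are hypothetical.

```python
import math
from typing import Dict, Sequence


def level_db(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale (assumed proxy for SPL)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def select_nmd(recordings: Dict[str, Sequence[float]]) -> str:
    """Select one NMD and forgo the others.

    Assumption: the recording with the highest level wins; ties go to
    the first NMD encountered.
    """
    if not recordings:
        raise ValueError("no recordings received")
    return max(recordings, key=lambda nmd_id: level_db(recordings[nmd_id]))


def build_playback_message(recordings: Dict[str, Sequence[float]],
                           playback_command: str) -> dict:
    """Address the playback command to the selected NMD only (hypothetical format)."""
    return {"target": select_nmd(recordings), "command": playback_command}


if __name__ == "__main__":
    recordings = {
        "kitchen": [0.1, -0.1, 0.1, -0.1],
        "living_room": [0.5, -0.5, 0.5, -0.5],
    }
    print(build_playback_message(recordings, "play jazz"))
```

In this sketch the speech recognition step that turns the voice input into a playback command is left out; the command is passed in as an already-resolved string.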
21 Claims
1. A system comprising one or more servers of a voice assistant service, wherein the one or more servers are configured to communicate with multiple network microphone devices, wherein the multiple networked microphone devices (NMDs) are communicatively coupled to one another via a local area network, and wherein:
each NMD is configured to perform operations comprising:
recording, via a respective microphone array, audio into a buffer of the respective NMD;
monitoring the recorded audio for wake-words; and
when a wake-word is detected in the recorded audio, sending, via a respective network interface to a voice assistant service, data representing an audio recording from the buffer of the respective NMD, the audio recording representing a portion of the recorded audio including the detected wake-word as recorded by the respective NMD; and
the one or more servers are configured to perform operations comprising:
receiving, via a network interface of the one or more servers, data representing multiple audio recordings of a voice input spoken by a given user, each audio recording recorded by a respective NMD of the multiple NMDs, wherein the voice input comprises the detected wake-word;
based on respective sound pressure levels of the multiple audio recordings of the voice input, (i) selecting a particular NMD of the multiple NMDs and (ii) foregoing selection of other NMDs of the multiple NMDs; and
after the selecting, sending, via the network interface to the particular NMD, data representing a playback command that corresponds to a voice command following the wake-word in the voice input represented in the multiple audio recordings, wherein the data representing the playback command causes the particular NMD to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method to be performed by one or more servers of a voice assistant service, the method comprising:
receiving, via a network interface of the one or more servers, data representing multiple audio recordings of a voice input spoken by a given user, each audio recording recorded by a respective NMD of multiple networked microphone devices (NMDs) connected via a local area network, wherein the voice input comprises a wake-word detected by the multiple NMDs;
based on respective sound pressure levels of the multiple audio recordings of the voice input, (i) selecting a particular NMD of the multiple NMDs and (ii) foregoing selection of other NMDs of the multiple NMDs; and
after the selecting, sending, via the network interface to the particular NMD, data representing a playback command that corresponds to a voice command following the wake-word in the voice input represented in the multiple audio recordings, wherein the data representing the playback command causes the particular NMD to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
Dependent claims: 10, 11, 12, 13
14. A method to be performed by a system comprising one or more servers of a voice assistant service, wherein the one or more servers are configured to communicate with multiple network microphone devices, wherein the multiple networked microphone devices (NMDs) are communicatively coupled to one another via a local area network, and wherein:
each NMD is configured to perform operations comprising:
recording, via a respective microphone array, audio into a buffer of the respective NMD;
monitoring the recorded audio for wake-words; and
when a wake-word is detected in the recorded audio, sending, via a respective network interface to a voice assistant service, data representing an audio recording from the buffer of the respective NMD, the audio recording representing a portion of the recorded audio including the detected wake-word as recorded by the respective NMD; and
the method comprises:
receiving, via a network interface of the one or more servers, data representing multiple audio recordings of a voice input spoken by a given user, each audio recording recorded by a respective NMD of the multiple NMDs, wherein the voice input comprises the detected wake-word;
based on respective sound pressure levels of the multiple audio recordings of the voice input, (i) selecting a particular NMD of the multiple NMDs and (ii) foregoing selection of other NMDs of the multiple NMDs; and
after the selecting, sending, via the network interface to the particular NMD, data representing a playback command that corresponds to a voice command following the wake-word in the voice input represented in the multiple audio recordings, wherein the data representing the playback command causes the particular NMD to play back audio content according to the playback command via one or more amplifiers configured to drive one or more speakers.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
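The NMD-side operations recited in claims 1 and 14 (record into a buffer, monitor for wake-words, send the buffered portion on detection) might be sketched as below. This is an illustration under stated assumptions: the claims do not specify the buffer type, so a fixed-size ring buffer is assumed, and the wake-word detector is a caller-supplied placeholder rather than a real keyword-spotting model. The class name `NmdRecorder` and its methods are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class NmdRecorder:
    """Hypothetical NMD-side recorder with a bounded audio buffer."""

    def __init__(self, capacity: int,
                 detect_wake_word: Callable[[List[float]], bool]) -> None:
        # Assumption: a ring buffer that discards the oldest samples when full.
        self.buffer: Deque[float] = deque(maxlen=capacity)
        # Assumption: the detector is any callable returning True on a wake-word.
        self.detect_wake_word = detect_wake_word

    def on_audio(self, frame: List[float]) -> Optional[List[float]]:
        """Record a frame; return the buffered audio if a wake-word is detected.

        The returned samples are what the NMD would send to the voice
        assistant service; None means nothing is sent.
        """
        self.buffer.extend(frame)
        snapshot = list(self.buffer)
        if self.detect_wake_word(snapshot):
            return snapshot
        return None


if __name__ == "__main__":
    # Toy detector for demonstration only: fires on any sample above 0.9.
    recorder = NmdRecorder(4, lambda samples: any(abs(s) > 0.9 for s in samples))
    print(recorder.on_audio([0.1, 0.2]))
    print(recorder.on_audio([0.3, 1.0, 0.1]))
```

Each NMD that hears the wake-word would send its own snapshot, which is how the servers end up with multiple recordings of the same voice input to compare.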
Specification