Device Selection for Providing a Response
First Claim
1. A system, comprising:
a first speech processing pipeline instance that receives a first audio signal from a first speech interface device, the first audio signal representing a speech utterance, the first speech processing pipeline instance also receiving a first timestamp indicating a first time at which a wakeword was detected by the first speech interface device;
a second speech processing pipeline instance that receives a second audio signal from a second speech interface device, the second audio signal representing the speech utterance, the second speech processing pipeline also receiving a second timestamp indicating a second time at which the wakeword was detected by the second speech interface device;
the first speech processing pipeline instance having a series of processing components comprising:
an automatic speech recognition (ASR) component configured to analyze the first audio signal to determine words of the speech utterance;
a natural language understanding (NLU) component positioned in the first speech processing pipeline instance after the ASR component, the NLU component being configured to analyze the words of the speech utterance to determine an intent expressed by the speech utterance;
a response dispatcher positioned in the first speech processing pipeline instance after the NLU component, the response dispatcher being configured to specify a speech response to the speech utterance;
a first source arbiter positioned in the first speech processing pipeline instance before the ASR component, the first source arbiter being configured (a) to determine that an amount of time represented by a difference between the first timestamp and the second timestamp is less than a threshold;
(b) to determine that the first timestamp is greater than the second timestamp; and
(c) to abort the first speech processing pipeline instance.
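The arbitration test recited in elements (a) through (c) can be sketched as below. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `should_abort_first`, the millisecond units, and the 250 ms default threshold are all hypothetical, since the claim specifies neither units nor a threshold value.

```python
def should_abort_first(first_ts_ms: float,
                       second_ts_ms: float,
                       threshold_ms: float = 250.0) -> bool:
    """Decide whether the first pipeline instance should abort.

    Hypothetical helper: names, units, and the default threshold
    are assumptions made for illustration only.
    """
    # (a) the two wakeword detection times differ by less than the threshold
    close_in_time = abs(first_ts_ms - second_ts_ms) < threshold_ms
    # (b) the first device detected the wakeword later than the second device
    first_is_later = first_ts_ms > second_ts_ms
    # (c) abort the first pipeline instance only when both conditions hold
    return close_in_time and first_is_later


if __name__ == "__main__":
    print(should_abort_first(1050.0, 1000.0))  # True: 50 ms apart, first is later
    print(should_abort_first(1000.0, 1050.0))  # False: first is earlier
    print(should_abort_first(2000.0, 1000.0))  # False: detections too far apart
```

Because the check runs before the ASR component, a pipeline instance that loses arbitration can stop before doing any recognition work.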
Abstract
A system may use multiple speech interface devices to interact with a user by speech. All or a portion of the speech interface devices may detect a user utterance and may initiate speech processing to determine a meaning or intent of the utterance. Within the speech processing, arbitration is employed to select one of the multiple speech interface devices to respond to the user utterance. Arbitration may be based in part on metadata that directly or indirectly indicates the proximity of the user to the devices, and the device that is deemed to be nearest the user may be selected to respond to the user utterance.
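The selection step described in the abstract can be illustrated with a short sketch. It assumes each device reports a single numeric `proximity_score` in which a higher value means the user is nearer; that field, the `Candidate` type, and the `select_responder` function are hypothetical names introduced here, and the abstract does not say which metadata is used or how it is combined.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    device_id: str
    # Hypothetical metadata: higher score is assumed to mean the user is nearer.
    proximity_score: float


def select_responder(candidates: Sequence[Candidate]) -> str:
    """Return the id of the device deemed nearest to the user."""
    if not candidates:
        raise ValueError("at least one candidate device is required")
    # Pick the candidate with the largest assumed proximity score.
    return max(candidates, key=lambda c: c.proximity_score).device_id


if __name__ == "__main__":
    devices = [
        Candidate("kitchen", 0.42),
        Candidate("living-room", 0.87),
        Candidate("hallway", 0.15),
    ]
    print(select_responder(devices))  # living-room
```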
42 Citations
1 Claim
Specification