Systems and methods for routing content to an associated output device
First Claim
1. A method, comprising:
with a backend system:
receiving first request audio data representing a first utterance, the first request audio data received from a voice activated electronic device,
receiving a customer identifier associated with the voice activated electronic device,
determining a user account associated with the customer identifier,
generating first text data representing the first request audio data by executing speech-to-text functionality on the first request audio data, and
determining, using the first text data, that a first intent of the first utterance is for information to be output by a target device;
determining that an output device that is capable of presenting visual data is also associated with the user account;
determining that a visual information response to the first utterance is available;
determining that the target device is the output device such that the visual information response is to be displayed by a display screen of the output device;
determining that a first audio response to the first utterance is to be sent to the voice activated electronic device;
determining that a second audio response to the first utterance is to be sent to the output device;
determining that a video response to the first utterance is also to be sent to the output device;
generating first response text data responsive to the first utterance;
generating first audio data representing the first response text data by executing text-to-speech functionality on the first response text data;
sending the first audio data to the voice activated electronic device, such that the first audio response is played by a first speaker of the voice activated electronic device;
generating second response text data responsive to the first utterance, including receiving at least a portion of the second response text data from an application;
generating second audio data representing the second response text data by executing text-to-speech functionality on the second response text data;
generating video data responsive to the first utterance;
sending the second audio data to the output device, such that the second audio response is played by a second speaker of the output device; and
sending the video data to the output device, such that the video response is played by the display screen of the output device.
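The routing order recited in the claim can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Device` class, the stubbed `speech_to_text` and `text_to_speech` helpers, the response wording, and the assumption that the voice activated device is listed first for its account. Intent determination is omitted; the sketch assumes the utterance already asks for information to be shown on a target device.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str
    has_display: bool = False
    inbox: list = field(default_factory=list)  # payloads "sent" to this device


def speech_to_text(audio: bytes) -> str:
    # Stand-in for real speech-to-text: the "audio" here is just UTF-8 text.
    return audio.decode("utf-8")


def text_to_speech(text: str) -> bytes:
    # Stand-in for real text-to-speech: tag the text so it reads as audio data.
    return b"AUDIO:" + text.encode("utf-8")


def handle_request(audio, customer_id, accounts, devices_by_account):
    """Route responses for one utterance, loosely following the claimed steps."""
    account = accounts[customer_id]          # customer identifier -> user account
    text = speech_to_text(audio)             # first text data
    # Assumption: the voice activated device is the first one listed.
    voice_device, *others = devices_by_account[account]
    # Look for an associated device that can present visual data.
    output_device = next((d for d in others if d.has_display), None)

    if output_device is None:
        # No display available: answer by voice only.
        voice_device.inbox.append(("audio", text_to_speech(f"Result for: {text}")))
        return

    # First audio response goes to the voice activated device.
    voice_device.inbox.append(("audio", text_to_speech("Showing that on your screen.")))
    # Second audio response and the video response go to the output device.
    output_device.inbox.append(("audio", text_to_speech(f"Here is {text}")))
    output_device.inbox.append(("video", f"VIDEO:{text}".encode("utf-8")))
```

In this sketch the voice activated device receives only a short spoken acknowledgement, while the display-capable device receives both the second audio response and the video data.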
1 Assignment
0 Petitions
Abstract
Devices and methods for routing content are provided herein. In some embodiments, a method for routing content includes receiving audio data representing a command from a first electronic device, determining content that is associated with the command, sending responsive audio data to the first electronic device, and sending instructions to a second electronic device to output the content associated with the command. In some embodiments, a method for routing content includes determining a state of the second electronic device and sending instructions to output the content to a selected one of the first and second electronic devices based on the state of the second electronic device.
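The state-based selection described in the second embodiment might look like the following sketch. The abstract does not say which states exist or how they map to a choice, so the `DeviceState` values and the rule "use the second device only when it is available" are assumptions made purely for illustration.

```python
from enum import Enum


class DeviceState(Enum):
    # Hypothetical states; the abstract does not enumerate them.
    AVAILABLE = "available"
    BUSY = "busy"
    OFFLINE = "offline"


def select_target(first_device: str, second_device: str,
                  second_state: DeviceState) -> str:
    """Pick which device should output the content, based on the second device's state."""
    if second_state is DeviceState.AVAILABLE:
        return second_device
    # Assumed fallback: any other state routes the content to the first device.
    return first_device
```

For example, `select_target("speaker", "tv", DeviceState.OFFLINE)` returns `"speaker"` under this assumed rule.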
47 Citations
22 Claims
1. A method, comprising:
with a backend system:
receiving first request audio data representing a first utterance, the first request audio data received from a voice activated electronic device,
receiving a customer identifier associated with the voice activated electronic device,
determining a user account associated with the customer identifier,
generating first text data representing the first request audio data by executing speech-to-text functionality on the first request audio data, and
determining, using the first text data, that a first intent of the first utterance is for information to be output by a target device;
determining that an output device that is capable of presenting visual data is also associated with the user account;
determining that a visual information response to the first utterance is available;
determining that the target device is the output device such that the visual information response is to be displayed by a display screen of the output device;
determining that a first audio response to the first utterance is to be sent to the voice activated electronic device;
determining that a second audio response to the first utterance is to be sent to the output device;
determining that a video response to the first utterance is also to be sent to the output device;
generating first response text data responsive to the first utterance;
generating first audio data representing the first response text data by executing text-to-speech functionality on the first response text data;
sending the first audio data to the voice activated electronic device, such that the first audio response is played by a first speaker of the voice activated electronic device;
generating second response text data responsive to the first utterance, including receiving at least a portion of the second response text data from an application;
generating second audio data representing the second response text data by executing text-to-speech functionality on the second response text data;
generating video data responsive to the first utterance;
sending the second audio data to the output device, such that the second audio response is played by a second speaker of the output device; and
sending the video data to the output device, such that the video response is played by the display screen of the output device.
Dependent claims: 2, 3
4. A method performed by at least one backend system, comprising:
receiving, from a first electronic device, first audio data representing a first utterance;
determining that a first user account is associated with the first electronic device;
generating first text data representing the first audio data;
determining, using the first text data, a first intent of the first utterance;
determining that a second electronic device is also associated with the user account;
determining that a first response to the first utterance is capable of being sent to the second electronic device;
generating second text data representing a second response to the first utterance;
generating second audio data representing the second text data;
sending the second audio data to the first electronic device, such that the second response is output by a speaker associated with the first electronic device;
generating first image data representing the first response; and
sending the first image data to the second electronic device such that the first response is output on a display screen associated with the second electronic device.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 22
13. At least one backend system, comprising:
at least one processor; and
at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the backend system to:
receive first audio data representing a first utterance from a first electronic device,
determine that the first electronic device is associated with a first user account,
generate first text data representing the first audio data,
determine, using the first text data, a first intent of the first utterance,
determine that a second electronic device is associated with the user account,
determine that a first response to the first utterance is capable of being sent to the second electronic device,
generate second text data representing a second response to the first utterance,
generate second audio data representing the second text data,
send the second audio data to the first electronic device, such that the second response is output by a speaker associated with the first electronic device,
generate first image data representing the first response, and
send the first image data to the second electronic device such that the first response is output on a display screen associated with the second electronic device.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
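Claims 4 and 13 both turn on two lookups: finding the user account tied to the first device, and finding a second device on that account that can receive the visual response. The sketch below shows one way those lookups could be arranged. The `REGISTRY` layout, the capability names, and both function names are hypothetical; the claims do not specify how the association or the capability check is stored or performed.

```python
from typing import Optional

# Hypothetical registry: user account -> {device id: set of capabilities}.
REGISTRY = {
    "acct-1": {"speaker-1": {"audio"}, "tv-1": {"audio", "display"}},
    "acct-2": {"speaker-2": {"audio"}},
}


def account_for(device_id: str) -> Optional[str]:
    """Return the user account associated with a device, if any."""
    for account, devices in REGISTRY.items():
        if device_id in devices:
            return account
    return None


def find_display_device(first_device_id: str) -> Optional[str]:
    """Return another device on the same account that has a display, if any."""
    account = account_for(first_device_id)
    if account is None:
        return None
    for device_id, capabilities in REGISTRY[account].items():
        if device_id != first_device_id and "display" in capabilities:
            return device_id
    return None
```

When `find_display_device` returns a device id, image data would go to that device and audio to the first device; when it returns `None`, a backend would presumably fall back to an audio-only response, though the claims above do not recite that branch.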
Specification