Data ingestion pipeline
First Claim
1. A computer-implemented method comprising:
during a first period of time:
receiving, from a first remote device, first content data associated with a topic;
receiving, from a second remote device, second content data associated with the topic;
based on the first content data and the second content data both being associated with the topic, grouping the first content data and the second content data to generate first grouped data;
determining the first remote device and the second remote device correspond to a number of remote devices satisfying a threshold number of remote devices;
storing, based on the number satisfying the threshold number, the first grouped data as first stored data;
during a second period of time after the first period of time:
receiving, from a first device, input audio data corresponding to an utterance;
performing speech processing on the input audio data to determine a command corresponding to the topic;
determining, in a profile associated with the first device, a preferred content source associated with the topic;
determining the first remote device corresponds to the preferred content source;
performing text-to-speech (TTS) processing on the first content data to generate output audio data; and
causing the first device to emit audio corresponding to the output audio data.
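The first-period steps of the claim (grouping content by topic and storing a group only when enough distinct sources contributed to it) can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `ContentItem`, `group_by_topic`, and `store_trending` are hypothetical, and "satisfying a threshold" is assumed here to mean "greater than or equal to", which the claim does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    source_id: str  # hypothetical identifier for the remote device (content source)
    topic: str      # topic the content is associated with
    text: str       # the content data itself


def group_by_topic(items):
    """Group received content items by their associated topic."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item.topic].append(item)
    return dict(grouped)


def store_trending(grouped, threshold):
    """Keep only groups whose count of distinct sources meets the threshold.

    Assumption: "satisfying" the threshold is read as >= threshold.
    """
    stored = {}
    for topic, group in grouped.items():
        distinct_sources = {item.source_id for item in group}
        if len(distinct_sources) >= threshold:
            stored[topic] = group
    return stored


items = [
    ContentItem("feed-a", "election", "Polls close at 8 pm."),
    ContentItem("feed-b", "election", "Turnout is high."),
    ContentItem("feed-a", "weather", "Rain expected."),
]
stored = store_trending(group_by_topic(items), threshold=2)
```

With a threshold of two sources, only the "election" group is stored, because "weather" was reported by a single source.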
Abstract
Techniques for expanding system capabilities to execute user commands relating to trending topics (e.g., real-time news questions, trending questions, sports questions, game questions, political questions, etc.) are described. The system gathers data from a variety of sources (e.g., news feeds, social media feeds, RSS feeds, news websites, etc.). The system segments the gathered data by, for example, topic and/or entity. The system may store data corresponding to a topic or entity in a dedicated trending storage only if the system receives data corresponding to the topic or entity from a number of different sources satisfying a threshold number of sources. Data in the dedicated trending storage may be maintained using decay models or algorithms. For example, the more often the system receives data corresponding to a topic or entity from one or more sources, the longer the data is maintained in the storage, and vice versa.
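The abstract does not say which decay model is used. As one possible illustration, the sketch below assumes an exponential decay with a fixed half-life: each new receipt of data for a topic raises its score, the score decays over time, and entries whose score falls below a floor are evicted. The class `DecayingEntry`, the function `evict_stale`, and the parameters `half_life_s` and `min_score` are all hypothetical.

```python
class DecayingEntry:
    """Retention score for one topic or entity in the trending storage.

    Assumption: exponential decay with a fixed half-life; the source only
    says "decay models or algorithms" without naming one.
    """

    def __init__(self, now, half_life_s=3600.0):
        self.score = 1.0
        self.last_update = now
        self.half_life_s = half_life_s

    def current_score(self, now):
        """Score after decaying since the last update."""
        elapsed = now - self.last_update
        return self.score * 0.5 ** (elapsed / self.half_life_s)

    def touch(self, now):
        """Record a new receipt of data: decay first, then add one unit."""
        self.score = self.current_score(now) + 1.0
        self.last_update = now


def evict_stale(entries, now, min_score=0.25):
    """Return only the entries whose decayed score is still at or above the floor."""
    return {t: e for t, e in entries.items() if e.current_score(now) >= min_score}


entries = {
    "election": DecayingEntry(now=0.0),
    "weather": DecayingEntry(now=0.0),
}
# "election" keeps arriving from sources; "weather" is never seen again.
entries["election"].touch(1800.0)
entries["election"].touch(3600.0)
kept = evict_stale(entries, now=3 * 3600.0)
```

After three hours, the repeatedly refreshed "election" entry is still retained while the "weather" entry, seen only once, has decayed below the floor and is dropped.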
20 Claims
1. A computer-implemented method comprising:
during a first period of time:
receiving, from a first remote device, first content data associated with a topic;
receiving, from a second remote device, second content data associated with the topic;
based on the first content data and the second content data both being associated with the topic, grouping the first content data and the second content data to generate first grouped data;
determining the first remote device and the second remote device correspond to a number of remote devices satisfying a threshold number of remote devices;
storing, based on the number satisfying the threshold number, the first grouped data as first stored data;
during a second period of time after the first period of time:
receiving, from a first device, input audio data corresponding to an utterance;
performing speech processing on the input audio data to determine a command corresponding to the topic;
determining, in a profile associated with the first device, a preferred content source associated with the topic;
determining the first remote device corresponds to the preferred content source;
performing text-to-speech (TTS) processing on the first content data to generate output audio data; and
causing the first device to emit audio corresponding to the output audio data.
Dependent claims: 2, 3, 4
5. A system comprising:
at least one processor; and
at least one memory including instructions that, when executed by the at least one processor, cause the system to:
receive input data;
perform speech processing on the input data to determine the input data corresponds to a topic;
determine, from profile data associated with a device, a preferred content source associated with the topic;
determine stored data corresponding to the topic, the stored data being received from a number of content sources satisfying a threshold number of content sources;
determine at least a portion of the stored data received from the preferred content source; and
cause the device to output content corresponding to the at least a portion.
Dependent claims: 6, 7, 8, 9, 10, 11, 12
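The query-time selection recited in this claim (look up the stored data for a topic, then keep the portion that came from the profile's preferred content source) might look like the following Python sketch. Speech processing and output are omitted. The function `select_preferred_content` and the dictionary layouts for `stored` and `profile` are assumptions made for illustration and are not taken from the patent.

```python
def select_preferred_content(stored, profile, topic):
    """Return stored texts for `topic` that came from the profile's preferred source.

    `stored` maps topic -> list of {"source_id", "text"} records (hypothetical layout).
    `profile` maps "preferred_sources" -> {topic: source_id} (hypothetical layout).
    """
    preferred = profile.get("preferred_sources", {}).get(topic)
    records = stored.get(topic, [])
    return [r["text"] for r in records if r["source_id"] == preferred]


# Stored data is assumed to have already passed the source-count threshold.
stored = {
    "election": [
        {"source_id": "feed-a", "text": "Polls close at 8 pm."},
        {"source_id": "feed-b", "text": "Turnout is high."},
    ]
}
profile = {"preferred_sources": {"election": "feed-b"}}
portion = select_preferred_content(stored, profile, "election")
```

Here `portion` holds only the content received from "feed-b", the source the profile prefers for the "election" topic. A topic with no stored data yields an empty list.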
13. A computer-implemented method comprising:
receiving input data;
performing speech processing on the input data to determine the input data corresponds to a topic;
determining, from profile data associated with a device, a preferred content source associated with the topic;
determining stored data corresponding to the topic, the stored data being received from a number of content sources satisfying a threshold number of content sources;
determining at least a portion of the stored data received from the preferred content source; and
causing the device to output content corresponding to the at least a portion.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification