Selecting queries for execution on a stream of real-time data
Abstract
A computer-implemented method for executing a query on data items located at different places in a stream of near real-time data to provide near real-time intermediate results for the query, as the query is being executed, the method including: from time to time, executing, by one or more computer systems, the query on two or more of the data items located at different places in the stream, with the two or more data items being accessed in near real-time with respect to each of the two or more data items; generating information indicative of results of executing the query; as the query continues being executed, generating intermediate results of query execution by aggregating the results with prior results of executing the query on data items that previously appeared in the stream of near real-time data; and transmitting to one or more client devices the intermediate results of query execution, prior to completion of execution of the query.
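The aggregation step in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the patented implementation: the names `run_query_incrementally`, `predicate`, and `key`, the use of small in-memory batches to stand in for the stream, and the choice of a per-key count as the aggregate are all hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator


def run_query_incrementally(
    batches: Iterable[list[dict]],
    predicate: Callable[[dict], bool],
    key: str,
) -> Iterator[dict]:
    """Yield one intermediate result after each execution of the query.

    Each batch stands in for the data items that arrived since the
    previous execution (an assumption made for illustration).
    """
    running = Counter()  # prior results carried across executions
    for batch in batches:
        # Execute the query on the newly arrived data items.
        matches = [item for item in batch if predicate(item)]
        # Aggregate the new results with the prior results.
        running.update(item[key] for item in matches)
        # Yielding stands in for transmitting to a client device.
        yield dict(running)


# Hypothetical stream: two batches of events.
batches = [
    [{"type": "purchase", "user": "a"}, {"type": "view", "user": "b"}],
    [{"type": "purchase", "user": "a"}, {"type": "purchase", "user": "b"}],
]
intermediate = list(
    run_query_incrementally(batches, lambda d: d["type"] == "purchase", "user")
)
```

Each yielded dictionary is a snapshot available before the stream ends, which is the sense in which the results are intermediate.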
22 Claims
1. A computer-implemented method for executing a dataflow graph that represents a query on data items in a stream of near real-time data to provide, as the dataflow graph is being executed, intermediate results for the query, the method including:
receiving a stream of near real-time data having data items located in different places in the stream;
between a first time and a second time, intermittently executing the dataflow graph that represents the query multiple times, with the dataflow graph being executed by one or more computer systems in near real-time with respect to receipt of the stream of near real-time data and being executed upon the stream of near real-time data for two or more of the data items, with the dataflow graph including computer code to implement the query, and with the dataflow graph receiving as input query specifications for the query;
generating, during execution of the dataflow graph, one or more query results that satisfy the query;
generating intermediate results from the one or more query results, as the dataflow graph intermittently executes between the first time and the second time, by aggregating the one or more query results with one or more prior query results of one or more prior executions of the dataflow graph on the stream of near real-time data for the two or more of the data items that previously appeared in the stream of near real-time data; and
transmitting to one or more client devices the intermediate results during intermittent execution of the dataflow graph, prior to completion of execution of the dataflow graph.
Dependent claims: 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 20, 21
2. A system for executing a dataflow graph that represents a query on data items in a stream of near real-time data to provide, as the dataflow graph is being executed, intermediate results for the query, the system including:
one or more processing devices; and
one or more machine-readable hardware storage devices storing instructions that are executable by the one or more processing devices to perform operations including:
receiving a stream of near real-time data having data items located in different places in the stream;
between a first time and a second time, intermittently executing the dataflow graph that represents the query multiple times, with the dataflow graph being executed by one or more computer systems in near real-time with respect to receipt of the stream of near real-time data and being executed upon the stream of near real-time data for two or more of the data items, with the dataflow graph including computer code to implement the query, and with the dataflow graph receiving as input query specifications for the query;
generating, during execution of the dataflow graph, one or more query results that satisfy the query;
generating intermediate results from the one or more query results, as the dataflow graph intermittently executes between the first time and the second time, by aggregating the one or more results with one or more prior query results of one or more prior executions of the query dataflow graph on the stream of near real-time data for the two or more of the data items that previously appeared in the stream of near real-time data; and
transmitting to one or more client devices the intermediate results during intermittent execution of the dataflow graph, prior to completion of execution of the dataflow graph.
Dependent claims: 15, 16, 17, 18, 19, 22
3. One or more machine-readable hardware storages for executing a dataflow graph that represents a query on data items in a stream of near real-time data to provide, as the dataflow graph is being executed, intermediate results for the query, the one or more machine-readable hardware storages storing instructions that are executable by one or more processing devices to perform operations including:
receiving a stream of near real-time data having data items located in different places in the stream;
between a first time and a second time, intermittently executing the dataflow graph that represents the query multiple times, with the dataflow graph being executed by one or more computer systems in near real-time with respect to receipt of the stream of near real-time data and being executed upon the stream of near real-time data for two or more of the data items, with the dataflow graph including computer code to implement the query, and with the dataflow graph receiving as input query specifications for the query;
generating, during execution of the dataflow graph, one or more query results that satisfy the query;
generating intermediate results from the one or more query results, as the dataflow graph intermittently executes between the first time and the second time, by aggregating the one or more query results with one or more prior query results of one or more prior executions of the dataflow graph on the stream of near real-time data for the two or more of the data items that previously appeared in the stream of near real-time data; and
transmitting to one or more client devices the intermediate results during intermittent execution of the dataflow graph, prior to completion of execution of the dataflow graph.
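The independent claims share two features that a short Python sketch may make concrete: the dataflow graph receives query specifications as input, and it executes intermittently only between a first time and a second time. Everything named here (`QuerySpec`, `DataflowGraph`, `run_between`, the `send` callback, the two-component filter-then-sum graph, and the timestamped batches) is a hypothetical illustration and is not taken from the patent's specification.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class QuerySpec:
    """Hypothetical query specification supplied to the graph as input."""
    field_name: str   # field to filter on
    equals: object    # value the field must equal
    sum_field: str    # numeric field to aggregate


@dataclass
class DataflowGraph:
    """Two chained components (filter, then sum) configured by a QuerySpec."""
    spec: QuerySpec
    total: float = 0.0  # aggregate of all prior executions

    def execute(self, items: Iterable[dict]) -> float:
        # Filter component: keep the items that satisfy the query.
        selected = (i for i in items if i.get(self.spec.field_name) == self.spec.equals)
        # Aggregate component: combine this execution's result with prior ones.
        self.total += sum(i[self.spec.sum_field] for i in selected)
        return self.total


def run_between(graph: DataflowGraph, timed_batches, first_time, second_time,
                send: Callable[[int, float], None]) -> float:
    """Execute the graph on each batch whose timestamp lies in the window."""
    for t, items in timed_batches:
        if first_time <= t < second_time:
            send(t, graph.execute(items))  # transmit an intermediate result
    return graph.total


sent = []
graph = DataflowGraph(QuerySpec("region", "EU", "amount"))
final = run_between(
    graph,
    [
        (0, [{"region": "EU", "amount": 10}, {"region": "US", "amount": 7}]),
        (5, [{"region": "EU", "amount": 5}]),
        (12, [{"region": "EU", "amount": 100}]),  # outside the window, skipped
    ],
    first_time=0,
    second_time=10,
    send=lambda t, value: sent.append((t, value)),
)
```

Keeping the specification separate from the graph means the same graph code can serve different queries by swapping the `QuerySpec`, which is one plausible reading of "receiving as input query specifications".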
Specification