Distributed stream processing in the cloud
Abstract
A low-latency cloud-scale computation environment includes a query language, optimization, scheduling, fault tolerance and fault recovery. An event model can be used to extend a declarative query language so that temporal analysis of events in an event stream can be performed. Extractors and outputters can be used to define and implement functions that extend the capabilities of the event-based query language. A script written in the extended query language can be translated into an optimal parallel continuous execution plan. Execution of the plan can be orchestrated by a streaming job manager which schedules vertices on available computing machines. The streaming job manager can monitor overall job execution. Fault tolerance can be provided by tracking execution progress and data dependencies in each vertex. In the event of a failure, another instance of the failed vertex can be scheduled. An optimal recovery point can be determined based on checkpoints and data dependencies.
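The abstract's fault-tolerance mechanism rests on labeling every event with a sequence number so that the dependencies between a vertex's input, output and state can be described and tracked. A minimal sketch of that idea follows; the names (`Event`, `Vertex`, `process`) are illustrative assumptions, not identifiers from the patent:

```python
# Hypothetical sketch of sequence-number-based dependency tracking.
# Each event carries a sequence number; a vertex records, for every
# output sequence number, which input sequence numbers and which state
# snapshot it depended on, so a job manager can reason about recovery.

from dataclasses import dataclass, field

@dataclass
class Event:
    seq: int          # sequence number labeling the event
    payload: object

@dataclass
class Vertex:
    state_seq: int = 0                        # sequence number of current state
    deps: dict = field(default_factory=dict)  # output seq -> (input seqs, state seq)

    def process(self, inputs, out_seq):
        """Consume input events, emit one output, and record its dependencies."""
        in_seqs = [e.seq for e in inputs]
        # this output depends on these inputs and on the state as of state_seq
        self.deps[out_seq] = (in_seqs, self.state_seq)
        self.state_seq = out_seq              # state now reflects work up to out_seq
        return Event(out_seq, [e.payload for e in inputs])

v = Vertex()
v.process([Event(1, "a"), Event(2, "b")], out_seq=3)
print(v.deps[3])   # ([1, 2], 0)
```

With this bookkeeping, a manager observing that output 3 depends on inputs 1–2 and state snapshot 0 can decide exactly which events must be replayed if the vertex fails.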
20 Claims
1. A system comprising:

at least one processor;

a memory connected to the at least one processor; and

a cloud-scale query execution platform that supports distributed stream processing comprising a streaming job manager that monitors execution information about streaming jobs executed by a plurality of vertices executing on a plurality of computing devices, the streaming job manager receiving execution progress information and data dependencies for the plurality of vertices, each vertex of the plurality of vertices configured to process events associated with one or more streaming jobs, and each event being labeled with a sequence number used by at least the streaming job manager to describe and track dependencies between input, output and state of a vertex.

(Dependent claims: 2, 3, 4, 5, 6, 7)
8. A method comprising:

receiving, by a processor of a computing device, execution progress information associated with a plurality of streaming jobs executed by a plurality of vertices executing on a plurality of computing devices, each vertex of the plurality of vertices configured to process events associated with one or more of the plurality of streaming jobs, and each event being assigned a sequence number that describes and tracks dependencies between input, output and state of at least one vertex of the plurality of vertices;

in response to detecting a vertex failure among the plurality of vertices, scheduling a new vertex; and

determining a closest checkpoint from which to resume processing on the new vertex from the sequence numbers assigned to the events in the streaming jobs.

(Dependent claims: 9, 10, 11, 12, 13)
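The recovery step in claim 8 reduces to a simple selection: among the checkpoints a vertex has taken, pick the one closest to (at or before) the sequence number where the failed vertex stopped, and resume the new vertex from there. A minimal sketch, with the function name and checkpoint representation assumed for illustration:

```python
# Hypothetical sketch: checkpoints are keyed by the sequence number of
# the last event they cover. On failure, the closest usable checkpoint
# is the largest one not past the point where processing stopped.

def closest_checkpoint(checkpoint_seqs, failed_at_seq):
    """Return the largest checkpoint sequence number <= failed_at_seq,
    or None if no usable checkpoint exists."""
    usable = [s for s in checkpoint_seqs if s <= failed_at_seq]
    return max(usable) if usable else None

# A vertex checkpointed at sequence numbers 10, 20 and 30 fails at 25:
# the new vertex resumes from checkpoint 20 and replays events 21..25.
print(closest_checkpoint([10, 20, 30], 25))   # 20
```

Because every event is labeled with a sequence number, the replay window (here, events 21 through 25) is fully determined by the data dependencies recorded before the failure.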
14. A computer-readable storage medium comprising computer-readable instructions which when executed cause at least one processor of a computing device to:

receive data dependency information associated with a plurality of streaming jobs executed by a plurality of vertices executing on a plurality of computing devices, each vertex of the plurality of vertices configured to process events associated with one or more of the plurality of streaming jobs, and each event being assigned a sequence number that describes and tracks dependencies between input, output and state of at least one vertex of the plurality of vertices;

in response to detecting a vertex failure among the plurality of vertices, perform job recovery by scheduling a new vertex; and

determine a closest checkpoint from which to resume processing on the new vertex using the sequence numbers assigned to the events in one or more of the plurality of streaming jobs.

(Dependent claims: 15, 16, 17, 18, 19, 20)
Specification