Real-time anomaly mitigation in a cloud-based video streaming system
First Claim
1. A method executed by an electronic device in a video streaming platform including at least a central telemetry system, the method comprising:
receiving at the central telemetry system performance data from at least one local telemetry agent of a worker, where the worker is from a set of workers of the video streaming platform, wherein each worker in the set of workers executes tasks in a task graph of a media workflow created for a video source, where each worker in the set of workers is a processing unit in the video streaming platform, wherein the performance data from each worker is generated during execution of the worker, and wherein the performance data is indicative of operational status of the set of workers;
processing the performance data at the central telemetry system, wherein the processing includes generating task-specific monitoring data based on the performance data;
identifying at the central telemetry system whether the performance data or the task-specific monitoring data contains an anomaly, where the anomaly includes a failure of a component of the video streaming platform which the media workflow utilizes; and
upon the anomaly being identified, mitigating the anomaly by interacting with the set of workers by moving tasks from a worker associated with the anomaly to other workers of the streaming platform and preventing all tasks from running on the worker.
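As a rough illustration of the final mitigation step of claim 1, the sketch below moves tasks off a worker associated with an anomaly and prevents any task from running on it afterwards. The `Worker` class, the `accepting_tasks` flag, and the round-robin placement are assumptions made for illustration only; the claim does not specify how the receiving workers are chosen.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Worker:
    # Hypothetical worker model; the field names are not from the patent.
    name: str
    tasks: list = field(default_factory=list)
    accepting_tasks: bool = True


def mitigate(anomalous, workers):
    """Move every task off `anomalous` and stop it from running tasks."""
    # Block the worker first so nothing new is scheduled onto it.
    anomalous.accepting_tasks = False
    targets = [w for w in workers if w is not anomalous and w.accepting_tasks]
    if not targets:
        raise RuntimeError("no healthy worker available to take over tasks")
    placement = {}
    # Round-robin placement is an assumption; any policy could be used here.
    for task, target in zip(list(anomalous.tasks), cycle(targets)):
        target.tasks.append(task)
        placement[task] = target.name
    anomalous.tasks.clear()
    return placement


w1 = Worker("w1", ["ingest", "transcode", "publish"])
w2 = Worker("w2")
w3 = Worker("w3")
print(mitigate(w1, [w1, w2, w3]))
```

Blocking the worker before moving its tasks avoids a window in which a scheduler could place new work on a worker that is being drained.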
Abstract
A method for detecting and mitigating anomalies in video streaming platforms is disclosed. In one embodiment, performance data from a set of workers is received at a central telemetry system (CTS), where the performance data is indicative of operational status of the set of workers. The CTS processes the performance data, including generating task-specific monitoring data based on the performance data, and identifies whether the performance data or the task-specific monitoring data contains any anomaly. Upon an anomaly being identified, the CTS mitigates the anomaly by interacting with the set of workers.
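To make the processing and identification steps described in the abstract more concrete, here is a minimal sketch: per-worker performance samples are grouped into task-specific monitoring data, and a task is flagged when its summary shows a failure or high average CPU load. The sample fields (`worker`, `task`, `cpu`, `failed`), the averaging, and the 0.9 threshold are all assumptions; the source does not define the data format or the detection rule.

```python
from collections import defaultdict
from statistics import mean

CPU_THRESHOLD = 0.9  # assumed value, not taken from the source


def task_monitoring_data(samples):
    """Group raw performance samples by task and summarize each group."""
    by_task = defaultdict(list)
    for s in samples:
        by_task[s["task"]].append(s)
    return {
        task: {
            "workers": sorted({s["worker"] for s in group}),
            "avg_cpu": mean(s["cpu"] for s in group),
            "failed": any(s["failed"] for s in group),
        }
        for task, group in by_task.items()
    }


def find_anomalies(monitoring):
    """Return tasks whose summary shows a failure or high average CPU."""
    return sorted(
        task
        for task, m in monitoring.items()
        if m["failed"] or m["avg_cpu"] > CPU_THRESHOLD
    )


samples = [
    {"worker": "w1", "task": "transcode", "cpu": 0.95, "failed": False},
    {"worker": "w2", "task": "transcode", "cpu": 0.97, "failed": False},
    {"worker": "w3", "task": "publish", "cpu": 0.20, "failed": False},
    {"worker": "w3", "task": "ingest", "cpu": 0.10, "failed": True},
]
print(find_anomalies(task_monitoring_data(samples)))
```

Each summary keeps the list of workers that reported on the task, so a mitigation step could look up which workers are associated with a flagged task.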
18 Claims
1. A method executed by an electronic device in a video streaming platform including at least a central telemetry system, the method comprising:
receiving at the central telemetry system performance data from at least one local telemetry agent of a worker, where the worker is from a set of workers of the video streaming platform, wherein each worker in the set of workers executes tasks in a task graph of a media workflow created for a video source, where each worker in the set of workers is a processing unit in the video streaming platform, wherein the performance data from each worker is generated during execution of the worker, and wherein the performance data is indicative of operational status of the set of workers;
processing the performance data at the central telemetry system, wherein the processing includes generating task-specific monitoring data based on the performance data;
identifying at the central telemetry system whether the performance data or the task-specific monitoring data contains an anomaly, where the anomaly includes a failure of a component of the video streaming platform which the media workflow utilizes; and
upon the anomaly being identified, mitigating the anomaly by interacting with the set of workers by moving tasks from a worker associated with the anomaly to other workers of the streaming platform and preventing all tasks from running on the worker.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. An electronic device to serve as a central telemetry system in a video streaming platform, the electronic device comprising:
a processor and a non-transitory machine-readable storage medium coupled to the processor, the non-transitory machine-readable storage medium containing operations executable by the processor, wherein the electronic device is operative to:
receive at the central telemetry system performance data from at least one local telemetry agent of a worker, where the worker is from a set of workers of the video streaming platform, wherein each worker in the set of workers executes tasks in a task graph of a media workflow created for a video source, where each worker in the set of workers is a processing unit in the video streaming platform, wherein the performance data from each worker is generated during execution of the worker, and wherein the performance data is indicative of operational status of the set of workers;
process the performance data at the central telemetry system, wherein the processing includes generating task-specific monitoring data based on the performance data;
identify at the central telemetry system whether the performance data or the task-specific monitoring data contains an anomaly, where the anomaly includes a failure of a component of the video streaming platform that the media workflow utilizes; and
upon the anomaly being identified, mitigate the anomaly by interacting with the set of workers by moving tasks from a worker associated with the anomaly to other workers of the streaming platform and preventing all tasks from running on the worker.
Dependent claims: 10, 11, 12, 13, 14
15. A non-transitory machine-readable storage medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations in an electronic device serving as a central telemetry system in a video streaming platform, the operations comprising:
receiving at the central telemetry system performance data from at least one local telemetry agent of a worker, where the worker is from a set of workers of the video streaming platform, wherein each worker in the set of workers executes tasks in a task graph of a media workflow created for a video source, where each worker in the set of workers is a processing unit in the video streaming platform, wherein the performance data from each worker is generated during execution of the worker, and wherein the performance data is indicative of operational status of the set of workers;
processing the performance data at the central telemetry system, wherein the processing includes generating task-specific monitoring data based on the performance data;
identifying at the central telemetry system whether the performance data or the task-specific monitoring data contains an anomaly, where the anomaly includes a failure of a component of the video streaming platform which the media workflow utilizes; and
upon the anomaly being identified, mitigating the anomaly by interacting with the set of workers by moving tasks from a worker associated with the anomaly to other workers of the streaming platform and preventing all tasks from running on the worker.
Dependent claims: 16, 17, 18
Specification