METHOD AND APPARATUS FOR TIME MANAGEMENT AND SCHEDULING FOR SYNCHRONOUS PROCESSING ON A CLUSTER OF PROCESSING NODES
Abstract
Certain aspects of the present disclosure provide techniques for time management and scheduling of synchronous neural processing on a cluster of processing nodes. A slip (or offset) may be introduced between processing nodes of a distributed processing system formed by a plurality of interconnected processing nodes, to enable faster nodes to continue processing without waiting for slower nodes to catch up. In certain aspects, a processing node, after completing each processing step, may check for received completion packets and apply a defined constraint to determine whether it may start processing a subsequent step or not.
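The per-step check described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the abstract only says that "a defined constraint" is applied, so the specific rule used here (a node may run at most `max_slip` completed steps ahead of its slowest peer) is an assumption, as are the names `NodeScheduler`, `on_completion_packet`, `finish_step`, and `may_start_next`.

```python
from dataclasses import dataclass, field


@dataclass
class NodeScheduler:
    """Per-node gate that decides whether the next time step may begin."""

    node_id: int
    peer_ids: list[int]
    max_slip: int = 2        # assumed constraint: allowed lead over the slowest peer
    current_step: int = -1   # last step this node has completed (-1 = none yet)
    peer_progress: dict[int, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Until a completion packet arrives, a peer is assumed to have completed nothing.
        for peer in self.peer_ids:
            self.peer_progress.setdefault(peer, -1)

    def on_completion_packet(self, peer_id: int, step: int) -> None:
        # Packets may arrive out of order, so keep only the highest step seen.
        self.peer_progress[peer_id] = max(self.peer_progress.get(peer_id, -1), step)

    def finish_step(self) -> None:
        # Called when this node completes the job for its current time interval.
        self.current_step += 1

    def may_start_next(self) -> bool:
        # Lead = completed steps of this node minus those of the slowest peer.
        slowest = min(self.peer_progress.values(), default=self.current_step)
        return self.current_step - slowest <= self.max_slip


if __name__ == "__main__":
    node = NodeScheduler(node_id=0, peer_ids=[1, 2], max_slip=1)
    node.finish_step()                 # completes step 0; peers have reported nothing
    print(node.may_start_next())       # lead of 1 is within the slip budget
    node.finish_step()                 # completes step 1
    print(node.may_start_next())       # lead of 2 exceeds the slip budget
    node.on_completion_packet(1, 0)
    node.on_completion_packet(2, 0)
    print(node.may_start_next())       # both peers caught up to step 0
```

Under this assumed rule, setting `max_slip` to 0 reduces to lockstep operation, while larger values let faster nodes run ahead without waiting for slower ones.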
26 Claims
1. A method for processing by a first node in a distributed computing system formed by a plurality of interconnected nodes, comprising:
monitoring completion of jobs by other nodes in the distributed computing system; and
determining, after completing processing of a job in a current time interval, whether or not to start processing a job in a subsequent time interval based on at least one constraint and the monitored completion of jobs by other nodes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

13. An apparatus for processing by a first node in a distributed computing system formed by a plurality of interconnected nodes, comprising:
means for monitoring completion of jobs by other nodes in the distributed computing system; and
means for determining, after completing processing of a job in a current time interval, whether or not to start processing a job in a subsequent time interval based on at least one constraint and the monitored completion of jobs by other nodes.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24

25. An apparatus for processing by a first node in a distributed computing system formed by a plurality of interconnected nodes, comprising:
at least one processor configured to monitor completion of jobs by other nodes in the distributed computing system and determine, after completing processing of a job in a current time interval, whether or not to start processing a job in a subsequent time interval based on at least one constraint and the monitored completion of jobs by other nodes; and
a memory coupled with the at least one processor.

26. A computer program product for processing by a first node in a distributed computing system formed by a plurality of interconnected nodes, comprising computer readable medium having instructions stored thereon, the instructions executable by one or more processors for:
monitoring completion of jobs by other nodes in the distributed computing system; and
determining, after completing processing of a job in a current time interval, whether or not to start processing a job in a subsequent time interval based on at least one constraint and the monitored completion of jobs by other nodes.

Specification