Repartitioning parallel SVM computations using dynamic timeout
First Claim
1. A method for reducing execution time of a parallel support vector machine (SVM) application, comprising:
partitioning an input data set into chunks of data;
distributing the partitioned chunks of data across a plurality of available computing nodes;
executing the parallel SVM application on the chunks of data in parallel across the plurality of available computing nodes;
computing a mean of completion times for a portion of the plurality of available computing nodes that have completed processing their respective chunks of data;
setting a first timeout period equal to a constant factor times the mean of the completion times minus a current elapsed time;
determining if the first timeout period has been exceeded before all of the plurality of available computing nodes have finished processing their respective chunks of data; and
if so, repartitioning the input data set into chunks of data that are different from the partitioned chunks of data;
redistributing the repartitioned chunks of data across some or all of the plurality of available computing nodes; and
executing the parallel SVM application on the repartitioned chunks of data in parallel across some or all of the available computing nodes.
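The timeout rule in the claim (a constant factor times the mean completion time of the nodes that have already finished, minus the current elapsed time) can be sketched as a small function. The function name and the factor value 1.5 are illustrative assumptions; the claim requires only some constant factor:

```python
def dynamic_timeout(completion_times, elapsed, factor=1.5):
    """First timeout period = factor * mean(completion times of the
    finished nodes) - current elapsed time."""
    mean_completion = sum(completion_times) / len(completion_times)
    return factor * mean_completion - elapsed
```

For example, if three nodes finished in 10, 12, and 14 seconds and 5 seconds have elapsed, the remaining timeout would be 1.5 × 12 − 5 = 13 seconds.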
Abstract
A system that reduces execution time of a parallel SVM application. During operation, the system partitions an input data set into chunks of data. Next, the system distributes the partitioned chunks of data across a plurality of available computing nodes and executes the parallel SVM application on the chunks of data in parallel across the plurality of available computing nodes. The system then determines if a first timeout period has been exceeded before all of the plurality of available computing nodes have finished processing their respective chunks of data. If so, the system (1) repartitions the input data set into different chunks of data; (2) redistributes the repartitioned chunks of data across some or all of the plurality of available computing nodes; and (3) executes the parallel SVM application on the repartitioned chunks of data in parallel across some or all of the available computing nodes.
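The partition-and-repartition step described above can be illustrated with a minimal sketch. The equal-size chunking scheme and the node counts are assumptions for illustration only; the patent does not mandate a particular split:

```python
import math

def partition(data, n_chunks):
    """Split the input data set into n_chunks roughly equal chunks
    (one simple scheme among many possible)."""
    size = math.ceil(len(data) / n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

data = list(range(10))
first = partition(data, 4)   # initial distribution across 4 available nodes
# If the timeout fires before every node finishes, repartition into
# different chunks, e.g. for the 3 nodes still responsive.
second = partition(data, 3)
```

Redistribution then assigns the new chunks across some or all of the nodes and re-executes the parallel SVM application on them.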
20 Claims
1. A method for reducing execution time of a parallel support vector machine (SVM) application, comprising:
partitioning an input data set into chunks of data;
distributing the partitioned chunks of data across a plurality of available computing nodes;
executing the parallel SVM application on the chunks of data in parallel across the plurality of available computing nodes;
computing a mean of completion times for a portion of the plurality of available computing nodes that have completed processing their respective chunks of data;
setting a first timeout period equal to a constant factor times the mean of the completion times minus a current elapsed time;
determining if the first timeout period has been exceeded before all of the plurality of available computing nodes have finished processing their respective chunks of data; and
if so, repartitioning the input data set into chunks of data that are different from the partitioned chunks of data;
redistributing the repartitioned chunks of data across some or all of the plurality of available computing nodes; and
executing the parallel SVM application on the repartitioned chunks of data in parallel across some or all of the available computing nodes. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. A computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for reducing execution time of a parallel support vector machine (SVM) application, the method comprising:
partitioning an input data set into chunks of data;
distributing the partitioned chunks of data across a plurality of available computing nodes;
executing the parallel SVM application on the chunks of data in parallel across the plurality of available computing nodes;
computing a mean of completion times for a portion of the plurality of available computing nodes that have completed processing their respective chunks of data;
setting a first timeout period equal to a constant factor times the mean of the completion times minus a current elapsed time;
determining if the first timeout period has been exceeded before all of the plurality of available computing nodes have finished processing their respective chunks of data; and
if so, repartitioning the input data set into chunks of data that are different from the partitioned chunks of data;
redistributing the repartitioned chunks of data across some or all of the plurality of available computing nodes; and
executing the parallel SVM application on the repartitioned chunks of data in parallel across some or all of the available computing nodes. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
17. An apparatus that reduces execution time of a parallel support vector machine (SVM) application, comprising:
a processor; and
a memory coupled to the processor;
wherein the processor is configured to:
partition an input data set into chunks of data;
distribute the partitioned chunks of data across a plurality of available computing nodes;
execute the parallel SVM application on the chunks of data in parallel across the plurality of available computing nodes;
compute a mean of completion times for a portion of the plurality of available computing nodes that have completed processing their respective chunks of data;
set a first timeout period equal to a constant factor times the mean of the completion times minus a current elapsed time;
determine if the first timeout period has been exceeded before all of the plurality of available computing nodes have finished processing their respective chunks of data; and
if so, repartition the input data set into chunks of data that are different from the partitioned chunks of data;
redistribute the repartitioned chunks of data across some or all of the plurality of available computing nodes; and
execute the parallel SVM application on the repartitioned chunks of data in parallel across some or all of the available computing nodes. - View Dependent Claims (18, 19, 20)
Specification