Identifying nodes already storing indicated input data to perform distributed execution of an indicated program in a node cluster
Abstract
Techniques are described for managing execution of programs, such as for distributed execution of a program on multiple computing nodes. In some situations, the techniques include selecting a cluster of computing nodes to use for executing a program based at least in part on data to be used during the program execution. For example, the computing node selection for a particular program may be performed so as to attempt to identify and use computing nodes that already locally store some or all of the input data that will be used by those computing nodes as part of the executing of that program on those nodes. Such techniques may provide benefits in a variety of situations, including when the size of the input datasets to be used by a program is large, and the transferring of data to and/or from computing nodes may impose large delays and/or monetary costs.
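The node selection described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the `Node` record, the `select_cluster` function, the block identifiers, and the ranking rule (most locally stored input blocks first) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical record of a computing node and the input blocks it stores locally."""
    name: str
    local_blocks: set[str] = field(default_factory=set)


def select_cluster(nodes: list[Node], input_blocks: set[str], size: int) -> list[Node]:
    """Pick up to `size` nodes, preferring those that hold the most needed blocks."""
    # Keep only nodes that already store at least one needed input block.
    candidates = [n for n in nodes if n.local_blocks & input_blocks]
    # Rank by the number of needed blocks stored locally (most first);
    # ties are broken by name so the result is deterministic.
    candidates.sort(key=lambda n: (-len(n.local_blocks & input_blocks), n.name))
    return candidates[:size]


nodes = [
    Node("node-a", {"b1", "b2"}),
    Node("node-b", {"b3"}),
    Node("node-c", set()),
    Node("node-d", {"b2", "b3", "b4"}),
]
cluster = select_cluster(nodes, {"b1", "b2", "b3"}, size=2)
print([n.name for n in cluster])  # ['node-a', 'node-d']
```

Ranking by overlap is only one possible policy. A service could equally weigh transfer cost or node load, which the abstract leaves open.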
24 Citations
22 Claims
1. A computer-implemented method comprising:
receiving, by one or more computing systems configured to provide a program execution service that executes programs for multiple users by using a plurality of computing nodes provided by the program execution service, configuration information regarding using indicated input data as part of executing an indicated program for a first user of the multiple users in a distributed manner on a computing node cluster;
selecting, by the one or more configured computing systems, multiple computing nodes from the plurality for the computing node cluster based at least in part on the selected multiple computing nodes each being identified as locally storing at least some of the indicated input data prior to the receiving of the configuration information;
determining, by the one or more configured computing systems and based on at least one of the selected multiple computing nodes being currently unavailable to perform the executing of the indicated program due to executing one or more other programs for one or more other users and based on the executing of the indicated program being determined to have a higher priority than the executing of the one or more other programs, to terminate the executing of the one or more other programs on the at least one selected computing node to enable use of the at least one selected computing node in the executing of the indicated program; and
initiating, by the one or more configured computing systems and based at least in part on the terminating of the executing of the one or more other programs on the at least one selected computing node, the executing of the indicated program on the selected multiple computing nodes using the locally stored at least some indicated input data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
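The determining step of claim 1 (terminate other programs on a selected node when the indicated program has higher priority) might be sketched as follows. This is an illustrative assumption, not the claimed implementation: `RunningProgram`, `plan_preemption`, the integer priority scale, and the simplification to one running program per node are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningProgram:
    """Hypothetical record of a program currently occupying a node."""
    program_id: str
    user: str
    priority: int  # assumption: a larger number means higher priority


def plan_preemption(
    selected: dict[str, Optional[RunningProgram]], new_priority: int
) -> tuple[list[str], list[str]]:
    """Return (nodes_to_use, programs_to_terminate) for the new program.

    `selected` maps each data-local node name to the program it is currently
    running, or None if the node is idle.
    """
    usable: list[str] = []
    terminate: list[str] = []
    for node, running in selected.items():
        if running is None:
            usable.append(node)  # idle node: use it directly
        elif new_priority > running.priority:
            terminate.append(running.program_id)  # preempt lower-priority work
            usable.append(node)
        # Otherwise the node keeps its equal-or-higher-priority program.
    return usable, terminate


selected = {
    "node-a": None,
    "node-d": RunningProgram("job-17", "user-2", priority=1),
    "node-e": RunningProgram("job-42", "user-3", priority=9),
}
usable, terminate = plan_preemption(selected, new_priority=5)
print(usable, terminate)  # ['node-a', 'node-d'] ['job-17']
```

The sketch only produces a plan. Actually terminating the listed programs and then initiating the indicated program on the usable nodes would be separate steps of the service.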
12. A non-transitory computer-readable medium having stored contents that configure a computing system to:
receive information that indicates a program to be executed for a first user by a program execution service on multiple computing nodes using indicated input data;
identify, by the configured computing system, one or more computing nodes for executing the indicated program based at least in part on the one or more computing nodes already storing at least some of the indicated input data and being available for performing the executing of the indicated program, the identified one or more computing nodes being selected from a plurality of computing nodes used by the program execution service for executing programs;
initiate, by the configured computing system, the executing of the indicated program on the identified one or more computing nodes so as to use the already stored at least some indicated input data; and
initiate, by the configured computing system, the executing of the indicated program on an additional computing node that is identified as already storing at least some of the indicated input data, the initiating of the executing of the indicated program on the additional computing node including terminating execution of another program on the additional computing node for another user to enable the additional computing node to be available to perform the executing of the indicated program.
Dependent claims: 13, 14, 15, 16, 17
18. A system comprising:
one or more processors of one or more computing systems; and
one or more components of a program execution service that, when executed by at least one of the one or more processors, cause the at least one processor to:
receive a request to execute, for a first user that is one of multiple users for which the program execution service executes programs by using a plurality of computing nodes provided by the program execution service, an indicated program using indicated input data;
select multiple computing nodes from the plurality to use for executing the indicated program based on the multiple computing nodes already storing at least some of the indicated input data before the receiving of the request;
determine that at least one of the selected multiple computing nodes is currently unavailable to perform the executing of the indicated program due to executing one or more other programs for one or more other users, that the executing of the indicated program on the at least one selected computing node is preferred for the program execution service, and to terminate the executing of the one or more other programs on the at least one selected computing node to enable use of the at least one selected computing node in the executing of the indicated program; and
initiate, based at least in part on the terminating of the executing of the one or more other programs on the at least one selected computing node, the executing of the indicated program on the multiple computing nodes to cause use of the already stored at least some indicated input data during the executing of the indicated program.
Dependent claims: 19, 20, 21, 22
Specification