PROVISIONING OF DISTRIBUTED COMPUTING CLUSTERS
First Claim
1. A method for provisioning a cluster for a distributed computing platform, the method comprising:
receiving configuration information for a cluster from a user, wherein the configuration information comprises at least a cluster size, a data set and code for processing the data set from the user;
selecting a plurality of target host computing devices from a plurality of host computing devices based on the configuration information;
instantiating, based on the cluster size, at least one virtual machine (VM) on each of the target host computing devices to serve as a node of the cluster, wherein each instantiated VM is configured to access a virtual disk that is based on a VM template and preconfigured with code for executing functionality of the distributed computing platform;
persistently storing the data set in a distributed file system accessible by at least a subset of the VMs, wherein the distributed file system is accessed by the distributed computing platform during processing of the data set;
providing the code for processing the data set to at least a subset of the VMs; and
initiating execution of the code for processing the data set on the at least subset of VMs to obtain data processing results for the user.
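The steps of claim 1 can be read as a simple pipeline. The following is a minimal sketch of that reading, not the patented implementation. Every name in it (`ClusterConfig`, `VM`, `provision_cluster`, the template name, the round-robin assignment of VMs to hosts, and the in-memory stand-ins for the distributed file system and for job start-up) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterConfig:
    """Configuration information received from the user (hypothetical shape)."""
    cluster_size: int   # number of cluster nodes (VMs) requested
    data_set: bytes     # data set to be processed
    job_code: str       # code for processing the data set


@dataclass
class VM:
    name: str
    host: str
    template: str                   # VM template the virtual disk is based on
    job_code: Optional[str] = None  # filled in when the job code is provided


def provision_cluster(config, hosts, template="platform-template"):
    """Walk through the claimed steps in order, using in-memory stand-ins."""
    if config.cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not hosts:
        raise ValueError("at least one host is required")

    # Select target hosts. Assumption: take the first hosts, never more
    # hosts than requested nodes, so that every target receives a VM.
    targets = hosts[:config.cluster_size]

    # Instantiate VMs based on the cluster size, round-robin over targets,
    # so each target host ends up with at least one VM.
    vms = [
        VM(name=f"node-{i}", host=targets[i % len(targets)], template=template)
        for i in range(config.cluster_size)
    ]

    # Store the data set. A dict stands in for the distributed file system.
    dfs = {"/input/data_set": config.data_set}

    # Provide the processing code to the VMs (here: all of them).
    for vm in vms:
        vm.job_code = config.job_code

    # Initiate execution. This stand-in only records which nodes would start.
    started = [vm.name for vm in vms]
    return vms, dfs, started
```

Calling `provision_cluster(ClusterConfig(4, b"abc", "print('x')"), ["h1", "h2", "h3"])` under these assumptions yields four nodes spread over the three hosts, with the data set stored once and the job code attached to every node.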
Abstract
Embodiments perform automated provisioning of a cluster for a distributed computing platform. Target host computing devices are selected from a plurality of host computing devices based on configuration information, such as a desired cluster size, a data set, code for processing the data set and, optionally, a placement strategy. One or more virtual machines (VMs) are instantiated on each target host computing device. Each VM is configured to access a virtual disk that is preconfigured with code for executing functionality of the distributed computing platform and serves as a node of the cluster. The data set is stored in a distributed file system accessible by at least a subset of the VMs. The code for processing the data set is provided to at least a subset of the VMs, and execution of the code is initiated to obtain processing results.
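The abstract mentions an optional placement strategy without defining one here. As an illustration only, the sketch below assumes two hypothetical strategies, `"pack"` (fill the hosts with the most free capacity first) and `"spread"` (distribute VMs one at a time across hosts). The function name `select_targets` and the `free_slots` capacity map are likewise assumptions, not terms from the source.

```python
def select_targets(free_slots, cluster_size, strategy="spread"):
    """Return {host: number of VMs to instantiate} for the requested size.

    free_slots maps a host name to the number of VMs it can still accept.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if sum(free_slots.values()) < cluster_size:
        raise ValueError("not enough capacity for the requested cluster size")

    # Most free capacity first; ties broken by name so the plan is deterministic.
    ordered = sorted(free_slots, key=lambda h: (-free_slots[h], h))
    plan = {}
    remaining = cluster_size

    if strategy == "pack":
        # Use as few hosts as possible.
        for host in ordered:
            if remaining == 0:
                break
            take = min(free_slots[host], remaining)
            if take:
                plan[host] = take
                remaining -= take
    elif strategy == "spread":
        # Place one VM per host per pass until the cluster size is reached.
        while remaining:
            for host in ordered:
                if remaining == 0:
                    break
                if plan.get(host, 0) < free_slots[host]:
                    plan[host] = plan.get(host, 0) + 1
                    remaining -= 1
    else:
        raise ValueError(f"unknown placement strategy: {strategy}")
    return plan
```

The hosts that appear as keys in the returned plan play the role of the target host computing devices, and the values give the number of VMs to instantiate on each.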
20 Claims
1. A method for provisioning a cluster for a distributed computing platform, the method comprising:
receiving configuration information for a cluster from a user, wherein the configuration information comprises at least a cluster size, a data set and code for processing the data set from the user;
selecting a plurality of target host computing devices from a plurality of host computing devices based on the configuration information;
instantiating, based on the cluster size, at least one virtual machine (VM) on each of the target host computing devices to serve as a node of the cluster, wherein each instantiated VM is configured to access a virtual disk that is based on a VM template and preconfigured with code for executing functionality of the distributed computing platform;
persistently storing the data set in a distributed file system accessible by at least a subset of the VMs, wherein the distributed file system is accessed by the distributed computing platform during processing of the data set;
providing the code for processing the data set to at least a subset of the VMs; and
initiating execution of the code for processing the data set on the at least subset of VMs to obtain data processing results for the user.
Dependent claims: 2, 3, 4, 5, 6, 7
8. One or more computer-readable storage media including computer-executable instructions that, when executed by a processor, cause the processor to provision a distributed computing cluster having a plurality of virtual machines (VMs) by:
selecting a plurality of target host computing devices from a plurality of host computing devices based on configuration information including at least a cluster size, a data set and code for processing the data set;
instantiating, based on the cluster size, at least one virtual machine (VM) on each of the target host computing devices to serve as a node of the cluster, wherein each instantiated VM is configured to access a virtual disk that is based on a VM template and preconfigured with code for executing functionality of a distributed computing platform;
persistently storing the data set in a distributed file system accessible by at least a subset of the VMs, wherein the distributed file system is accessed by the distributed computing platform during processing of the data set;
providing the code for processing the data set to at least a subset of the VMs; and
initiating execution of the code for processing the data set on the at least subset of VMs to obtain data processing results.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system for provisioning a distributed computing cluster, the system comprising:
a plurality of host computing devices; and
a management device coupled in communication with the host computing devices and configured to:
select a plurality of target host computing devices from the plurality of host computing devices based on configuration information provided by a user, the configuration information including at least a cluster size, a data set and code for processing the data set from the user;
instantiate, based on the cluster size, at least one virtual machine (VM) on each of the target host computing devices to serve as a node of the cluster, wherein each instantiated VM is configured to access a virtual disk that is based on a VM template and preconfigured with code for executing functionality of a distributed computing platform;
persistently store the data set in a distributed file system accessible by at least a subset of the VMs, wherein the distributed file system is accessed by the distributed computing platform during processing of the data set;
provide the code for processing the data set to at least a subset of the VMs; and
initiate execution of the code for processing the data set on the at least subset of VMs to obtain data processing results for the user.
Dependent claims: 16, 17, 18, 19, 20
Specification