MULTI-TENANT PRODUCTION AND TEST DEPLOYMENTS OF HADOOP
Abstract
A distributed computing application is described that provides a highly elastic and multi-tenant platform for Hadoop applications and other workloads running in a virtualized environment. Production, test, and development deployments of a Hadoop application may be executed using multiple compute clusters and a shared instance of a distributed filesystem, or in other cases, multiple instances of the distributed filesystem. Data nodes executing as virtual machines (VMs) for test and development deployments can be linked clones of data nodes executing as VMs for a production deployment to reduce duplicated data and provide a shared storage space.
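The storage saving from linked clones can be illustrated with a minimal copy-on-write sketch. It is not the patented implementation: the `VirtualDisk` class, its methods, and the block contents are hypothetical names chosen for illustration, and a real hypervisor tracks disk deltas at a much lower level. The sketch assumes only that a clone reads unmodified blocks from its parent disk and keeps its own writes in a private delta.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualDisk:
    """Hypothetical model of a virtual disk that may defer reads to a parent."""

    parent: Optional[VirtualDisk] = None
    blocks: dict[int, bytes] = field(default_factory=dict)  # this disk's own blocks

    def read(self, block_id: int) -> bytes:
        # Walk up the chain until some disk holds the block.
        disk: Optional[VirtualDisk] = self
        while disk is not None:
            if block_id in disk.blocks:
                return disk.blocks[block_id]
            disk = disk.parent
        raise KeyError(block_id)

    def write(self, block_id: int, data: bytes) -> None:
        # Writes always land in this disk's delta, never in the parent.
        self.blocks[block_id] = data

    def linked_clone(self) -> VirtualDisk:
        # A clone starts empty and references the parent for everything else.
        return VirtualDisk(parent=self)

    def stored_bytes(self) -> int:
        # Bytes physically held by this disk, excluding the parent's.
        return sum(len(b) for b in self.blocks.values())


# Production data node holds the input data set.
production = VirtualDisk()
production.write(0, b"alpha")
production.write(1, b"bravo")

# Test data node is a linked clone; it overwrites one block privately.
test = production.linked_clone()
test.write(1, b"BRAVO-test")
```

In this sketch the test disk sees block 0 through its parent and stores only the one block it modified, while the production disk is unaffected by the test deployment's write.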
20 Claims
1. A method for executing a distributed computing application within a virtualized computing environment for a plurality of tenants, the method comprising:
instantiating a first plurality of virtual machines (VMs) on a plurality of hosts to form a first distributed filesystem, wherein at least one VM of the first plurality of VMs includes a virtual disk configured to store data blocks;
storing an input data set in the first distributed filesystem, wherein the input data set comprises a plurality of data blocks, wherein the first distributed filesystem is accessible by a plurality of compute VMs configured to process the input data set; and
instantiating a second plurality of VMs on the plurality of hosts to form a second distributed filesystem storing the same input data set, wherein each instantiated VM of the second plurality of VMs comprises a linked clone that references a virtual disk of a corresponding VM in the first plurality of VMs.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A non-transitory computer-readable storage medium comprising instructions that, when executed in a computing device, execute a distributed computing application within a virtualized computing environment for a plurality of tenants, by performing the steps of:
instantiating a first plurality of virtual machines (VMs) on a plurality of hosts to form a first distributed filesystem, wherein at least one VM of the first plurality of VMs includes a virtual disk configured to store data blocks;
storing an input data set in the first distributed filesystem, wherein the input data set comprises a plurality of data blocks, wherein the first distributed filesystem is accessible by a plurality of compute VMs configured to process the input data set; and
instantiating a second plurality of VMs on the plurality of hosts to form a second distributed filesystem storing the same input data set, wherein each instantiated VM of the second plurality of VMs comprises a linked clone that references a virtual disk of a corresponding VM in the first plurality of VMs.
Dependent claims: 11, 12, 13, 14, 15, 16, 17.
18. A computer system having a plurality of hosts executing a plurality of virtual machines (VMs) for executing a distributed computing application within a virtualized computing environment for a plurality of tenants, the computer system comprising:
a memory; and
a processor programmed to carry out the steps of:
instantiating a first plurality of virtual machines (VMs) on a plurality of hosts to form a first distributed filesystem, wherein at least one VM of the first plurality of VMs includes a virtual disk configured to store data blocks;
storing an input data set in the first distributed filesystem, wherein the input data set comprises a plurality of data blocks, wherein the first distributed filesystem is accessible by a plurality of compute VMs configured to process the input data set; and
instantiating a second plurality of VMs on the plurality of hosts to form a second distributed filesystem storing the same input data set, wherein each instantiated VM of the second plurality of VMs comprises a linked clone that references a virtual disk of a corresponding VM in the first plurality of VMs.
Dependent claims: 19, 20.
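The three steps recited in the independent claims can be walked through with a short sketch. It is illustrative only: the host names, the one-data-node-per-host layout, the round-robin block placement, and every function name are assumptions rather than details taken from the claims. Python's standard `collections.ChainMap` stands in for a linked clone, since it reads through to an underlying mapping and directs writes to its own first layer.

```python
from collections import ChainMap

HOSTS = ["host-a", "host-b"]  # hypothetical host names


def form_filesystem(hosts):
    """Step 1: one data-node VM per host; each virtual disk is a dict of blocks."""
    return {f"datanode@{h}": {} for h in hosts}


def store(filesystem, data_set):
    """Step 2: spread the data set's blocks across data-node VMs (round-robin assumed)."""
    vms = list(filesystem)
    for i, block in enumerate(data_set):
        filesystem[vms[i % len(vms)]][i] = block


def clone_filesystem(filesystem):
    """Step 3: each new VM layers an empty delta over the corresponding parent disk."""
    return {f"{vm}-clone": ChainMap({}, disk) for vm, disk in filesystem.items()}


def read_all(filesystem):
    """Reassemble the data set in block order from every data node's disk."""
    merged = {}
    for disk in filesystem.values():
        merged.update(disk)
    return [merged[i] for i in sorted(merged)]


first_fs = form_filesystem(HOSTS)
store(first_fs, [b"b0", b"b1", b"b2", b"b3"])

second_fs = clone_filesystem(first_fs)
# A write in the second filesystem stays in the clone's delta layer.
second_fs["datanode@host-a-clone"][0] = b"scratch"
```

Here the second filesystem initially exposes the same input data set without copying any block, and the write to block 0 is visible only through the clone.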
Specification