Large-Scale Elastic Computing with Virtual Machines

Marshall, Paul D.

Graduate Thesis Or Dissertation

Citations

Citeable URL: https://scholar.colorado.edu/concern/graduate_thesis_or_dissertations/5h73pw27k

Abstract

Computational resources experience dynamic load because demand is not constant. As a result, resource providers (RPs) must estimate the appropriate amount of resources to purchase in order to best meet variable demand, possibly resulting in under-utilized resources during periods of low demand and over-utilized resources during periods of high demand. With the relatively recent introduction of infrastructure-as-a-service (IaaS) clouds, which lease virtual machines (VMs) on demand, RPs can both deploy private clouds (offering users a new type of resource lease) and outsource appropriate workload processing to external clouds when needed. To match resource deployments with demand, RPs can create elastic environments that span private and public clouds, expanding as demand increases and shrinking as demand decreases. However, elastic environments do not provide the mechanisms and techniques required to extend cluster resources automatically and efficiently for scientific workflows. Instead, they must either be managed manually, which is inefficient and limits scalability, or rely on product-specific solutions that are not open or extensible. Furthermore, IaaS toolkits typically provide high-cost, on-demand leases that are not required by all workflow paradigms. This dissertation presents a flexible cloud architecture and its implementation, consisting of preemptible and preset leases and an elastic environment that is capable of outsourcing cluster demand to IaaS clouds. The architecture allows RPs to adapt efficiently and cost-effectively to variable demand for common scientific workflow patterns. Preemptible and preset leases are new low-cost leases for IaaS clouds that are well suited to volunteer computing or high-throughput computing workloads. These leases are implemented in the open-source Nimbus IaaS toolkit and deploy preemptible VMs on idle resources, allowing RPs to increase the utilization of under-utilized IaaS clouds.
To adjust to variable demand, the elastic environment uses resource provisioning policies that provision and relinquish IaaS instances, outsourcing to external clouds when demand is high. The resource provisioning policies balance conflicting objectives between users and administrators, such as minimizing job queue time and the cost of the deployment. For evaluation, a complete end-to-end elastic environment is developed and used to process a large bioinformatics workload across multiple clouds.
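
As a rough illustration of the kind of provisioning policy described above, the following Python sketch decides how many IaaS instances to launch or release from the current queue length and instance counts. It is not taken from the dissertation: the `ClusterState` fields, the `provisioning_decision` function, the one-job-per-instance assumption, and the `max_instances` cost cap are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of the elastic environment."""
    queued_jobs: int      # jobs waiting in the scheduler queue
    busy_instances: int   # IaaS instances currently running a job
    idle_instances: int   # IaaS instances with no job assigned


def provisioning_decision(state: ClusterState, max_instances: int) -> int:
    """Return the change in instance count.

    Positive means launch that many instances, negative means
    terminate that many idle instances, zero means do nothing.
    Assumes each instance runs exactly one job at a time.
    """
    total = state.busy_instances + state.idle_instances
    # Queued jobs that existing idle instances cannot absorb.
    unmet = state.queued_jobs - state.idle_instances
    if unmet > 0:
        # Reduce queue time by launching instances, but never exceed
        # the cap that stands in for the administrator's cost limit.
        headroom = max(max_instances - total, 0)
        return min(unmet, headroom)
    # Reduce cost by releasing idle instances that no queued job needs.
    return -(state.idle_instances - state.queued_jobs)


if __name__ == "__main__":
    high_demand = ClusterState(queued_jobs=10, busy_instances=2, idle_instances=1)
    low_demand = ClusterState(queued_jobs=0, busy_instances=1, idle_instances=3)
    print(provisioning_decision(high_demand, max_instances=8))
    print(provisioning_decision(low_demand, max_instances=8))
```

In this sketch the user's objective (short queue time) drives the launch branch and the administrator's objective (low cost) drives both the cap and the release branch. A real policy would likely also account for instance boot time, billing granularity, and job runtime estimates.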

Creator