
A fault-tolerant mechanism for desktop cloud systems

Cloud computing is a paradigm that promises to move IT another step towards the age of computing utility. Traditionally, Clouds employ dedicated resources located in data centres to provide services to clients. The resources in such Cloud systems are known to be highly reliable, with a low probability of failure. Desktop Cloud computing is a new type of Cloud computing that aims to provide Cloud services at little or no cost. This ambition can be achieved by combining Cloud computing and Volunteer computing into Desktop Clouds, harnessing non-dedicated resources when they are idle. The resources can be any type of computing machine, for example a standard PC, but such computing resources are renowned for their volatility; failures can happen at any time without warning. In Cloud computing, tasks are submitted by Cloud users or brokers to be processed and executed by virtual machines (VMs), and VMs are hosted by physical machines (PMs). In this context, throughput is defined as the proportion of the total number of submitted tasks that are successfully processed, so the failure of a PM can have a negative impact on this measure of a Desktop Cloud system by destroying all the VMs it hosts, leading to the loss of the submitted tasks currently being processed. The aim of this research is to design a VM allocation mechanism for Desktop Cloud systems that is tolerant to node failure. VM allocation mechanisms are responsible for allocating VMs to PMs and migrating them during runtime with the objective of optimisation, yet the available mechanisms pay little attention to node failure events. The contribution of this research is a Fault-Tolerant (FT) VM allocation mechanism that handles failure events in PMs in Desktop Clouds, employing a replication technique to ensure that the throughput of a Desktop Cloud system remains within acceptable levels.
Since replication increases the power consumption of PMs, the mechanism is enhanced with a migration policy to minimise this effect. The mechanism is evaluated using three metrics: throughput of tasks; power consumption of PMs; and service availability. The evaluation is conducted using DesktopCloudSim, a tool developed for this study as an extension to CloudSim, the well-known Cloud simulation tool, to simulate node failure events in Cloud systems; node failures are analysed using real data sets collected from Failure Trace Archives. The experiments demonstrate that, in the presence of node failures, the FT mechanism yields a statistically significant improvement in the throughput of Cloud systems compared with traditional mechanisms (First Come First Serve, Greedy and Round Robin). When its migration policy is employed, the FT mechanism also yields a statistically significant reduction in power consumption.
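The throughput definition used in the abstract, and the reason replication helps, can be illustrated with a small sketch. This is not the thesis's FT mechanism and does not use DesktopCloudSim or CloudSim; the names `Task`, `place` and `throughput`, and the round-robin placement rule, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: int
    hosts: tuple  # ids of the PMs whose VMs hold a copy of this task


def place(num_tasks, num_pms, replicas):
    """Hypothetical placement: copy k of task i goes to PM (i + k) mod num_pms."""
    if not 1 <= replicas <= num_pms:
        raise ValueError("replicas must be between 1 and num_pms")
    return [
        Task(i, tuple((i + k) % num_pms for k in range(replicas)))
        for i in range(num_tasks)
    ]


def throughput(tasks, failed_pms):
    """Proportion of tasks with at least one copy on a PM that did not fail."""
    if not tasks:
        return 0.0
    failed = set(failed_pms)
    completed = sum(1 for t in tasks if any(h not in failed for h in t.hosts))
    return completed / len(tasks)


# Illustrative scenario (assumed numbers): 8 tasks on 4 PMs, PM 0 fails.
single = place(num_tasks=8, num_pms=4, replicas=1)
double = place(num_tasks=8, num_pms=4, replicas=2)
print(throughput(single, failed_pms={0}))  # tasks 0 and 4 are lost
print(throughput(double, failed_pms={0}))  # every task has a surviving copy
```

In this toy scenario a single PM failure loses two of the eight unreplicated tasks, while two copies per task keep every task alive. The cost is twice as many VM copies to host, which mirrors the power-consumption concern that the abstract says the migration policy is meant to address.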

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:680703
Date January 2015
Creators Alwabel, Abdulelah
Contributors Wills, Gary
Publisher University of Southampton
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
Source https://eprints.soton.ac.uk/387007/
