Hadoop-Based Data-Intensive Computation on IaaS Cloud Platforms

Cloud computing is a relatively new form of computing that uses virtualized resources. It is dynamically scalable and is often provided as a pay-per-use service over the Internet, an intranet, or both. With increasing demand for data storage in the cloud, the study of data-intensive applications is becoming a primary focus. Data-intensive applications are those that involve high CPU usage and process large volumes of data, typically hundreds of gigabytes, terabytes, or petabytes in size. The research in this thesis focuses on Amazon Elastic Compute Cloud (EC2) and Amazon Elastic MapReduce (EMR). The HiBench Hadoop benchmark suite is used to run and evaluate Hadoop-based data-intensive computation on both of these cloud platforms. Both quantitative and qualitative comparisons of Amazon EC2 and Amazon EMR are presented, along with their pricing models and suggestions for future research.

Identifier: oai:union.ndltd.org:unf.edu/oai:digitalcommons.unf.edu:etd-1597
Date: 01 January 2015
Creators: Vijayakumar, Sruthi
Publisher: UNF Digital Commons
Source Sets: University of North Florida
Detected Language: English
Type: text
Format: application/pdf
Source: UNF Theses and Dissertations
