
Performance Comparison of Cassandra in LXC and Bare metal : Container Virtualization case study

Big data is an evolving term that describes any large amount of structured and unstructured data that has the potential to be mined for information. Storing data at this scale requires cloud storage systems, which are designed to keep the data accessible and available to users over a network. Storing big data also requires new platforms; popular ones include MongoDB, Cassandra and Hadoop. In this thesis we use the Cassandra database system because it is a distributed, open-source database. Cassandra’s architecture is a masterless ring design that is easy to set up and maintain. Apache Cassandra is a highly scalable distributed database designed to handle big data with linear scalability and seamless deployment across multiple data centers. It is a NoSQL database system that allows schema-free tables, so a data item can have a variable set of columns, unlike in relational databases. Cassandra provides high scalability with no single point of failure. Over the past few years, container-based virtualization has been evolving rapidly, and this thesis focuses on one such technology, LXC. Linux Containers (LXC) is an operating-system-level virtualization method for running multiple isolated Linux systems on a single control host. It does not resemble a virtual machine, but provides a virtual environment with its own space for CPU, memory, network and so on, along with a resource control mechanism. In this thesis, the performance of the Apache Cassandra database is compared between bare metal and Linux Containers (LXC). A three-node Cassandra cluster is created on both bare metal and Linux Containers, with one node designated as the seed, and the Cassandra stress utility is used to apply load to the cluster.
The goal of this thesis work is to evaluate the performance of a Cassandra database cluster on bare metal and in Linux Containers. Linux Containers (LXC) are deployed on all the servers. A three-node Cassandra database cluster is created on these servers directly and also in Linux Containers (LXC). Port forwarding is the technique used to enable communication between the Cassandra nodes in LXC. The metrics that characterize the performance of the Cassandra cluster are selected accordingly. The network configuration parameters are changed to suit the behavior of Cassandra; with these changes, Cassandra runs with the required configuration, and the performance of the Cassandra cluster is then analyzed. This is done with different write, read and mixed load operations, and the results are compared with the performance of the Cassandra cluster on bare metal. The results of the thesis present measurements of performance metrics such as CPU utilization, disk throughput and latency for the Cassandra cluster on both bare metal and Linux Containers, and compare cluster performance through a quantitative and statistical analysis. The physical resources utilized by the Cassandra database on native bare metal and in Linux Containers (LXC) are similar. According to the results, CPU utilization is higher for the Cassandra database in Linux Containers. Disk throughput is also higher in Linux Containers, except in the case of the write operation at 66% load. Bare metal has lower latency than Linux Containers in all scenarios.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:bth-13451
Date: January 2016
Creators: Thiruvallur Vangeepuram, Reventh
Publisher: Blekinge Tekniska Högskola, Institutionen för kommunikationssystem
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
