Return to search

Object serialization vs relational data modelling in Apache Cassandra: a performance evaluation

Context. In newer database solutions designed for large-scale, cloud-based services, database performance is of particular concern as these services face scalability challenges due to I/O bottlenecks. These issues can be alleviated through various data model optimizations that reduce I/O loads. Object serialization is one such approach. Objectives. This study investigates the performance of serialization using the Apache Avro library in the Cassandra database. Two different serialized data models are compared with a traditional relational database model. Methods. This study uses an experimental approach that compares read and write latency using Twitter data in JSON format. Results. Avro serialization is found to improve performance. However, the extent of the performance benefit is found to be highly dependent on the serialization granularity defined by the data model. Conclusions. The study concludes that developers seeking to improve database throughput in Cassandra through serialization should prioritize data model optimization as serialization by itself will not outperform relational modelling in all use cases. The study also recommends that further work is done to investigate additional use cases, as there are potential performance issues with serialization that are not covered in this study.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:bth-10391
Date January 2015
CreatorsJohansen, Valdemar
PublisherBlekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0112 seconds