Developing Random Compaction Strategy for Apache Cassandra database and Evaluating performance of the strategy
Surampudi, Roop Sai, January 2021
Introduction: The volume of data generated by global communication systems is growing rapidly, and the telecommunications industry needs to monitor and manage this data efficiently. Apache Cassandra is a NoSQL database that can efficiently manage data of any format and massive data flows.

Aim: This project develops a new random compaction strategy (RCS) and evaluates its performance. The study investigates the limitations of the generic compaction strategies, Size Tiered Compaction Strategy (STCS) and Leveled Compaction Strategy (LCS), develops RCS to address those limitations, and identifies the performance metrics needed to evaluate the strategy.

Method: A grey literature review is conducted to understand how Apache Cassandra works and the APIs of its different compaction strategies. The random compaction strategy is developed in two phases. A testing environment consisting of a 4-node cluster and a simulator is created, and performance is evaluated by stress-testing the cluster with different workloads.

Conclusions: A stable RCS artifact is developed. The artifact supports generating the random threshold from a user-defined distribution; currently, only the Uniform, Geometric, and Poisson distributions are supported. RCS-Uniform is found to perform better than both STCS and LCS. RCS-Poisson is found not to perform better than both STCS and LCS. RCS-Geometric is found to perform better than STCS.
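The abstract does not say how RCS draws its random threshold, so the following is only a minimal sketch of the idea, assuming the threshold is an integer sampled from one of the three named distributions and clamped to fixed bounds. The function name `sample_threshold`, its parameters, and the defaults (bounds of 4 and 32, `p`, `lam`) are hypothetical and are not taken from the thesis.

```python
import math
import random


def sample_threshold(distribution, rng, low=4, high=32, p=0.2, lam=8.0):
    """Draw one integer threshold and clamp it to [low, high].

    Every name and default value here is an illustrative assumption,
    not the thesis author's implementation.
    """
    if distribution == "uniform":
        # randint is inclusive on both ends.
        value = rng.randint(low, high)
    elif distribution == "geometric":
        # Inverse-transform sampling: trials until the first success (>= 1).
        u = rng.random()  # u lies in [0, 1)
        value = int(math.floor(math.log(1.0 - u) / math.log(1.0 - p))) + 1
    elif distribution == "poisson":
        # Knuth's multiplication method; adequate for small lam.
        limit = math.exp(-lam)
        value, product = 0, rng.random()
        while product > limit:
            value += 1
            product *= rng.random()
    else:
        raise ValueError(f"unsupported distribution: {distribution}")
    return max(low, min(high, value))


if __name__ == "__main__":
    rng = random.Random(2021)  # seeded only to make the demo repeatable
    for name in ("uniform", "geometric", "poisson"):
        print(name, [sample_threshold(name, rng) for _ in range(5)])
```

Clamping keeps every draw inside a fixed range, at the cost of piling probability mass on the bounds for the Geometric and Poisson cases; a real strategy might resample instead.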