Return to search

A dynamic approach to sorting with respect to big data

This study introduces a dynamic approach to sorting, making use of predictions and data gathered during run-time to optimize the sorting of the current data set. This approach is used to develop a sorting algorithm called DynamicSort which partitions data and calculates a partial standard deviation for each partition to determine which of two sorting algorithms should be used to sort the partition. The algorithm is tested against Quicksort and radix sort on data sets of different sizes and standard deviation with the intent of finding advantages of the approach. In order to adapt to modern applications, the algorithm is tested in an environment utilizing parallel processing on multiple machines on data sets generated to mimic the characteristic size of big data. To accommodate this the data is divided at start and merged together after sorting using a k-way merge sort. While the tests conducted do not show any concrete gain in performance there are several factors that could be further optimized and evaluated. We find that it is not enough to simply consider the standard deviation in this approach. While no real instance of big data was used the algorithm was adapted for limited cache sizes and multiple hosts working in parallel.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-212444
Date January 2023
CreatorsAlmström, Filip
PublisherUmeå universitet, Institutionen för datavetenskap
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationUMNAD ; 1432

Page generated in 0.0016 seconds