Thesis (MTech (Business Information Systems))--Cape Peninsula University of Technology, 2016. / The emergence of big data (BD) has rendered conventional business intelligence (BI) tools inefficient and ineffective for real-time decision support systems (DSSs). This becomes apparent when business users must make decisions based on stale and sometimes incomplete data sets, which can lead to slow and poor decision making. In recent years, industry and academia have developed new technologies to process BD, such as Hadoop, Spark, in-memory databases and NoSQL databases. These technologies have proliferated to the extent that organisations face the challenge of determining which are most suitable for real-time DSS requirements. Because BD is still a new concept, there are no standard guidelines or frameworks to assist in evaluating and comparing BD technologies. This research aims to explore the factors that influence the selection of technologies appropriate for real-time DSSs in a BD environment. It also proposes evaluation criteria that can be used to compare and select these technologies. To achieve this aim, a literature analysis is conducted to understand the concept of BD, real-time DSSs and related technologies. Both qualitative and quantitative research techniques are used, with interviews conducted with BI experts who have BD knowledge and experience. Experimental research in a computer laboratory is also conducted. The purpose of the interviews is to ascertain which technologies are being used for BD analytics and which evaluation criteria organisations use when choosing such a technology. Furthermore, a comparative computer laboratory experiment is conducted to compare three tools that run on Hadoop, namely Hive, Impala and Spark.
The purpose of the experiment is to test whether system performance differs among the three tools when analysing the same data set on the same computer resources. The empirical results reveal nine main factors that impact the selection of technologies appropriate for real-time DSSs in a BD environment, and ten application-independent evaluation criteria. Furthermore, the experimental results indicate that system performance, in terms of latency, differs significantly among the three tools compared.
Identifier | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:cput/oai:localhost:20.500.11838/2350 |
Date | January 2016 |
Creators | Muchemwa, Regis Fadzi |
Contributors | de la Harpe, Andre |
Publisher | Cape Peninsula University of Technology |
Source Sets | South African National ETD Portal |
Language | English |
Detected Language | English |
Type | Thesis |
Rights | http://creativecommons.org/licenses/by-nc-sa/3.0/za/ |