Companies analyse large amounts of sensitive data on clusters of machines, using a framework such as Apache Hadoop to handle inter-process communication, and big data analytic tools such as Apache Spark and Apache Flink to analyse the growing amounts of data. Big data analytic tools are mainly tested on performance and reliability. Security and authentication have not been enough considered and they lack behind. The goal of this research is to improve the authentication and security for data analytic tools.Currently, the aforementioned big data analytic tools are using Kerberos for authentication. Kerberos has difficulties in providing multi factor authentication. Attacks on Kerberos can abuse the authentication. To improve the authentication, an analysis of the authentication in Hadoop and the data analytic tools is performed. The research describes the characteristics to gain an overview of the security of Hadoop and the data analytic tools. One characteristic is that the usage of the transport layer security (TLS) for the security of data transportation. TLS usually establishes connections with certificates. Recently, certificates with a short time to live can be automatically handed out.This thesis develops new authentication mechanism using certificates for data analytic tools on clusters of machines, providing advantages over Kerberos. To evaluate the possibility to replace Kerberos, the mechanism is implemented in Spark. As a result, the new implementation provides several improvements. The certificates used for authentication are made valid with a short time to live and are thus less vulnerable to abuse. Further, the authentication mechanism solves new requirements coming from businesses, such as providing multi-factor authenticationand scalability.In this research a new authentication mechanism is developed, implemented and evaluated, giving better data protection by providing improved authentication.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-215694 |
Date | January 2017 |
Creators | Velthuis, Paul |
Publisher | KTH, Skolan för informations- och kommunikationsteknik (ICT) |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | TRITA-ICT-EX ; 2017:163 |
Page generated in 0.0023 seconds