The aim of this thesis is to show how to design and implement an application for network traces analysis using Apache Spark distributed system. Implementation can be divided into three parts - loading data from a distributed HDFS storage, supported network protocols analysis and distributed data processing. As a data visualization tool is used web-based notebook Apache Zeppelin. The resulting application is able to analyze individual packets as well as the entire flows. It supports JSON and pcap as input data formats. The goal of the application is to allow Big Data processing. The greatest impact on its performance has the input data format and allocation of the available cores.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:385893 |
Date | January 2018 |
Creators | Béder, Michal |
Contributors | Veselý, Vladimír, Ryšavý, Ondřej |
Publisher | Vysoké učení technické v Brně. Fakulta informačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.1208 seconds