Because users' data is increasingly encrypted, Internet service providers today find it difficult to adapt their services to users' needs. Previously popular methods of classifying users' data do not work as well today, and new alternatives are therefore needed to give users an optimal experience. This study focuses specifically on classifying data flows into video and non-video flows using machine learning algorithms, with a focus on runtime performance. The tested algorithms, random forest and gradient boosting trees, are created in Python and then exported to a C implementation. The goal is to find the algorithm with the fastest classification time relative to its accuracy, making classification as fast as possible and the classification model as small as possible. The results show that random forest was significantly faster at classification than gradient boosting trees, with initial tests showing it to be roughly 7 times faster after compiler optimization. After the C code was optimized, random forest could classify more than 250,000 data flows per second with decent accuracy. Neither of the two algorithms required much space (<3 megabytes). / HITS, 4707
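The idea of building a tree ensemble in Python and exporting it to C can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy trees, the two features (assumed here to be something like a mean packet size and a ratio), the thresholds, and the function names `predict_forest` and `forest_to_c` are all hypothetical. The sketch shows a forest classifying by majority vote and each tree being emitted as nested `if`/`else` statements in C.

```python
# Hypothetical toy forest. Internal nodes compare one feature with a threshold;
# leaves carry a class label (assumed here: 1 = video, 0 = non-video).
TREES = [
    {"feature": 0, "threshold": 1000.0,
     "left": {"label": 0},
     "right": {"feature": 1, "threshold": 0.5,
               "left": {"label": 1}, "right": {"label": 0}}},
    {"feature": 1, "threshold": 0.8,
     "left": {"label": 1}, "right": {"label": 0}},
    {"feature": 0, "threshold": 500.0,
     "left": {"label": 0}, "right": {"label": 1}},
]


def predict_tree(node, x):
    """Walk one tree from the root to a leaf and return the leaf label."""
    while "label" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


def predict_forest(trees, x):
    """Majority vote over all trees (labels are 0 or 1)."""
    votes = sum(predict_tree(tree, x) for tree in trees)
    return 1 if 2 * votes > len(trees) else 0


def tree_to_c(node, indent=1):
    """Emit one tree as nested C if/else statements."""
    pad = "    " * indent
    if "label" in node:
        return f"{pad}return {node['label']};\n"
    return (
        f"{pad}if (x[{node['feature']}] <= {node['threshold']!r}) {{\n"
        + tree_to_c(node["left"], indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_c(node["right"], indent + 1)
        + f"{pad}}}\n"
    )


def forest_to_c(trees):
    """Emit one C function per tree plus a majority-vote classify() function."""
    parts = [
        f"static int tree_{i}(const double *x) {{\n{tree_to_c(tree)}}}\n"
        for i, tree in enumerate(trees)
    ]
    calls = " + ".join(f"tree_{i}(x)" for i in range(len(trees)))
    parts.append(
        "int classify(const double *x) {\n"
        f"    return 2 * ({calls}) > {len(trees)};\n"
        "}\n"
    )
    return "\n".join(parts)


if __name__ == "__main__":
    print(predict_forest(TREES, [2000.0, 0.3]))
    print(forest_to_c(TREES))
```

Hard-coding the trees as branches means the generated C needs no model file or parser at run time, which is one plausible way to keep both classification time and model size low. The generated source can be compiled with any C compiler, with or without optimization flags.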
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kau-56621 |
Date | January 2017 |
Creators | Västlund, Filip |
Publisher | Karlstads universitet, Fakulteten för hälsa, natur- och teknikvetenskap (from 2013) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf, application/pdf |
Rights | info:eu-repo/semantics/openAccess, info:eu-repo/semantics/openAccess |