
Skräppost eller skinka? : En jämförande studie av övervakade maskininlärningsalgoritmer för spam och ham e-mailklassifikation / Spam or ham? : A comparative study of monitored machine learning algorithms for spam and ham e-mail classification.

Spam e-mail is a growing problem for today's businesses, and counteracting it costs time and resources. Research has produced techniques and tools aimed at addressing the growing number of incoming spam e-mails. The research on different algorithms and their ability to classify e-mail messages needs an update, since both the tools and the spam e-mails have become more advanced. In this study, three machine learning algorithms are evaluated on their ability to correctly classify e-mails as legitimate or spam: naive Bayes, support vector machine, and decision tree. The algorithms are tested in an experiment on the Enron spam dataset and then compared against each other on performance. In the experiment, support vector machine correctly classified the largest percentage of data points. Even so, the other algorithms can be useful from a business perspective, depending on the task and context.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-389384
Date: January 2019
Creators: Bergens, Simon; Frykengård, Pontus
Publisher: Uppsala universitet, Institutionen för informatik och media, Högskolan på Gotland, Avdelningen för Programvaruteknik
Source Sets: DiVA Archive at Upsalla University
Language: Swedish
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
