Global ETD Search

1	Static Detection of Malware in Portable Executables / Statisk spårning av skadlig kod i Portable Executables filer Paananen, Josefin January 2021 The first computer virus was detected in the 1970s. Since then, malware infections have grown exponentially along with the rapid expansion of the digital environment. Malware detection is a challenging task because malware keeps growing in complexity and volume, which creates the need for automated detection. Applying machine learning to malware detection is not a new trend; researchers have been experimenting with it since the 1990s. This thesis aims to evaluate classification algorithms for discovering malicious Portable Executables by looking at their static features. Six machine learning models were built and tested on 20,000 malicious and benign files. Among the models, Random Forest achieved the highest cross-validation score, 99.3%, using 15 features. The number of features was chosen based on previous studies. This thesis confirms that it is possible to use machine learning for static malware detection. It can also inform future research on automated malware analysis. malware detection; machine learning; portable executables; static malware analysis; Social Sciences Interdisciplinary
2	Feature selection and clustering for malicious and benign software characterization Chhabra, Dalbir Kaur R 13 August 2014 Malware, or malicious code, is designed to gather sensitive information without the knowledge or permission of users, or to damage files on a computer system. As the use of computer systems and the Internet increases, the threat of malware grows as well. Moreover, the growing volume of data makes it harder to determine whether executables are malicious or benign. Hence, we have devised a method that collects features from the portable executable file format using static malware analysis. We have also optimized the important or useful features by either normalizing or weighting them. Furthermore, we have compared the accuracy of various unsupervised learning algorithms for clustering a large dataset of samples. Once the clusters are created, an antivirus (AV) scanner can be used to check one or two files in each cluster: if they are detected by the AV, all files in the cluster are considered malicious, even if they contain novel or unknown malware; otherwise, all are considered benign. Static malware analysis; Portable Executable; unsupervised learning algorithm; malicious or benign samples; feature selection; clustering; Information Security
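The cluster-labelling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function name `label_clusters`, the `av_detects` callback, the file names, and the rule that every probed file must be detected are all assumptions made for the example.

```python
from collections import defaultdict


def label_clusters(cluster_of, av_detects, probes_per_cluster=2):
    """Propagate an AV verdict on a few probe files to every file in a cluster.

    cluster_of: dict mapping file name -> cluster id (from any clustering step).
    av_detects: hypothetical callback returning True if the AV flags the file.
    """
    members = defaultdict(list)
    for name, cluster_id in cluster_of.items():
        members[cluster_id].append(name)

    labels = {}
    for cluster_id, names in members.items():
        # Assumption: probe the first one or two files in sorted order.
        probes = sorted(names)[:probes_per_cluster]
        # Assumption: the cluster is malicious only if every probe is detected.
        verdict = "malicious" if all(av_detects(p) for p in probes) else "benign"
        for name in names:
            labels[name] = verdict
    return labels


# Toy data: e.exe is unknown to the AV but shares a cluster with known malware.
cluster_of = {"a.exe": 0, "b.exe": 0, "e.exe": 0, "c.exe": 1, "d.exe": 1}
known_bad = {"a.exe", "b.exe"}
labels = label_clusters(cluster_of, lambda name: name in known_bad)
print(labels)
```

In this toy run, `e.exe` inherits the malicious label from its cluster even though the stand-in AV does not recognize it, which is the property the abstract highlights for novel malware.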
3	An Evaluation of Machine Learning Approaches for Hierarchical Malware Classification Roth, Robin, Lundblad, Martin January 2019 With the ever-growing threat of new malware, which keeps increasing in both number and complexity, the need for better automatic detection and classification of malware is increasing. The signature-based approaches used by several anti-virus companies struggle with the increasing volume of polymorphic malware. Polymorphic malware changes minor aspects of its code in order to remain undetected. Malware classification using machine learning has been used in previous research to try to solve this issue. In the proposed work, different hierarchical machine learning approaches are implemented to conduct three experiments. The methods utilise a hierarchical structure in various ways to achieve better classification performance. A selection of hierarchical levels and machine learning models is used in the experiments to evaluate how the results are affected. A data set is created, containing over 90,000 different labelled malware samples. The proposed work also includes the creation of a labelling method that can be helpful for researchers in malware classification who need labels for a newly created data set. The feature vector used contains 500 n-gram features and 3521 Import Address Table features. The experiments test four machine learning models and three different numbers of hierarchical levels. Stratified 5-fold cross validation is used to reduce bias and variance in the results. The classification approach achieves its highest hF-score, 0.858228, using Random Forest (RF) as the machine learning model with four hierarchical levels. To allow comparison with other related work, pure-flat classification accuracy was also computed. The highest accuracy obtained was 0.8512816, which was not the highest among related work. Machine Learning; Hierarchical Malware Classification; Static Malware Analysis; Mnemonic N-grams; Other Computer and Information Science
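The abstract does not define the hF-score. A common formulation computes hierarchical precision and recall over each label together with its ancestors, and the sketch below assumes that definition; it may differ from the one used in the thesis. The function names, the `parent` mapping, and the malware family labels are hypothetical.

```python
def ancestors(label, parent):
    """Return the label together with all of its ancestors (root excluded)."""
    path = set()
    while label is not None:
        path.add(label)
        label = parent.get(label)  # top-level labels have no parent entry
    return path


def hierarchical_f_score(true_labels, predicted_labels, parent):
    """Hierarchical F-score over ancestor-augmented label sets (assumed definition)."""
    overlap = predicted_total = true_total = 0
    for t, p in zip(true_labels, predicted_labels):
        t_set, p_set = ancestors(t, parent), ancestors(p, parent)
        overlap += len(t_set & p_set)
        predicted_total += len(p_set)
        true_total += len(t_set)
    if overlap == 0:
        return 0.0
    h_precision = overlap / predicted_total
    h_recall = overlap / true_total
    return 2 * h_precision * h_recall / (h_precision + h_recall)


# Hypothetical two-level hierarchy: family -> variant.
parent = {
    "trojan.banker": "trojan",
    "trojan.dropper": "trojan",
    "worm.email": "worm",
}
true_labels = ["trojan.banker", "worm.email"]
predicted_labels = ["trojan.dropper", "worm.email"]
score = hierarchical_f_score(true_labels, predicted_labels, parent)
print(score)
```

Under this definition, predicting the wrong variant of the right family still earns partial credit, which a flat accuracy score would not give.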
