Global ETD Search

1	CloudIntell: An intelligent malware detection system Mirza, Qublai K.A., Awan, Irfan U., Younas, M. 25 July 2017 (has links) Yes / Enterprises and individual users rely heavily on antivirus software and other security mechanisms. However, the methodologies used by such software are not sufficient to detect and prevent most malicious activity, and they consume a large share of the host machine's resources during regular operation. In this paper, we propose a combination of machine learning techniques applied to a rich set of features extracted from a large dataset of benign and malicious files through a bespoke feature extraction tool. We extracted a rich set of features from each file and applied support vector machine, decision tree, and boosting on decision tree to get the highest possible detection rate. We also introduce a scalable cloud-based architecture hosted on Amazon Web Services to cater to the needs of the detection methodology. We tested our methodology against different scenarios and achieved strong results with the lowest energy consumption of the host machine.
2	Predicting profitability of new customers using gradient boosting tree models : Evaluating the predictive capabilities of the XGBoost, LightGBM and CatBoost algorithms Kinnander, Mathias January 2020 (has links) In the context of providing credit online to customers in retail shops, the provider must perform risk assessments quickly and often based on scarce historical data. This can be achieved by automating the process with Machine Learning algorithms. Gradient Boosting Tree algorithms have proven capable in a wide range of application scenarios. However, they have yet to be applied to predicting the profitability of new customers based solely on the customers’ first purchases. This study aims to evaluate the predictive performance of the XGBoost, LightGBM, and CatBoost algorithms in this context. The Recall and Precision metrics were used as the basis for assessing the models’ performance. The experiment implemented for this study shows that the models display similar capabilities while also being biased towards the majority class. Gradient tree boosting XGBoost LightGBM CatBoost prediction profitability online retail Information Systems, Social aspects
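As a rough illustration of why Recall and Precision are more informative than accuracy when a model leans towards the majority class, the sketch below computes all three on a small, made-up label set. The labels, the choice of the minority class as "positive", and the function name `precision_recall` are assumptions for illustration only and are not taken from the thesis.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = minority class, 0 = majority class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A model biased towards the majority class predicts 0 almost everywhere.
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this toy case accuracy looks high even though half of the minority-class examples are missed, which is the kind of bias that Recall exposes.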
3	A cloud-based intelligent and energy efficient malware detection framework : a framework for cloud-based, energy efficient, and reliable malware detection in real-time based on training SVM, decision tree, and boosting using specified heuristics anomalies of portable executable files Mirza, Qublai K. A. January 2017 (has links) The continuing financial and other related losses due to cyber-attacks demonstrate the substantial growth of malware and its lethal proliferation techniques. Every successful malware attack highlights the weaknesses in the defence mechanisms responsible for securing the targeted computer or network. Recent cyber-attacks reveal sophistication and intelligence in malware behaviour, including the ability to conceal code and operate within the system autonomously. Conventional detection mechanisms not only have limited malware detection capabilities, they also consume a large amount of resources while scanning for malicious entities in the system. Many recent reports have highlighted this issue, along with the challenges faced by alternative solutions and studies conducted in the same area. There is an unprecedented need for a resilient and autonomous solution that takes a proactive approach against modern malware with stealth behaviour. This thesis proposes a multi-aspect solution comprising an intelligent malware detection framework and an energy efficient hosting model. The malware detection framework is a combination of conventional and novel malware detection techniques. The proposed framework incorporates comprehensive feature heuristics of files generated by a bespoke static feature extraction tool. These comprehensive heuristics are used to train the machine learning algorithms Support Vector Machine, Decision Tree, and Boosting to differentiate between clean and malicious files. The two techniques, feature heuristics and machine learning, are combined to form a two-factor detection mechanism.
This thesis also presents a cloud-based, energy efficient, and scalable hosting model, which combines multiple infrastructure components of Amazon Web Services to host the malware detection framework. The hosting model uses a client-server architecture, where the client is a lightweight service running on the host machine and the server is based in the cloud. The proposed framework and the hosting model were evaluated individually and in combination through specifically designed experiments using separate repositories of clean and malicious files. The experiments were designed to evaluate malware detection capability and energy efficiency while operating within a system. The proposed malware detection framework and hosting model showed significant improvement in malware detection while consuming little CPU resource during operation.
4	A Comparative Study of Machine Learning Algorithms Le Fort, Eric January 2018 (has links) The selection of the machine learning algorithm used to solve a problem is an important choice. This paper outlines research measuring three performance metrics for eight different algorithms on a prediction task involving undergraduate admissions data. The algorithms that were tested are k-nearest neighbours, decision trees, random forests, gradient tree boosting, logistic regression, naive Bayes, support vector machines, and artificial neural networks. These algorithms were compared in terms of accuracy, training time, and execution time. / Thesis / Master of Applied Science (MASc) Machine Learning Comparative Study Data Science University Admissions Software Engineering Computer Science K-Nearest Neighbours Decision Tree Random Forest Gradient Tree Boosting Logistic Regression Naive Bayes Support Vector Machine Neural Network
5	How Certain Are You of Getting a Parking Space? : A deep learning approach to parking availability prediction / Maskininlärning för prognos av tillgängliga parkeringsplatser Nilsson, Mathias, von Corswant, Sophie January 2020 (has links) Traffic congestion is a severe problem in urban areas and it leads to the emission of greenhouse gases and air pollution. In general, drivers lack knowledge of the location and availability of free parking spaces in cities. This leads to people driving around searching for parking places, and about one-third of traffic congestion in cities is due to drivers searching for an available parking space. In recent years, various solutions for providing parking information in advance have been proposed. The vast majority of these solutions have been applied in large cities, such as Beijing and San Francisco. This thesis has been conducted in collaboration with Knowit and Dukaten to predict parking occupancy in car parks one hour ahead in the relatively small city of Linköping. To make the predictions, this study has investigated the possibility of using long short-term memory and gradient boosting regression trees, trained on historical parking data. To enhance decision making, the predictive uncertainty was estimated using the novel approach Monte Carlo dropout for the former, and quantile regression for the latter. This study reveals that both models can predict parking occupancy ahead of time, and they are found to excel in different contexts. The inclusion of exogenous features can improve prediction quality. More specifically, we found that incorporating hour of the day improved the models’ performances, while weather features did not contribute much. As for uncertainty, Monte Carlo dropout was shown to be sensitive to parameter tuning in order to obtain good uncertainty estimates.
monte carlo dropout mc dropout long short term memory lstm neural network recurrent neural network gradient tree boosting rnn gradient boosting regression tree gbrt quantile regression traffic congestion parking parking occupancy parking availability parking space parking lot machine learning Other Computer and Information Science
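Quantile regression, which the abstract above pairs with gradient boosting regression trees, is usually trained by minimising the pinball (quantile) loss. The sketch below shows that loss in plain Python and how minimising it at different quantile levels yields a lower bound, a median, and an upper bound. The occupancy numbers and the helper names `pinball_loss` and `best_constant` are hypothetical, and the constant predictor stands in for a real regression model; the thesis's actual setup is not described in the abstract.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


# Hypothetical observed occupancy counts for one car park.
occupancy = [40, 55, 60, 72, 90]


def best_constant(q):
    """Integer constant prediction in [0, 100] that minimises the pinball loss."""
    n = len(occupancy)
    return min(range(0, 101), key=lambda c: pinball_loss(occupancy, [c] * n, q))


lower, median, upper = best_constant(0.1), best_constant(0.5), best_constant(0.9)
print(lower, median, upper)
```

Fitting one model per quantile level in this way gives a prediction interval around the point forecast, which is one way to express predictive uncertainty.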
6	Money Laundering Detection using Tree Boosting and Graph Learning Algorithms / Detektion av Penningtvätt med hjälp av Trädalgoritmer och Grafinlärningsalgoritmer Frumerie, Rickard January 2021 (has links) In this master's thesis we focus on using machine learning methods to detect money laundering in financial transaction networks, in order to demonstrate that they can be used as a complement to, or instead of, the more commonly used rule-based systems. The graph learning method graph convolutional networks (GCN) has been a hot topic in the field since it was shown to scale well with data size back in 2018. However, typical GCN models cannot use edge features, which is why this thesis combines the GCN model with a node and edge neural network (NENN) to solve this problem. This new method will be compared against an already established machine learning method for financial transactions, namely the tree boosting method XGBoost. Because of confidentiality concerns around financial transaction data, the machine learning algorithms will be tested on two carefully constructed, synthetically generated data sets which, being produced by agent-based simulations, resemble real financial data. The results showed the viability and superiority of the new implementation of the GCN model, making it the preferable method for connectively structured data, meaning that a transaction or account is analyzed in the context of its financial environment. On the other hand, the XGBoost method showed better results when examining transactions independently, as it was able to find fraudulent and non-fraudulent patterns more accurately from the transactional features themselves. / In this degree project we focus on the use of machine learning methods to detect money laundering in financial transaction networks, with the aim of demonstrating that they can be used as a complement to, or instead of, the more commonly used rule-based systems.
The graph learning method graph convolutional networks (GCN) has been a hot topic in the field since the method was shown in 2018 to work well for large data sets. However, an ordinary GCN model cannot use edge information, which is why this thesis combines the GCN model with node and edge neural networks (NENN) to detect money laundering more effectively. This new method will be compared with an already established machine learning method for financial transactions, namely tree boosting (XGBoost). For confidentiality reasons concerning financial transaction data, the machine learning algorithms were tested on two carefully constructed, synthetically generated data sets which, being produced by agent-based simulations, resemble real financial data. The results showed the applicability and superiority of the new implementation of the GCN model, which is preferable for relationally structured data, that is, when transactions and accounts are analyzed in the context of their financial environment. On the other hand, XGBoost shows better results when examining transactions individually, since this method can more precisely identify fraudulent and non-fraudulent patterns from the transactional features. Tree boosting XGBoost graph convolutional networks (GCN) node and edge neural networks (NENN) exploratory data analysis (EDA) anti money laundering (AML) financial graph networks. Probability Theory and Statistics
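To make the limitation mentioned in the abstract above concrete, here is a minimal sketch of one standard GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python. It takes only an adjacency matrix and node features, so there is no place for edge features to enter. The three-account graph, the feature values, and the function names are hypothetical; the NENN extension described in the thesis is not shown.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One propagation step: aggregate neighbours, apply weights, then ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Hypothetical graph of three accounts in a chain: 0 - 1 - 2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0], [0.0], [0.0]]  # one made-up feature per account
weights = [[1.0]]                 # identity weight, for readability

out = gcn_layer(adj, features, weights)
print(out)
```

Account 1 receives part of account 0's feature because the two are connected, while account 2 receives none after a single layer; stacking layers widens that neighbourhood.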
