E-commerce has grown rapidly over the last two decades, making it easy for customers to shop wherever they are. This growth has also led to new kinds of fraud that affect customers. To make customers feel safe while shopping online, companies like Resurs Bank implement fraud filters that freeze transactions thought to be fraudulent. The latest type of fraud filter is based on machine learning. While this is a promising technology, data and algorithms need to be tuned properly to the task at hand. This thesis project gives a proof of concept of a machine-learning-based fraud filter for Resurs Bank. Based on a literature study, the available data, and explainability requirements, this work opts for a supervised learning approach based on Random Forests, with a sliding window to counter concept drift. The inherent class imbalance of the setting makes the area under the receiver operating characteristic curve (AUC-ROC) a suitable metric. The approach gave promising results, indicating that a machine-learning-based fraud filter can add value for companies like Resurs Bank. An alternative approach that incorporates non-numerical features using recurrent neural networks (RNNs) was also implemented and compared. The non-numerical feature was transformed by a pre-trained RNN model into a numerical representation that reflects the feature's suspiciousness. This new numerical feature was then included in the Random Forest model, and the results demonstrated that this approach can add valuable insight to the fraud detection field.
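The two evaluation ideas named in the abstract, a sliding window over time-ordered transactions and AUC-ROC as a metric under class imbalance, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `roc_auc` and `sliding_windows`, the window sizes, and the toy data are all hypothetical, and the Random Forest model itself is left out so that the example needs only the Python standard library.

```python
from typing import Iterator, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a randomly chosen fraud case (label 1)
    scores higher than a randomly chosen legitimate one (label 0).
    Ties count as half a win. Hypothetical helper, O(n_pos * n_neg)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sliding_windows(n_samples: int, train_size: int,
                    test_size: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges over time-ordered data.
    The window advances by test_size, so the oldest data drops out of
    training, which is one simple way to react to concept drift."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


if __name__ == "__main__":
    # Toy imbalanced data: 1 fraud among 100 transactions, constant score.
    labels = [1] + [0] * 99
    constant_scores = [0.0] * 100
    # Flagging nothing would be 99% accurate, yet AUC exposes the lack of
    # ranking ability.
    print(roc_auc(labels, constant_scores))

    for train, test in sliding_windows(10, train_size=4, test_size=2):
        print(list(train), list(test))
```

In a full pipeline under these assumptions, a classifier would be refit on each `train` range and scored with `roc_auc` on the following `test` range, so that every evaluation uses only transactions that precede the ones being predicted.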
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-424464 |
Date | January 2020 |
Creators | Andrée, Anton |
Publisher | Uppsala universitet, Avdelningen för systemteknik |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | UPTEC STS, 1650-8319 ; 20035 |