31 |
Detecting fraudulent users using behaviour analysis / Detektera artificiella användare med hjälp av beteendeanalys
Jóhannsson, Jökull. January 2017
With the increased global use of online media platforms, there are more opportunities than ever to misuse those platforms or perpetrate fraud. One such fraud occurs within the music industry, where perpetrators create automated programs that stream songs to generate revenue or increase the popularity of an artist. With the growing annual revenue of the digital music industry, there are significant financial incentives for perpetrators with fraud in mind. The focus of the study is extracting user behavioral patterns and utilising them to train and compare multiple supervised classification methods to detect fraud. The machine learning algorithms examined are Logistic Regression, Support Vector Machines, Random Forest and Artificial Neural Networks. The study compares the performance of these algorithms trained on imbalanced datasets carrying different fractions of fraud. The trained models are evaluated using the Precision Recall Area Under the Curve (PR AUC) and an F1-score. Results show that the algorithms achieve similar performance when trained on balanced and imbalanced datasets. They also show that Random Forest outperforms the other methods for all datasets tested in this experiment. / As the use of streaming media grows, so do the opportunities for misuse of these platforms and for fraud. A typical case of fraud is the use of automated programs to stream media, thereby generating revenue and increasing an artist's popularity. As the economy around streaming media grows, so does the incentive for attempted fraud. The focus of this study is to find user patterns and use this knowledge to train models that can detect attempted fraud. The machine learning algorithms examined are Logistic Regression, Support Vector Machines, Random Forest and Artificial Neural Networks.
This study compares the effectiveness and precision of these algorithms, trained on imbalanced data containing different percentages of attempted fraud. The models generated by the different algorithms were then evaluated using Precision Recall Area Under the Curve (PR AUC) and F1-score. The results of the study show similar performance among the models generated by the evaluated algorithms. This holds both when they are trained on balanced and on imbalanced data. The results also show that Random Forest-based models produce better results for all datasets tested in this experiment.
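The two evaluation measures named in this abstract can be illustrated with a minimal sketch. It is not the thesis's code: the function names and toy data are hypothetical, PR AUC is estimated with the step-wise "average precision" sum, and scores are assumed to contain no ties.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for fraud, 0 for regular. scores: higher means more suspicious.
    Assumes at least one positive label and no tied scores (illustration only).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    area = 0.0
    true_positives = 0
    previous_recall = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        recall = true_positives / total_positives
        precision = true_positives / rank
        # Each step adds a rectangle: recall gained times precision at that rank.
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for binary 0/1 predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced toy data: 2 fraudulent users among 8.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]
print(average_precision(toy_labels, toy_scores))
print(f1_score(toy_labels, toy_predictions))
```

Both measures ignore true negatives, which is why they are commonly preferred over plain accuracy when fraud is rare.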
|
32 |
Detecting Fraudulent User Behaviour: A Study of User Behaviour and Machine Learning in Fraud Detection
Gerdelius, Patrik, Hugo, Sjönneby. January 2024
This study aims to create a Machine Learning model and investigate its performance in detecting fraudulent user behaviour on an e-commerce platform. The user data was analysed to identify and extract critical features distinguishing regular users from fraudulent users. Two different types of user data were used, Event Data and Screen Data, spanning four weeks. A Principal Component Analysis (PCA) was applied to the Screen Data to reduce its dimensionality. Feature Engineering was conducted on both Event Data and Screen Data. A Random Forest model, a supervised ensemble method, was used for classification. The data was imbalanced due to a significant difference in the number of frauds compared to regular users. Therefore, two different balancing methods were used: Oversampling (SMOTE) and changing the Probability Threshold (PT) for the classification model. The best result was achieved with the resampled data where the threshold was set to 0.4. With this model, 80.88% of actual frauds were predicted as such, while 0.73% of the regular users were falsely predicted as frauds. While this result was promising, questions are raised regarding its validity, since there is a possibility that the model was over-fitted on the data set. An indication of this was that the result was significantly less accurate without resampling. However, the overall conclusion was that this study indicates that it is possible to distinguish frauds from regular users, with or without resampling. For future research, it would be interesting to see data over a more extended period of time and to train the model on real-time data to counter changes in fraudulent behaviour.
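The effect of moving the probability threshold can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function name, the toy probabilities, and the choice of `>=` as the cut-off rule are all hypothetical. The two returned rates correspond to the two percentages the abstract reports (share of actual frauds caught, share of regular users falsely flagged).

```python
def rates_at_threshold(labels, fraud_probabilities, threshold):
    """Return (true_positive_rate, false_positive_rate) for a given cut-off.

    labels: 1 for fraud, 0 for regular user.
    fraud_probabilities: model output in [0, 1], e.g. from a Random Forest.
    A user is flagged when the probability is at or above the threshold
    (the >= rule is an assumption made for this sketch).
    """
    tp = fp = tn = fn = 0
    for is_fraud, probability in zip(labels, fraud_probabilities):
        flagged = probability >= threshold
        if is_fraud and flagged:
            tp += 1
        elif is_fraud:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical toy data: 3 frauds, 5 regular users.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_probabilities = [0.90, 0.45, 0.20, 0.41, 0.30, 0.10, 0.05, 0.02]

for cutoff in (0.5, 0.4):
    tpr, fpr = rates_at_threshold(toy_labels, toy_probabilities, cutoff)
    print(f"threshold={cutoff}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Lowering the threshold from the usual 0.5 to 0.4 catches more frauds at the cost of more false alarms, which is the trade-off a threshold adjustment exposes on imbalanced data.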
|
33 |
Indicators of Fraud Detection Proficiency and Their Impact on Auditor Judgments in Fraud Risk Assessments and Audit Plan Modifications
Enget, Kathryn Ann. 21 July 2015
The study examines how an individual's level of fraud detection proficiency (an individual possessing formal fraud education or training, informal fraud training, fraud task-specific experience, and/or fraud-related certifications) impacts their performance on fraud risk assessments and modification of audit plans. Further, it explores which of the fraud detection proficiency dimensions are valuable for auditors in situations of high and low levels of fraud risk and how these characteristics interact with professional skepticism. This, as well as the effectiveness and efficiency of the procedures selected, is addressed using a survey-based scenario where one case is embedded with a financial statement fraud and the other is not. Tobit and ordered logit regression models are used to evaluate a sample of 40 auditors and 10 forensic professionals with varying levels of fraud-related experiences, education, training, and certifications against a benchmark panel.
Results demonstrate that fraud certifications are effective in fraud risk assessments but not in audit plan modifications, and that on average individuals holding them tend to over-audit. In addition, fraud-related task-specific experience improves audit plan modification effectiveness. Third, including professional skepticism as an interaction is more reflective of the variable's nature, with results supporting interactions with fraud certifications and informal fraud training in the fraud risk assessment model and with formal fraud training in the audit plan modifications model. Finally, individuals of higher rank, in addition to those with fraud certifications, are more likely to over-audit, while individuals in the no-fraud scenario are more likely to under-audit. This study contributes to the academic literature with regard to a subset of the FJDM proposed by Hammersley (2011), validating professional skepticism as an integral variable in the model, particularly as an interaction variable, and with regard to the impacts of fraud certifications and fraud-related task-specific experience. The study also contributes evidence indicating that, in lower fraud risk situations, individuals are prone to assessing fraud risk less effectively and to under-auditing. Finally, this study contributes a new measure for direct fraud-related experience, which captures more details regarding applicable task-specific experiences. / Ph. D.
|
34 |
RESONANT: Reinforcement Learning Based Moving Target Defense for Detecting Credit Card Fraud
Abdel Messih, George Ibrahim. 20 December 2023
According to security.org, as of 2023, 65% of credit card (CC) users in the US have been subjected to fraud at some point in their lives, which equates to about 151 million Americans. The proliferation of advanced machine learning (ML) algorithms has also contributed to detecting credit card fraud (CCF). However, using a single or static ML-based defense model against a constantly evolving adversary forfeits its structural advantage, which enables the adversary to reverse engineer the defense's strategy over the rounds of an iterated game. This paper proposes an adaptive moving target defense (MTD) approach based on deep reinforcement learning (DRL), termed RESONANT, to identify the optimal switching points to another ML classifier for credit card fraud detection. It identifies optimal moments to strategically switch between different ML-based defense models (i.e., classifiers) to invalidate any adversarial progress and always stay a step ahead of the adversary. We take this approach in an iterated game theoretic manner where the adversary and defender take turns to take their actions in the CCF detection context. Via extensive simulation experiments, we investigate the performance of our proposed RESONANT against that of the existing state-of-the-art counterparts in terms of the mean and variance of detection accuracy and attack success ratio to measure the defensive performance. Our results demonstrate the superiority of RESONANT over other counterparts, including static and naïve ML and an MTD that selects a defense model at random (i.e., Random-MTD). RESONANT outperforms the existing counterparts by up to a factor of two in detection accuracy, measured by AUC (i.e., the area under the Receiver Operating Characteristic (ROC) curve), and in system security against attacks, measured by attack success ratio (ASR).
/ Master of Science / According to security.org, as of 2023, 65% of credit card (CC) users in the US have been subjected to fraud at some point in their lives, which equates to about 151 million Americans. The proliferation of advanced machine learning (ML) algorithms has also contributed to detecting credit card fraud (CCF). However, using a single or static ML-based defense model against a constantly evolving adversary forfeits its structural advantage, which enables the adversary to reverse engineer the defense's strategy over the rounds of an iterated game. This paper proposes an adaptive defense approach based on artificial intelligence (AI), termed RESONANT, to identify the optimal switching points to another ML classifier for credit card fraud detection. It identifies optimal moments to strategically switch between different ML-based defense models (i.e., classifiers) to invalidate any adversarial progress and always stay a step ahead of the adversary. We take this approach in an iterated game theoretic manner where the adversary and defender take turns to take their actions in the CCF detection context. Via extensive simulation experiments, we investigate the performance of our proposed RESONANT against that of the existing state-of-the-art counterparts in terms of the mean and variance of detection accuracy and attack success ratio to measure the defensive performance. Our results demonstrate the superiority of RESONANT over other counterparts, showing that it can outperform them by up to a factor of two in detection accuracy and in system security against attacks.
|
35 |
CREDIT CARD FRAUD DETECTION (Machine learning algorithms) / Kreditkortsbedrägeri med användning av maskininlärningsalgoritmer
Westerlund, Fredrik. January 2017
Credit card fraud is a field with perpetrators performing illegal actions that may affect other individuals or companies negatively. For instance, a criminal can steal credit card information from an account holder and then conduct fraudulent transactions. The activities are a potential contributory factor to how illegal organizations such as terrorists and drug traffickers support themselves financially. Within the machine learning area, there are several methods that are able to detect fraudulent credit card transactions: supervised learning and unsupervised learning algorithms. This essay investigates the supervised approach, where two algorithms (Hellinger Distance Decision Tree (HDDT) and Random Forest) are evaluated on a real-life dataset of 284,807 transactions. Under those circumstances, the main purpose is to develop a “well-functioning” model with a reasonable capacity to categorize transactions as fraudulent or legitimate. As the data is heavily unbalanced, reducing the false-positive rate is also an important part of conducting research in the chosen area. In conclusion, the evaluated algorithms present a fairly similar outcome, where both models are able to distinguish the classes from each other. However, the Random Forest approach performs better than HDDT in all measures of interest.
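For readers unfamiliar with HDDT, the split criterion it is named after can be sketched briefly. The code below is an illustration, not the essay's implementation: the function name and the toy counts are hypothetical. It computes the Hellinger distance between the fraud and legitimate class distributions over the branches of a candidate split; larger values indicate a split that separates the classes better.

```python
from math import sqrt


def hellinger_distance(fraud_counts, legit_counts):
    """Hellinger distance between two class distributions over split branches.

    fraud_counts[j] and legit_counts[j] are the numbers of fraudulent and
    legitimate transactions that fall into branch j of a candidate split.
    Each class is normalised by its own total, so the value does not depend
    on how rare the fraud class is. Assumes both classes are non-empty.
    """
    total_fraud = sum(fraud_counts)
    total_legit = sum(legit_counts)
    return sqrt(sum(
        (sqrt(f / total_fraud) - sqrt(g / total_legit)) ** 2
        for f, g in zip(fraud_counts, legit_counts)
    ))


# Hypothetical two-branch splits on heavily unbalanced data.
print(hellinger_distance([10, 0], [0, 1000]))   # classes fully separated
print(hellinger_distance([5, 5], [500, 500]))   # split carries no information
print(hellinger_distance([8, 2], [100, 900]))   # partially informative split
```

Because each class is normalised separately, multiplying the number of legitimate transactions by any factor leaves the distance unchanged, which is the property that makes the criterion attractive for unbalanced data.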
|
36 |
Applying Simulation to the Problem of Detecting Financial Fraud
Lopez-Rojas, Edgar Alonso. January 2016
This thesis introduces a financial simulation model covering two related financial domains: Mobile Payments and Retail Stores systems. The problem we address in these domains is different types of fraud. We limit ourselves to isolated cases of relatively straightforward fraud. However, in this thesis the ultimate aim is to introduce our approach towards the use of computer simulation for fraud detection and its applications in financial domains. Fraud is an important problem that impacts the whole economy. Currently, there is a lack of public research into the detection of fraud. One important reason is the lack of transaction data, which is often sensitive. To address this problem we present a mobile money Payment Simulator (PaySim) and Retail Store Simulator (RetSim), which allow us to generate synthetic transactional data that contains both normal customer behaviour and fraudulent behaviour. These simulations are Multi Agent-Based Simulations (MABS) and were calibrated using real data from financial transactions. We developed agents that represent the clients and merchants in PaySim and customers and salesmen in RetSim. The normal behaviour was based on behaviour observed in data from the field, and is codified in the agents as rules of transactions and interaction between clients and merchants, or customers and salesmen. Some of these agents were intentionally designed to act fraudulently, based on observed patterns of real fraud. We introduced known signatures of fraud in our model and simulations to test and evaluate our fraud detection methods. The resulting behaviour of the agents generates a synthetic log of all transactions as a result of the simulation. This synthetic data can be used to further advance fraud detection research, without leaking sensitive information about the underlying data or breaking any non-disclosure agreements.
Using statistics and social network analysis (SNA) on real data, we calibrated the relations between our agents and generated realistic synthetic data sets that were verified against the domain and validated statistically against the original source. We then used the simulation tools to model common fraud scenarios to ascertain exactly how effective fraud detection techniques are, such as the simplest form of statistical threshold detection, which is perhaps the most common in use. The preliminary results show that threshold detection is effective enough at keeping fraud losses at a set level. This means that there seems to be little economic room for improved fraud detection techniques. We also implemented other applications for the simulator tools, such as the setup of a triage model and the measurement of the cost of fraud. This proved to be an important help for managers who aim to prioritise fraud detection and want to know how much they should invest in it to keep the losses below a desired limit according to different experimented and expected scenarios of fraud.
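The idea that threshold detection keeps fraud losses at a set level can be sketched in a few lines. This is a hypothetical illustration, not PaySim or RetSim code: the function name, the tuple layout of the log, and the amounts are all assumptions. Any transaction above the threshold is flagged for review, so each undetected fraud is bounded by the threshold amount.

```python
def threshold_detection(transactions, threshold):
    """Flag every transaction whose amount exceeds a fixed threshold.

    transactions: list of (amount, is_fraud) pairs, e.g. a synthetic log.
    Returns (flagged, undetected_fraud_loss, false_alarms).
    """
    flagged = [t for t in transactions if t[0] > threshold]
    # Fraud at or below the threshold slips through and counts as a loss.
    undetected_fraud_loss = sum(
        amount for amount, is_fraud in transactions
        if is_fraud and amount <= threshold
    )
    false_alarms = sum(1 for _, is_fraud in flagged if not is_fraud)
    return flagged, undetected_fraud_loss, false_alarms


# Hypothetical synthetic log: (amount, is_fraud).
synthetic_log = [
    (50, False), (900, True), (120, False),
    (80, True), (1500, True), (2000, False),
]
flagged, loss, alarms = threshold_detection(synthetic_log, threshold=500)
print(flagged, loss, alarms)
```

Raising or lowering the threshold trades review workload (false alarms) against the amount of fraud that passes unnoticed, which is the kind of cost question a simulator can explore without access to real transaction data.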
|
37 |
An Experimental Examination of the Effects of Fraud Specialist and Audit Mindsets on Fraud Risk Assessments and on the Development of Fraud-Related Problem Representations
Chui, Lawrence. 08 1900
Fraud risk assessment is an important audit process that has a direct impact on the effectiveness of auditors' fraud detection in an audit. However, prior literature has shown that auditors are generally poor at assessing fraud risk. The Public Company Accounting Oversight Board (PCAOB) suggests that auditors may improve their fraud risk assessment performance by adopting a fraud specialist mindset. A fraud specialist mindset is a special way of thinking about accounting records. While auditors think about the company's recorded transactions in terms of the availability of supporting documentation and the authenticity of the audit trail, fraud specialists think instead of accounting records in terms of the authenticity of the events and activities that are behind the reported transactions. Currently there is no study that has examined the effects of the fraud specialist mindset on auditors' fraud risk assessment performance. In addition, although recent studies have found that fraud specialists are more sensitive than auditors in discerning fraud risk factors in situations where a high level of fraud risk is present, it remains unclear whether the same can be said for situations where the risk of fraud is low. Thus, the purpose of my dissertation is to examine the effects of fraud specialist and audit mindsets on fraud risk assessment performance. In addition, I examined such effects on fraud risk assessment performance in both high and low fraud risk conditions. The contributions of my dissertation include being the first to experimentally examine how different mindsets impact fraud-related judgment. The results of my study have the potential to help address the PCAOB's desire to improve auditors' fraud risk assessment performance through the adoption of the fraud specialist mindset. In addition, my study contributes to the literature by exploring fraud-related problem representation as a possible mediator of mindset on fraud risk assessment performance.
I executed my dissertation by conducting an experiment in which mindset (fraud specialist or audit) was induced prior to the completion of an audit case (high or low in fraud risk). A total of 85 senior-level accounting students enrolled in two separate auditing classes participated in my study. The results from my experiment provide empirical support that it is possible to improve auditors' fraud risk assessment through adopting the fraud specialist mindset. My study also provides preliminary evidence that individuals with the fraud specialist mindset developed different problem representations than those with the audit mindset.
|
38 |
Learning-based Attack and Defense on Recommender Systems
Palanisamy Sundar, Agnideven. 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / The internet is home to massive volumes of valuable data that are constantly being created, making it difficult for users to find information relevant to them. In recent times, online users have been relying on the recommendations made by websites to narrow down the options. Online reviews have also become an increasingly important factor in the final choice of a customer. Unfortunately, attackers have found ways to manipulate both reviews and recommendations to mislead users. A Recommendation System is a special type of information filtering system adopted by online vendors to provide suggestions to their customers based on their requirements. Collaborative filtering is one of the most widely used recommendation techniques; unfortunately, it is prone to shilling/profile injection attacks. Such attacks alter the recommendation process to promote or demote a particular product. On the other hand, many spammers write deceptive reviews to change the credibility of a product/service. This work aims to address these issues by treating the review manipulation and shilling attack scenarios independently. For the shilling attacks, we build an efficient Reinforcement Learning-based shilling attack method. This method reduces the uncertainty associated with the item selection process and finds the optimal items to enhance attack reach while treating the recommender system as a black box. Such practical online attacks open new avenues for research in building more robust recommender systems. When it comes to review manipulations, we introduce a method that uses a deep structure embedding approach, which preserves highly nonlinear structural information and the dynamic aspects of user reviews, to identify and cluster the spam users. It is worth mentioning that, in the experiment with real datasets, our method captures about 92% of all spam reviewers using an unsupervised learning approach.
|
39 |
Imbalanced Learning and Feature Extraction in Fraud Detection with Applications / Obalanserade Metoder och Attribut Aggregering för Upptäcka Bedrägeri, med Appliceringar
Jacobson, Martin. January 2021
This thesis deals with fraud detection in a real-world environment with datasets coming from Svenska Handelsbanken. The goal was to investigate how well machine learning can classify fraudulent transactions and how new additional features affected classification. The models used were EFSVM, RUTSVM, CS-SVM, ELM, MLP, Decision Tree, Extra Trees, and Random Forests. To determine the best results, the Matthews Correlation Coefficient was used as the performance metric, which has been shown to have a medium bias for imbalanced datasets. Each model could deal with highly imbalanced datasets, which are common in fraud detection. The best results were achieved with Random Forest and Extra Trees. The best scores were around 0.4 for the real-world datasets, though the score itself says little, as it is more a testimony to the dataset’s separability. These scores were obtained when using aggregated features and not the standard raw dataset. Recall scores were around 0.88-0.93, with an increase in precision of 34.4%-67%, resulting in a large decrease in False Positives. Evaluation results differed greatly from the test runs, showing either a substantial increase or decrease. Two theories as to why are discussed: a large distribution change in the evaluation set, and the sample size increase (100%) for evaluation, which could have led to the tests not being representative of the performance. Feature aggregation was a central topic of this thesis, with the main focus on behaviour features, which can describe patterns and habits of customers. For these there were five categories: sender’s fraud history, sender’s transaction history, sender’s time transaction history, sender’s history to receiver, and receiver’s history. Of these, the largest performance increase came from the first, which gave the top score; the other datasets did not show as much potential, with most not increasing the results.
Further studies need to be done before discarding these features, to be certain they don’t improve performance. Together with the data aggregation, a tool (t-SNE) to visualize high-dimensional data was used to great success. With it, an early understanding could be gained of what newly added features would bring to classification. For the best dataset it could be seen that a new sub-cluster of transactions had been created, leading to the belief that classification scores could improve, which they did. Feature selection and PCA-reduction techniques were also studied; PCA showed good results and increased performance, while feature selection showed no conclusive improvements. Over- and under-sampling were used and neither improved the scores, though undersampling could maintain the results, which is interesting when increasing the dataset. / This thesis is about detecting fraud in a real-world environment with data from Svenska Handelsbanken. The goal was to investigate how good machine learning is at classifying fraudulent transactions, and how new attributes help classification. The methods used were EFSVM, RUTSVM, CS-SVM, ELM, MLP, Decision Tree, Extra Trees and Random Forests. The Matthews Correlation Coefficient was used to evaluate results, as it has been shown to have little dependence on dataset imbalance. Each model has built-in settings for handling imbalanced datasets, which is important for detecting fraud. In terms of results, Random Forest and Extra Trees turned out to be best, although no p-tests were performed because the datasets were relatively small, which means that small differences in results are not reliable. The highest scores were about 0.4; the absolute value says nothing more than giving an indication of the degree of separation between the classes. The best results were obtained when new aggregated attributes were used rather than the standard dataset.
These results had recall values of 0.88-0.93, and for these precision could be seen to increase by 34.4%-67%, which gives a large reduction in False Positives. The evaluation results differed greatly from the test results, with either a substantial increase or decrease. Two reasons why were discussed: a change in the evaluation data relative to the test data, or that the size increase (100%) for evaluation led to the tests not being representative. Attribute aggregation was a central topic, with a focus on behaviour patterns to describe customers' habits. For these there were five categories: the sender's fraud history, the sender's transaction history, the sender's history of transaction times, the sender's history to the receiver, and the receiver's history. Of these, the largest performance increase came from fraud history; the other attributes did not have equally positive results, and most did not improve the results. Further, more extensive studies must be done before these attributes can be said to be useful or not. Together with data aggregation, t-SNE was used successfully to visualize high-dimensional data. With t-SNE, an early understanding can be gained of what to expect from added attributes in classification. For the best dataset one can see that a new cluster had been created, which can be interpreted as the data being more descriptive. The results were also expected to improve there, which they did. Attribute selection and PCA dimensionality reduction were studied, and PCA showed an improvement in results. Over- and under-sampling were tested and could not improve the results, although undersampling could maintain the results, which is interesting if the amount of data increases.
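The performance metric used in this thesis, the Matthews Correlation Coefficient, can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only: the function name and the example counts are hypothetical, and returning 0 when the denominator vanishes is a common convention assumed here rather than something stated in the abstract.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when any marginal total is zero,
    which is a convention assumed for this sketch.
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts on an imbalanced set: 2 frauds, 6 legitimate.
print(matthews_corrcoef(tp=2, fp=1, fn=0, tn=5))   # good but imperfect
print(matthews_corrcoef(tp=2, fp=0, fn=0, tn=6))   # perfect classifier
print(matthews_corrcoef(tp=0, fp=0, fn=2, tn=6))   # always predicts legitimate
```

Unlike accuracy, the coefficient uses all four cells, so a model that labels everything as legitimate scores 0 here even though it would be 75% accurate on the toy counts.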
|
40 |
Explainable AI methods for credit card fraud detection: Evaluation of LIME and SHAP through a User Study
Ji, Yingchao. January 2021
In the past few years, Artificial Intelligence (AI) has evolved into a powerful tool applied in multi-disciplinary fields to resolve sophisticated problems. As AI becomes more powerful and ubiquitous, the AI methods often also become opaque, which might lead to trust issues for the users of the AI systems as well as a failure to meet the legal requirements of AI transparency. In this report, the possibility of making a credit card fraud detection support system explainable to users is investigated through a quantitative survey. A publicly available credit card dataset was used. Deep Learning and Random Forest were the two Machine Learning (ML) methods implemented and applied to the credit card fraud dataset, and their performance was evaluated in terms of accuracy, recall, sufficiency, and F1 score. After that, two explainable AI (XAI) methods, SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-Agnostic Explanations), were implemented and applied to the results obtained from these two ML methods. Finally, the XAI results were evaluated through a quantitative survey. The results from the survey revealed that the XAI explanations can slightly increase the users' impression of the system's ability to reason, and that LIME had a slight advantage over SHAP in terms of explainability. Further investigation of visualizing data pre-processing and the training process is suggested to offer deeper explanations for users.
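The quantity that SHAP attributes to each feature can be illustrated with an exact Shapley value computation on a toy model. This is a sketch under stated assumptions and not the SHAP library or the report's code: the function names and toy models are hypothetical, a single baseline vector stands in for "feature absent", and the brute-force loop over all feature orderings is only feasible for a handful of features.

```python
from itertools import permutations
from math import factorial


def exact_shapley_values(predict, instance, baseline):
    """Exact Shapley values of each feature for one prediction.

    predict: function taking a list of feature values and returning a number.
    instance: the feature values being explained.
    baseline: reference values used when a feature is treated as absent
              (a simplifying assumption of this sketch).
    """
    n = len(instance)
    contributions = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous_output = predict(current)
        for feature in order:
            # Switch this feature from its baseline to its actual value
            # and record how much the model output changes.
            current[feature] = instance[feature]
            new_output = predict(current)
            contributions[feature] += new_output - previous_output
            previous_output = new_output
    return [c / factorial(n) for c in contributions]


def toy_linear_score(z):
    """Hypothetical fraud score that is linear in three features."""
    return 0.5 * z[0] + 2.0 * z[1] - 1.0 * z[2] + 0.1


def toy_interaction_score(z):
    """Hypothetical score where two features only matter together."""
    return z[0] * z[1]


print(exact_shapley_values(toy_linear_score, [1, 2, 3], [0, 0, 0]))
print(exact_shapley_values(toy_interaction_score, [1, 1], [0, 0]))
```

For the linear model each feature receives its weight times its deviation from the baseline, and for the interaction model the joint effect is split evenly. In both cases the values sum to the difference between the explained prediction and the baseline prediction, which is the additivity property that gives SHAP its name.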
|