Spelling suggestions: "subject:"adversarial machine learning"" "subject:"adversarialt machine learning""
1 |
Automated Attacks on Compression-Based ClassifiersBurago, Igor 29 September 2014 (has links)
Methods of compression-based text classification have proven their usefulness for various applications. However, in some classification problems, such as spam filtering, a classifier confronts one or many adversaries willing to induce errors in the classifier's judgment on certain kinds of input. In this thesis, we consider the problem of finding thrifty strategies for character-based text modification that allow an adversary to revert classifier's verdict on a given family of input texts. We propose three statistical statements of the problem that can be used by an attacker to obtain transformation models which are optimal in some sense. Evaluating these three techniques on a realistic spam corpus, we find that an adversary can transform a spam message (detectable as such by an entropy-based text classifier) into a legitimate one by generating and appending, in some cases, as few additional characters as 20% of the original length of the message.
|
2 |
On the Neural Representation for Adversarial Attack and DefenseQiuling Xu (17121274) 20 October 2023 (has links)
<p dir="ltr">Neural representations are high-dimensional embeddings generated during the feed-forward process of neural networks. These embeddings compress raw input information and extract abstract features beneficial for downstream tasks. However, effectively utilizing these representations poses challenges due to their inherent complexity. This complexity arises from the non-linear relationship between inputs and neural representations, as well as the diversity of the learning process.</p><p dir="ltr">In this thesis, we propose effective methods to utilize neural representations for adversarial attack and defense. Our approach generally involves decomposing complex neural representations into smaller, more analyzable parts. We also seek general patterns emerging during learning to better understand the semantic meaning associated with neural representations.</p><p dir="ltr">We demonstrate that formalizing neural representations can reveal models' weaknesses and aid in defending against poison attacks. Specifically, we define a new type of adversarial attack using neural style, a special component of neural representation. This new attack uncovers novel aspects of the models' vulnerabilities. </p><p dir="ltr">Furthermore, we develop an interpretation of neural representations by approximating their marginal distribution, treating intermediate neurons as feature indicators. By properly harnessing these rich feature indicators, we address scalability and imperceptibility issues related to pixel-wise bounds.</p><p dir="ltr">Finally, we discover that neural representations contain crucial information about how neural networks make decisions. Leveraging the general patterns in neural representations, we design algorithms to remove unwanted and harmful functionalities from neural networks, thereby mitigating poison attacks.</p>
|
3 |
USING RANDOMNESS TO DEFEND AGAINST ADVERSARIAL EXAMPLES IN COMPUTER VISIONHuangyi Ge (14187059) 29 November 2022 (has links)
<p>Computer vision applications such as image classification and object detection often suffer from adversarial examples. For example, adding a small amount of noise to input images can trick the model into misclassification. Over the years, many defense mechanisms have been proposed, and different researchers have made seemingly contradictory claims on their effectiveness. This dissertation first presents an analysis of possible adversarial models and proposes an evaluation framework for comparing different more powerful and realistic adversary strategies. Then, this dissertation proposes two randomness-based defense mechanisms Random Spiking (RS) and MoNet to improve the robustness of image classifiers. Random Spiking generalizes dropout and introduces random noises in the training process in a controlled manner. MoNet uses the combination of secret randomness and Floyd-Steinberg dithering. Specifically, input images are first processed using Floyd-Steinberg dithering to reduce their color depth, and then the pixels are encrypted using the AES block cipher under a secret, random key. Evaluations under our proposed framework suggest RS and MoNet deliver better protection against adversarial examples than many existing schemes. Notably, MoNet significantly improves the resilience against transferability of adversarial examples, at the cost of a small drop in prediction accuracy. Furthermore, we extend the usage of MoNet to the object detection network and use it to align with model ensemble strategies (Affirmative and WBF (weighted fusion boxes)) and Test Time Augmentation (TTA). We call such a strategy 3MIX. Evaluations found that 3Mix can significantly improve the mean average precision (mAP) on both benign inputs and adversarial examples. In addition, 3Mix is a lightweight approach to migrate the adversarial examples without training new models.</p>
|
4 |
Building trustworthy machine learning systems in adversarial environmentsWang, Ning 26 May 2023 (has links)
Modern AI systems, particularly with the rise of big data and deep learning in the last decade, have greatly improved our daily life and at the same time created a long list of controversies. AI systems are often subject to malicious and stealthy subversion that jeopardizes their efficacy. Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly boost the accuracy of machine learning models, they also create opportunities for adversaries to tamper with models or extract sensitive data. Malicious data providers can compromise machine learning systems by supplying false data and intermediate computation results. Even a well-trained model can be deceived to misbehave by an adversary who provides carefully designed inputs. Furthermore, curious parties can derive sensitive information of the training data by interacting with a machine-learning model. These adversarial scenarios, known as poisoning attack, adversarial example attack, and inference attack, have demonstrated that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust.
To address these problems, we proposed the following solutions: (1) FLARE, which detects and mitigates stealthy poisoning attacks by leveraging latent space representations; (2) MANDA, which detects adversarial examples by utilizing evaluations from diverse sources, i.e, model-based prediction and data-based evaluation; (3) FeCo which enhances the robustness of machine learning-based network intrusion detection systems by introducing a novel representation learning method; and (4) DP-FedMeta, which preserves data privacy and improves the privacy-accuracy trade-off in machine learning systems through a novel adaptive clipping mechanism. / Doctor of Philosophy / Over the past few decades, machine learning (ML) has become increasingly popular for enhancing efficiency and effectiveness in data analytics and decision-making. Notable applications include intelligent transportation, smart healthcare, natural language generation, intrusion detection, etc. While machine learning methods are often employed for beneficial purposes, they can also be exploited for malicious intents. Well-trained language models have demonstrated generalizability deficiencies and intrinsic biases; generative ML models used for creating art have been repurposed by fraudsters to produce deepfakes; and facial recognition models trained on big data have been found to leak sensitive information about data owners.
Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly improve the accuracy of ML models, they also enable adversaries to corrupt models and infer sensitive data. This leads to various adversarial attacks, such as model poisoning during training, adversarially crafted data in testing, and data inference. It is evident that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust.
This research focuses on building trustworthy machine-learning systems in adversarial environments from a data perspective. It encompasses two themes: securing ML systems against security or privacy vulnerabilities (security of AI) and using ML as a tool to develop novel security solutions (AI for security). For the first theme, we studied adversarial attack detection in both the training and testing phases and proposed FLARE and MANDA to secure matching learning systems in the two phases, respectively. Additionally, we proposed a privacy-preserving learning system, dpfed, to defend against privacy inference attacks. We achieved a good trade-off between accuracy and privacy by proposing an adaptive data clipping and perturbing method. In the second theme, the research is focused on enhancing the robustness of intrusion detection systems through data representation learning.
|
5 |
Adversarial Anomaly DetectionRadhika Bhargava (7036556) 02 August 2019 (has links)
<p>Considerable attention has been given to the vulnerability of machine learning to adversarial samples. This is particularly critical in anomaly detection; uses such as detecting fraud, intrusion, and malware must assume a malicious adversary. We specifically address poisoning attacks, where the adversary injects carefully crafted benign samples into the data, leading to concept drift that causes the anomaly detection to misclassify the actual attack as benign. Our goal is to estimate the vulnerability of an anomaly detection method to an unknown attack, in particular the expected</p>
<p>minimum number of poison samples the adversary would need to succeed. Such an estimate is a necessary step in risk analysis: do we expect the anomaly detection to be sufficiently robust to be useful in the face of attacks? We analyze DBSCAN, LOF,</p>
<p>one-class SVM as an anomaly detection method, and derive estimates for robustness to poisoning attacks. The analytical estimates are validated against the number of poison samples needed for the actual anomalies in standard anomaly detection test</p>
<p>datasets. We then develop defense mechanism, based on the concept drift caused by the poisonous points, to identify that an attack is underway. We show that while it is possible to detect the attacks, it leads to a degradation in the performance of the</p>
<p>anomaly detection method. Finally, we investigate whether the generated adversarial samples for one anomaly detection method transfer to another anomaly detection method.</p>
|
6 |
Robustness of Neural Networks for Discrete Input: An Adversarial PerspectiveEbrahimi, Javid 30 April 2019 (has links)
In the past few years, evaluating on adversarial examples has become a standard
procedure to measure robustness of deep learning models. Literature on adversarial
examples for neural nets has largely focused on image data, which are represented as
points in continuous space. However, a vast proportion of machine learning models
operate on discrete input, and thus demand a similar rigor in understanding their
vulnerabilities and robustness. We study robustness of neural network architectures
for textual and graph inputs, through the lens of adversarial input perturbations.
We will cover methods for both attacks and defense; we will focus on 1) addressing
challenges in optimization for creating adversarial perturbations for discrete data;
2) evaluating and contrasting white-box and black-box adversarial examples; and 3)
proposing efficient methods to make the models robust against adversarial attacks.
|
7 |
Adversarial Deep Learning Against Intrusion Detection ClassifiersRigaki, Maria January 2017 (has links)
Traditional approaches in network intrusion detection follow a signature-based ap- proach, however the use of anomaly detection approaches based on machine learning techniques have been studied heavily for the past twenty years. The continuous change in the way attacks are appearing, the volume of attacks, as well as the improvements in the big data analytics space, make machine learning approaches more alluring than ever. The intention of this thesis is to show that using machine learning in the intrusion detection domain should be accompanied with an evaluation of its robustness against adversaries. Several adversarial techniques have emerged lately from the deep learning research, largely in the area of image classification. These techniques are based on the idea of introducing small changes in the original input data in order to make a machine learning model to misclassify it. This thesis follows a big data Analytics methodol- ogy and explores adversarial machine learning techniques that have emerged from the deep learning domain, against machine learning classifiers used for network intrusion detection. The study looks at several well known classifiers and studies their performance under attack over several metrics, such as accuracy, F1-score and receiver operating character- istic. The approach used assumes no knowledge of the original classifier and examines both general and targeted misclassification. The results show that using relatively sim- ple methods for generating adversarial samples it is possible to lower the detection accuracy of intrusion detection classifiers from 5% to 28%. Performance degradation is achieved using a methodology that is simpler than previous approaches and it re- quires only 6.25% change between the original and the adversarial sample, making it a candidate for a practical adversarial approach.
|
8 |
Defending Against Adversarial Attacks Using Denoising AutoencodersRehana Mahfuz (8617635) 24 April 2020 (has links)
Gradient-based adversarial attacks on neural networks threaten extremely critical applications such as medical diagnosis and biometric authentication. These attacks use the gradient of the neural network to craft imperceptible perturbations to be added to the test data, in an attempt to decrease the accuracy of the network. We propose a defense to combat such attacks, which can be modified to reduce the training time of the network by as much as 71%, and can be further modified to reduce the training time of the defense by as much as 19%. Further, we address the threat of uncertain behavior on the part of the attacker, a threat previously overlooked in the literature that considers mostly white box scenarios. To combat uncertainty on the attacker's part, we train our defense with an ensemble of attacks, each generated with a different attack algorithm, and using gradients of distinct architecture types. Finally, we discuss how we can prevent the attacker from breaking the defense by estimating the gradient of the defense transformation.
|
9 |
A Different Approach to Attacking and Defending Deep Neural NetworksFourati, Fares 06 1900 (has links)
Adversarial examples are among the most widespread attacks in adversarial machine learning. In this work, we define new targeted and non-targeted attacks that are computationally less expensive than standard adversarial attacks. Besides practical purposes in some scenarios, these attacks can improve our understanding of the robustness of machine learning models. Moreover, we introduce a new training scheme to improve the performance of pre-trained neural networks and defend against our attacks. We examine the differences between our method, standard training, and standard adversarial training on pre-trained models. We find that our method protects the networks better against our attacks. Furthermore, unlike usual adversarial training, which reduces standard accuracy when applied to previously trained networks, our method maintains and sometimes even improves standard accuracy.
|
10 |
A Model Extraction Attack on Deep Neural Networks Running on GPUsO'Brien Weiss, Jonah G 09 August 2023 (has links) (PDF)
Deep Neural Networks (DNNs) have become ubiquitous due to their performance on prediction and classification problems. However, they face a variety of threats as their usage spreads. Model extraction attacks, which steal DNN models, endanger intellectual property, data privacy, and security. Previous research has shown that system-level side channels can be used to leak the architecture of a victim DNN, exacerbating these risks. We propose a novel DNN architecture extraction attack, called EZClone, which uses aggregate rather than time-series GPU profiles as a side-channel to predict DNN architecture. This approach is not only simpler, but also requires less adversary capability than earlier works. We investigate the effectiveness of EZClone under various scenarios including reduction of attack complexity, against pruned models, and across GPUs with varied resources. We find that EZClone correctly predicts DNN architectures for the entire set of PyTorch vision architectures with 100\% accuracy. No other work has shown this degree of architecture prediction accuracy with the same adversarial constraints or using aggregate side-channel information. Prior work has shown that, once a DNN has been successfully cloned, further attacks such as model evasion or model inversion can be accelerated significantly. Then, we evaluate several mitigation techniques against EZClone, showing that carefully inserted dummy computation reduces the success rate of the attack.
|
Page generated in 0.0926 seconds