1 |
MixUp as Directional Adversarial Training: A Unifying Understanding of MixUp and Adversarial Training
Perrault Archambault, Guillaume 29 April 2020 (has links)
This thesis aims to contribute to the field of neural networks by improving upon the performance of a state-of-the-art regularization scheme called MixUp, and by contributing to the conceptual understanding of MixUp. MixUp is a data augmentation scheme in which pairs of training samples and their corresponding labels are mixed using linear coefficients. Without label mixing, MixUp becomes a more conventional scheme: input samples are moved but their original labels are retained. Because samples are preferentially moved in the direction of other classes, we refer to this method as directional adversarial training, or DAT. We show that under two mild conditions, MixUp asymptotically converges to a subset of DAT. We define untied MixUp (UMixUp), a superset of MixUp wherein training labels are mixed with linear coefficients different from those of their corresponding samples. We show that under the same mild conditions, untied MixUp converges to the entire class of DAT schemes. Motivated by the understanding that UMixUp is both a generalization of MixUp and a scheme possessing adversarial-training properties, we experiment with different datasets and loss functions to show that UMixUp improves performance over MixUp. In short, we present a novel interpretation of MixUp as belonging to a class highly analogous to adversarial training, and on this basis we introduce a simple generalization which outperforms MixUp.
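To make the mixing step concrete, the following is a minimal PyTorch-style sketch of MixUp and of an untied variant; the Beta-distributed coefficient, the one-hot label encoding, and the particular rule relating the label coefficient to the sample coefficient (`lam ** 2`) are illustrative assumptions, not the formulation derived in the thesis.

```python
import torch

def mixup_batch(x, y_onehot, alpha=1.0, untied=False):
    """Mix random pairs of samples and labels with linear coefficients.

    Standard MixUp uses the same coefficient lam for inputs and labels; in the
    untied variant the label coefficient may differ from the sample coefficient.
    The mapping lam -> lam**2 below is only a placeholder for whatever relation
    the thesis derives."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))                 # random pairing within the batch

    x_mixed = lam * x + (1.0 - lam) * x[perm]        # mix the inputs

    lam_label = lam if not untied else lam ** 2      # hypothetical untied rule
    y_mixed = lam_label * y_onehot + (1.0 - lam_label) * y_onehot[perm]
    return x_mixed, y_mixed

# Example usage with a dummy batch of 8 CIFAR-sized images and 10 classes.
x = torch.randn(8, 3, 32, 32)
y_onehot = torch.eye(10)[torch.randint(0, 10, (8,))]
x_mix, y_mix = mixup_batch(x, y_onehot, alpha=1.0, untied=True)
```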
|
2 |
A Different Approach to Attacking and Defending Deep Neural Networks
Fourati, Fares 06 1900 (has links)
Adversarial examples are among the most widespread attacks in adversarial machine learning. In this work, we define new targeted and non-targeted attacks that are computationally less expensive than standard adversarial attacks. Besides their practical value in some scenarios, these attacks can improve our understanding of the robustness of machine learning models. Moreover, we introduce a new training scheme to improve the performance of pre-trained neural networks and defend against our attacks. We examine the differences between our method, standard training, and standard adversarial training on pre-trained models. We find that our method protects the networks better against our attacks. Furthermore, unlike usual adversarial training, which reduces standard accuracy when applied to previously trained networks, our method maintains and sometimes even improves standard accuracy.
|
3 |
Semi-supervised Learning for Real-world Object Recognition using Adversarial Autoencoders
Mittal, Sudhanshu January 2017 (has links)
For many real-world applications, labeled data can be costly to obtain. Semi-supervised learning methods make use of abundantly available unlabeled data along with a few labeled samples. Most of the latest work on semi-supervised learning for image classification reports performance on standard machine learning datasets such as MNIST, SVHN, etc. In this work, we propose a convolutional adversarial autoencoder architecture for real-world data. We demonstrate the application of this architecture to semi-supervised object recognition. We show that our approach can learn from limited labeled data and outperform a fully supervised CNN baseline by about 4% on real-world datasets. We also achieve competitive performance on the MNIST dataset compared to state-of-the-art semi-supervised learning techniques. To spur research in this direction, we compiled two real-world datasets: the Internet (WIS) dataset and the Real-world (RW) dataset, each consisting of more than 20K labeled samples of small household objects belonging to ten classes. We also show a possible application of this method to online learning in robotics.
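The abstract builds on the adversarial-autoencoder idea, which can be sketched as follows under standard assumptions: a discriminator pushes latent codes toward a Gaussian prior while a supervised head uses the few labeled samples. The small MLP modules here are placeholders for the convolutional architecture proposed in the thesis.

```python
import torch
import torch.nn as nn

# Placeholder modules; the thesis uses convolutional encoders/decoders.
enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 8))
dec = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 784))
cls = nn.Linear(8, 10)                                                 # supervised head on the latent code
disc = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))    # discriminator on the latent code

def aae_losses(x_unlab, x_lab, y_lab):
    """Reconstruction and adversarial losses on unlabeled data plus a supervised
    loss on the few labeled samples."""
    bce = nn.BCEWithLogitsLoss()
    z_u = enc(x_unlab)
    rec = nn.functional.mse_loss(dec(z_u), x_unlab.flatten(1))         # reconstruct unlabeled inputs
    z_prior = torch.randn_like(z_u)                                    # samples from the imposed prior
    d_loss = bce(disc(z_prior), torch.ones(len(z_u), 1)) + \
             bce(disc(z_u.detach()), torch.zeros(len(z_u), 1))         # discriminator: prior vs. encoded
    g_loss = bce(disc(z_u), torch.ones(len(z_u), 1))                   # encoder tries to fool the discriminator
    sup = nn.functional.cross_entropy(cls(enc(x_lab)), y_lab)          # few labeled samples
    return rec, d_loss, g_loss, sup
```

In practice the four losses would be optimized with separate optimizers in alternating steps, which is omitted here for brevity.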
|
4 |
Robust Anomaly Detection in Critical Infrastructure
Abdelaty, Maged Fathy Youssef 14 September 2022
Critical Infrastructures (CIs) such as water treatment plants, power grids and telecommunication networks are critical to the daily activities and well-being of our society. Disruption of such CIs would have catastrophic consequences for public safety and the national economy. Hence, these infrastructures have become major targets in the upsurge of cyberattacks. Defending against such attacks often depends on an arsenal of cyber-defence tools, including Machine Learning (ML)-based Anomaly Detection Systems (ADSs). These detection systems use ML models to learn the profile of the normal behaviour of a CI and classify deviations that go well beyond the normality profile as anomalies. However, ML methods are vulnerable to both adversarial and non-adversarial input perturbations. Adversarial perturbations are imperceptible noises added to the input data by an attacker to evade the classification mechanism. Non-adversarial perturbations can arise from normal behaviour evolving as a result of changes in usage patterns or other characteristics, or from noisy data produced by normally degrading devices, and generate a high rate of false positives. We first study the problem of ML-based ADSs being vulnerable to non-adversarial perturbations, which causes a high rate of false alarms. To address this problem, we propose an ADS called DAICS, based on a wide-and-deep learning model that is both adaptive to evolving normality and robust to noisy data normally emerging from the system. DAICS adapts the pre-trained model to new normality with a small number of data samples and a few gradient updates, based on feedback from the operator on false alarms. DAICS was evaluated on two datasets collected from real-world Industrial Control System (ICS) testbeds. The results show that the adaptation process is fast and that DAICS has improved robustness compared to state-of-the-art approaches. We further investigated the problem of false-positive alarms in ADSs. To address this problem, an extension of DAICS, called the SiFA framework, is proposed. SiFA collects a buffer of historical false alarms and suppresses every new alarm that is similar to these false alarms. The proposed framework is evaluated using a dataset collected from a real-world ICS testbed. The evaluation results show that SiFA can decrease the false alarm rate of DAICS by more than 80%.
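A minimal sketch of the alarm-suppression idea attributed to SiFA above; the cosine-similarity test, the threshold, and the buffer size are illustrative assumptions rather than the thesis design.

```python
import numpy as np

class FalseAlarmSuppressor:
    """Keep a buffer of operator-confirmed false alarms and drop new alarms
    that look like them (cosine similarity above a threshold)."""

    def __init__(self, threshold=0.95, max_size=500):
        self.buffer = []            # feature vectors of past false alarms
        self.threshold = threshold
        self.max_size = max_size

    def add_false_alarm(self, features):
        self.buffer.append(np.asarray(features, dtype=float))
        self.buffer = self.buffer[-self.max_size:]

    def is_suppressed(self, features):
        f = np.asarray(features, dtype=float)
        for past in self.buffer:
            sim = f @ past / (np.linalg.norm(f) * np.linalg.norm(past) + 1e-12)
            if sim >= self.threshold:
                return True
        return False
```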
We also investigate the problem of ML-based network ADSs that are vulnerable to adversarial perturbations. In the case of network ADSs, attackers may use their knowledge of anomaly detection logic to generate malicious traffic that remains undetected. One way to solve this issue is to adopt adversarial training in which the training set is augmented with adversarially perturbed samples. This thesis presents an adversarial training approach called GADoT that leverages a Generative Adversarial Network (GAN) to generate adversarial samples for training. GADoT is validated in the scenario of an ADS detecting Distributed Denial of Service (DDoS) attacks, which have been witnessing an increase in volume and complexity. For a practical evaluation, the DDoS network traffic was perturbed to generate two datasets while fully preserving the semantics of the attack. The results show that adversaries can exploit their domain expertise to craft adversarial attacks without requiring knowledge of the underlying detection model. We then demonstrate that adversarial training using GADoT renders ML models more robust to adversarial perturbations. However, the evaluation of adversarial robustness is often susceptible to errors, leading to robustness overestimation. We investigate the problem of robustness overestimation in network ADSs and propose an adversarial attack called UPAS to evaluate the robustness of such ADSs. The UPAS attack perturbs the inter-arrival time between packets by injecting a random time delay before packets from the attacker. The attack is validated by perturbing malicious network traffic in a multi-attack dataset and used to evaluate the robustness of two robust ADSs, which are based on a denoising autoencoder and an adversarially trained ML model. The results demonstrate that the robustness of both ADSs is overestimated and that a standardised evaluation of robustness is needed.
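The inter-arrival-time perturbation attributed to UPAS can be illustrated roughly as follows; the uniform delay distribution, its range, and the flat timestamp/mask representation of a traffic trace are assumptions made for this sketch.

```python
import numpy as np

def perturb_inter_arrival(timestamps, is_attacker_pkt, max_delay=0.05, seed=0):
    """Add a random delay before each attacker packet; later packets inherit the
    accumulated delay, which changes the inter-arrival-time features seen by a
    flow-based detector. This is a simplified, single-trace view of the attack."""
    rng = np.random.default_rng(seed)
    ts = np.asarray(timestamps, dtype=float)
    mask = np.asarray(is_attacker_pkt, dtype=float)
    delays = rng.uniform(0.0, max_delay, size=len(ts)) * mask
    shifted = ts + np.cumsum(delays)                 # cumulative shift keeps packet ordering
    iat = np.diff(shifted, prepend=shifted[0])       # perturbed inter-arrival times
    return shifted, iat
```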
|
5 |
Robust Neural Receiver in Wireless Communication: Defense against Adversarial Attacks
Nicklasson Cedbro, Alice January 2023 (has links)
In the field of wireless communication systems, interest in machine learning has increased in recent years. Adversarial machine learning covers attack and defense methods targeting machine learning components. It is a topic that has been thoroughly studied in computer vision and natural language processing, but not to the same extent in wireless communication. In this thesis, a Fast Gradient Sign Method (FGSM) attack on a neural receiver is studied. Furthermore, the thesis investigates whether it is possible to make a neural receiver robust against these attacks. The study is carried out using the Python library Sionna, a library used for research on, for example, 5G, 6G and machine learning in wireless communication. The effect of an FGSM attack is evaluated and mitigated with different adversarial-training schemes. The training data of the models is either augmented with adversarial samples, or original samples are replaced with adversarial ones. Furthermore, the power distribution and range of the adversarial samples included in the training are varied. The thesis concludes that an FGSM attack decreases the performance of a neural receiver and needs less power than a barrage jamming attack to achieve the same performance loss. A neural receiver can be made more robust against an FGSM attack when the training data of the model is augmented with adversarial samples, the samples are concentrated on a specific attack power range, and the power of the adversarial samples is normally distributed. A neural receiver is also shown to be more robust against a barrage jamming attack than conventional methods without defenses.
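A minimal sketch of the FGSM perturbation and of adversarial training by data augmentation as studied in the thesis, assuming a generic differentiable receiver model and loss function; in the Sionna setup the inputs would be received-signal tensors, which are abstracted away here.

```python
import torch

def fgsm_perturb(model, x, y, loss_fn, eps):
    """One FGSM step: move the input in the direction of the sign of the loss
    gradient. `model` stands in for the neural receiver and `x` for the
    received samples; both are placeholders here."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def augmented_batch(model, x, y, loss_fn, eps):
    """Adversarial training by augmentation: train on clean and FGSM-perturbed
    samples together, as in the augmentation variant studied in the thesis."""
    x_adv = fgsm_perturb(model, x, y, loss_fn, eps)
    return torch.cat([x, x_adv]), torch.cat([y, y])
```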
|
6 |
Fear prediction for training robust RL agents
Gauthier, Charlie 03 1900 (has links)
By learning from experience, goal-conditioned reinforcement learning methods learn from their environments gradually and adaptively. Among other reasons, this makes them a promising direction for the generalist robots of the future. However, the safety of these goal-conditioned RL policies is still an active area of research. The majority of “Safe Reinforcement Learning” methods seek to enforce safety both during training and during deployment and/or evaluation. In this work, we propose a complementary strategy.
Because the majority of control algorithms for robots are developed, trained, and tested in simulation to avoid damaging the real hardware, we can afford to let the policy act in unsafe ways in the simulated environment. We show that by tasking the learning algorithm with unsafe goals during its training, we can produce populations of final policies that are safer at evaluation or deployment than when trained with state-of-the-art goal-selection methods. To do so, we introduce a new agent to the training of the policy that we call the “Director”. The Director’s role is to select goals that are hard enough to aid the policy’s training, without being too hard or too easy. To help the Director in its task, we train a neural network online to predict which goals are unsafe for the current policy. Armed with this “fear network” (named after the policy’s own fear of violating its safety conditions), the Director is able to select training goals such that the final trained policies are safer and more performant than policies trained on state-of-the-art goal-selection methods (or just as safe/performant). Additionally, the populations of policies trained by the Director show decreased variance in their behaviour, along with increased resistance to adversarial attacks on the goals issued to them.
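The Director's goal-selection rule can be sketched as follows, treating the fear network's predicted failure probability as a difficulty score; the intermediate difficulty band and the fallback rule are illustrative assumptions.

```python
import numpy as np

def select_goals(candidate_goals, fear_net, n_goals, low=0.2, high=0.8, rng=None):
    """Score each candidate goal with the fear network's predicted failure
    probability and sample goals whose difficulty falls in an intermediate band;
    if too few qualify, fall back to the goals closest to the band's centre."""
    rng = rng or np.random.default_rng()
    scores = np.asarray([float(fear_net(g)) for g in candidate_goals])  # P(failure) per goal
    in_band = np.flatnonzero((scores >= low) & (scores <= high))
    if len(in_band) < n_goals:
        order = np.argsort(np.abs(scores - (low + high) / 2.0))
        return [candidate_goals[i] for i in order[:n_goals]]
    chosen = rng.choice(in_band, size=n_goals, replace=False)
    return [candidate_goals[i] for i in chosen]
```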
|
7 |
TOWARDS SECURE AND ROBUST 3D PERCEPTION IN THE REAL WORLD: AN ADVERSARIAL APPROACH
Zhiyuan Cheng (19104104) 11 July 2024 (has links)
<p dir="ltr">The advent of advanced machine learning and computer vision techniques has led to the feasibility of 3D perception in the real world, which includes but not limited to tasks of monocular depth estimation (MDE), 3D object detection, semantic scene completion, optical flow estimation (OFE), etc. Due to the 3D nature of our physical world, these techniques have enabled various real-world applications like Autonomous Driving (AD), unmanned aerial vehicle (UAV), virtual/augmented reality (VR/AR) and video composition, revolutionizing the field of transportation and entertainment. However, it is well-documented that Deep Neural Network (DNN) models can be susceptible to adversarial attacks. These attacks, characterized by minimal perturbations, can precipitate substantial malfunctions. Considering that 3D perception techniques are crucial for security-sensitive applications, such as autonomous driving systems (ADS), in the real world, adversarial attacks on these systems represent significant threats. As a result, my goal of research is to build secure and robust real-world 3D perception systems. Through the examination of vulnerabilities in 3D perception techniques under such attacks, my dissertation aims to expose and mitigate these weaknesses. Specifically, I propose stealthy physical-world attacks against MDE, a fundamental component in ADS and AR/VR that facilitates the projection from 2D to 3D. I have advanced the stealth of the patch attack by minimizing the patch size and disguising the adversarial pattern, striking an optimal balance between stealth and efficacy. Moreover, I develop single-modal attacks against camera-LiDAR fusion models for 3D object detection, utilizing adversarial patches. This method underscores that mere fusion of sensors does not assure robustness against adversarial attacks. Additionally, I study black-box attacks against MDE and OFE models, which are more practical and impactful as no model details are required and the models can be compromised through only queries. In parallel, I devise a self-supervised adversarial training method to harden MDE models without the necessity of ground-truth depth labels. This enhanced model is capable of withstanding a range of adversarial attacks, including those in the physical world. Through these innovative designs for both attack and defense, this research contributes to the development of more secure and robust 3D perception systems, particularly in the context of the real world applications.</p>
|
8 |
Improving the Robustness of Deep Neural Networks against Adversarial Examples via Adversarial Training with Maximal Coding Rate Reduction
Chu, Hsiang-Yu January 2022
Deep learning is one of the hottest scientific topics at the moment. Deep convolutional networks can solve various complex tasks in the field of image processing. However, adversarial attacks have been shown to be able to fool deep learning models. An adversarial attack is accomplished by applying specially designed perturbations to the input image of a deep learning model. The noises are almost visually indistinguishable to human eyes, but can fool classifiers into making wrong predictions. In this thesis, adversarial attacks and methods to improve deep learning models' robustness against adversarial samples were studied. Five different adversarial attack algorithms were implemented. These attack algorithms included white-box attacks and black-box attacks, targeted attacks and non-targeted attacks, and image-specific attacks and universal attacks. The adversarial attacks generated adversarial examples that resulted in a significant drop in classification accuracy. Adversarial training is one commonly used strategy to improve the robustness of deep learning models against adversarial examples. It is shown that adversarial training can provide an additional regularization benefit beyond that provided by using dropout. Adversarial training is performed by incorporating adversarial examples into the training process. Traditionally, during this process, cross-entropy loss is used as the loss function. In order to improve the robustness of deep learning models against adversarial examples, in this thesis we propose two new methods of adversarial training by applying the principle of Maximal Coding Rate Reduction. The Maximal Coding Rate Reduction loss function maximizes the coding rate difference between the whole data set and the sum of each individual class. We evaluated the performance of different adversarial training methods by comparing clean accuracy, adversarial accuracy and local Lipschitzness. It was shown that adversarial training with the Maximal Coding Rate Reduction loss function yields a more robust network than the traditional adversarial training method.
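A minimal sketch of the Maximal Coding Rate Reduction objective as published in the literature, with batch features as rows of Z; how the thesis integrates it into the two proposed adversarial training methods is not reproduced here.

```python
import torch

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z Z^T), features as columns of a d x n matrix."""
    d, n = Z.shape
    I = torch.eye(d, device=Z.device, dtype=Z.dtype)
    return 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * Z @ Z.T)

def mcr2_loss(Z, labels, num_classes, eps=0.5):
    """Negative coding-rate reduction: minimizing this maximizes the rate of the
    whole feature set minus the summed rates of the per-class subsets.
    Z holds one feature vector per row (n x d)."""
    Zt = Z.T                                   # (d, n)
    n = Z.shape[0]
    expand = coding_rate(Zt, eps)
    compress = Z.new_zeros(())
    for c in range(num_classes):
        idx = (labels == c).nonzero(as_tuple=True)[0]
        if len(idx) == 0:
            continue
        compress = compress + (len(idx) / n) * coding_rate(Zt[:, idx], eps)
    return -(expand - compress)                # minimize => maximize the rate reduction
```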
|
9 |
Unsupervised Domain Adaptation for Regressive Annotation: Using Domain-Adversarial Training on Eye Image Data for Pupil Detection
Zetterström, Erik January 2023
Machine learning has seen rapid progress over the last couple of decades, with more and more powerful neural network models continuously being presented. These neural networks require large amounts of data to train them. Labelled data is especially in great demand, but due to the time-consuming and costly nature of data labelling, there is a scarcity of labelled data, whereas there is usually an abundance of unlabelled data. In some cases, data from a certain distribution, or domain, is labelled, whereas the data we actually want to optimise our model on is unlabelled and from another domain. This falls under the umbrella of domain adaptation, and the purpose of this thesis is to train a network using domain-adversarial training on eye image datasets consisting of a labelled source domain and an unlabelled target domain, with the goal of performing well on target data, i.e., overcoming the domain gap. This was done on two different datasets: a proprietary dataset from Tobii with real images and the public U2Eyes dataset with synthetic data. When comparing domain-adversarial training to a baseline model trained conventionally on source data and an oracle model trained conventionally on target data, the proposed DAT-ResNet model outperformed the baseline on both datasets. For the Tobii dataset, DAT-ResNet improved the Huber loss by 22.9% and the Intersection over Union (IoU) by 7.6%, and for the U2Eyes dataset, DAT-ResNet improved the Huber loss by 67.4% and the IoU by 37.6%. Furthermore, the IoU measures were extended to also include the portion of predicted ellipses with no intersection with the corresponding ground-truth ellipses, referred to as zero-IoUs. By this metric, the proposed model improves the percentage of zero-IoUs by 34.9% on the Tobii dataset and by 90.7% on the U2Eyes dataset.
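Domain-adversarial training of the kind described above is commonly implemented with a gradient-reversal layer. The sketch below assumes source and target features produced by a shared encoder (which receives the reversed gradient), a regression head for the pupil-ellipse parameters trained with a Huber-style loss, and a binary domain discriminator; all module interfaces are placeholders.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the backward pass,
    so the shared encoder is pushed toward domain-invariant features."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def dat_losses(feat_src, feat_tgt, y_src, regressor, domain_disc, lam=1.0):
    """Combined objective: regression on labelled source features plus a domain
    classification loss routed through the gradient-reversal layer on both domains."""
    reg_loss = nn.functional.smooth_l1_loss(regressor(feat_src), y_src)   # Huber-style loss
    feats = torch.cat([feat_src, feat_tgt])
    domains = torch.cat([torch.zeros(len(feat_src)), torch.ones(len(feat_tgt))])
    logits = domain_disc(GradReverse.apply(feats, lam)).squeeze(-1)
    dom_loss = nn.functional.binary_cross_entropy_with_logits(logits, domains)
    return reg_loss + dom_loss
```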
|
10 |
Advancing adversarial robustness with feature desensitization and synthesized data
Bayat, Reza 07 1900 (has links)
This thesis addresses the critical issue of adversarial vulnerability in deep learning models, which are susceptible to slight, human-imperceptible perturbations that can lead to incorrect predictions. Adversarial attacks pose significant threats to the deployment of these models in safety-critical systems. To mitigate these threats, adversarial training has emerged as a prominent approach, where models are trained on adversarial examples to enhance their robustness.
In Chapter 1, we provide a comprehensive background on adversarial vulnerability, detailing the creation of adversarial examples and their real-world implications. We illustrate how adversarial examples are crafted and present various scenarios demonstrating their potential catastrophic outcomes. Furthermore, we explore the challenges associated with adversarial training, focusing on issues like the lack of robustness against a broad range of attack strengths and a trade-off between robustness and generalization, which are the subjects of our study.
Chapter 2 introduces Adversarial Feature Desensitization (AFD), a novel method that leverages domain adaptation techniques to enhance adversarial robustness. AFD aims to learn features that are invariant to adversarial perturbations, thereby improving resilience across various attack types and strengths. This approach involves training a domain discriminator alongside the classifier to reduce the divergence between natural and adversarial data representations. By aligning the features from both domains, AFD ensures that the learned features are both predictive and robust, mitigating overfitting to specific attack patterns and promoting broader defensive capability.
Chapter 3 presents Adversarial Training with Synthesized Data, a method aimed at bridging the gap between robustness and generalization in neural networks. By leveraging synthesized data generated through advanced techniques, this chapter explores how incorporating such data can mitigate robust overfitting and enhance the overall performance of adversarially trained models. The findings indicate that while adversarial training traditionally faces a trade-off between robustness and generalization, the use of synthesized data helps maintain high accuracy on corrupted and out-of-distribution data without compromising robustness. This approach provides a promising pathway to develop neural networks that are both resilient to adversarial attacks and capable of generalizing well to a wide range of scenarios.
Chapter 4 concludes the thesis by summarizing the key findings and contributions of this research. Additionally, it outlines several avenues for future research to further enhance the security and reliability of deep learning models. Future research could explore the effect of synthesized data on a broader range of generalization tasks, develop alternative approaches to adversarial training that are less computationally expensive, and adapt new feedback-guided techniques for synthesizing data to enhance sample efficiency. By pursuing these directions, future research can build on the foundations laid by this thesis and continue to advance the field of adversarial robustness, ultimately leading to safer and more reliable machine learning systems.
Through these contributions, this thesis advances the understanding of adversarial robustness and proposes practical solutions to enhance the security and reliability of machine learning systems. By addressing the limitations of current adversarial training methods and introducing innovative approaches like AFD and the incorporation of synthesized data, this research paves the way for more robust and generalizable machine learning models capable of withstanding a diverse array of adversarial attacks.
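One way to realize the feature-alignment idea behind AFD is an alternating game between a discriminator and the encoder/classifier, sketched below; the loss weighting, the update schedule, and the module interfaces are assumptions rather than the published training recipe.

```python
import torch
import torch.nn as nn

def afd_step(encoder, classifier, disc, opt_model, opt_disc, x, x_adv, y, beta=1.0):
    """One training step in the spirit of AFD: the discriminator learns to tell
    natural from adversarial features, while the encoder and classifier are
    updated to classify adversarial inputs correctly and to make the two
    feature distributions indistinguishable."""
    bce = nn.functional.binary_cross_entropy_with_logits

    # 1) Update the discriminator on detached features.
    z_nat = encoder(x).detach()
    z_adv = encoder(x_adv).detach()
    d_loss = bce(disc(z_nat).squeeze(-1), torch.ones(len(x))) + \
             bce(disc(z_adv).squeeze(-1), torch.zeros(len(x)))
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()

    # 2) Update encoder + classifier: task loss plus a term that fools the discriminator.
    z_adv = encoder(x_adv)
    task = nn.functional.cross_entropy(classifier(z_adv), y)
    fool = bce(disc(z_adv).squeeze(-1), torch.ones(len(x)))  # adversarial features should look natural
    opt_model.zero_grad()
    (task + beta * fool).backward()
    opt_model.step()
    return d_loss.item(), task.item()
```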
|