1 |
On the Neural Representation for Adversarial Attack and DefenseQiuling Xu (17121274) 20 October 2023 (has links)
<p dir="ltr">Neural representations are high-dimensional embeddings generated during the feed-forward process of neural networks. These embeddings compress raw input information and extract abstract features beneficial for downstream tasks. However, effectively utilizing these representations poses challenges due to their inherent complexity. This complexity arises from the non-linear relationship between inputs and neural representations, as well as the diversity of the learning process.</p><p dir="ltr">In this thesis, we propose effective methods to utilize neural representations for adversarial attack and defense. Our approach generally involves decomposing complex neural representations into smaller, more analyzable parts. We also seek general patterns emerging during learning to better understand the semantic meaning associated with neural representations.</p><p dir="ltr">We demonstrate that formalizing neural representations can reveal models' weaknesses and aid in defending against poison attacks. Specifically, we define a new type of adversarial attack using neural style, a special component of neural representation. This new attack uncovers novel aspects of the models' vulnerabilities. </p><p dir="ltr">Furthermore, we develop an interpretation of neural representations by approximating their marginal distribution, treating intermediate neurons as feature indicators. By properly harnessing these rich feature indicators, we address scalability and imperceptibility issues related to pixel-wise bounds.</p><p dir="ltr">Finally, we discover that neural representations contain crucial information about how neural networks make decisions. Leveraging the general patterns in neural representations, we design algorithms to remove unwanted and harmful functionalities from neural networks, thereby mitigating poison attacks.</p>
|
2 |
Learning Unsupervised Depth Estimation, from Stereo to Monocular ImagesPilzer, Andrea 22 June 2020 (has links)
In order to interact with the real world, humans need to perform several tasks such as object detection, pose estimation, motion estimation and distance estimation. These tasks are all part of scene understanding and are fundamental tasks of computer vision. Depth estimation received unprecedented attention from the research community in recent years due to the growing interest in its practical applications (ie robotics, autonomous driving, etc.) and the performance improvements achieved with deep learning. In fact, the applications expanded from the more traditional tasks such as robotics to new fields such as autonomous driving, augmented reality devices and smartphones applications. This is due to several factors. First, with the increased availability of training data, bigger and bigger datasets were collected. Second, deep learning frameworks running on graphical cards exponentially increased the data processing capabilities allowing for higher precision deep convolutional networks, ConvNets, to be trained. Third, researchers applied unsupervised optimization objectives to ConvNets overcoming the hurdle of collecting expensive ground truth and fully exploiting the abundance of images available in datasets.
This thesis addresses several proposals and their benefits for unsupervised depth estimation, i.e., (i) learning from resynthesized data, (ii) adversarial learning, (iii) coupling generator and discriminator losses for collaborative training, and (iv) self-improvement ability of the learned model. For the first two points, we developed a binocular stereo unsupervised depth estimation model that uses reconstructed data as an additional self-constraint during training. In addition to that, adversarial learning improves the quality of the reconstructions, further increasing the performance of the model. The third point is inspired by scene understanding as a structured task. A generator and a discriminator joining their efforts in a structured way improve the quality of the estimations. Our intuition may sound counterintuitive when cast in the general framework of adversarial learning. However, in our experiments we demonstrate the effectiveness of the proposed approach. Finally, self-improvement is inspired by estimation refinement, a widespread practice in dense reconstruction tasks like depth estimation. We devise a monocular unsupervised depth estimation approach, which measures the reconstruction errors in an unsupervised way, to produce a refinement of the depth predictions. Furthermore, we apply knowledge distillation to improve the student ConvNet with the knowledge of the teacher ConvNet that has access to the errors.
|
3 |
Adversarial Learning based framework for Anomaly Detection in the context of Unmanned Aerial SystemsBhaskar, Sandhya 18 June 2020 (has links)
Anomaly detection aims to identify the data samples that do not conform to a known normal (regular) behavior. As the definition of an anomaly is often ambiguous, unsupervised and semi-supervised deep learning (DL) algorithms that primarily use unlabeled datasets to model normal (regular) behaviors, are popularly studied in this context. The unmanned aerial system (UAS) can use contextual anomaly detection algorithms to identify interesting objects of concern in applications like search and rescue, disaster management, public security etc. This thesis presents a novel multi-stage framework that supports detection of frames with unknown anomalies, localization of anomalies in the detected frames, and validation of detected frames for incremental semi-supervised learning, with the help of a human operator. The proposed architecture is tested on two new datasets collected for a UAV-based system. In order to detect and localize anomalies, it is important to both model the normal data distribution accurately as well as formulate powerful discriminant (anomaly scoring) techniques. We implement a generative adversarial network (GAN)-based anomaly detection architecture to study the effect of loss terms and regularization on the modeling of normal (regular) data and arrive at the most effective anomaly scoring method for the given application. Following this, we use incremental semi-supervised learning techniques that utilize a small set of labeled data (obtained through validation from a human operator), with large unlabeled datasets to improve the knowledge-base of the anomaly detection system. / Master of Science / Anomaly detection aims to identify the data samples that do not conform to a known normal (regular) behavior. As the definition of an anomaly is often ambiguous, most techniques use unlabeled datasets, to model normal (regular) behaviors. The availability of large unlabeled datasets combined with novel applications in various domains, has led to an increasing interest in the study of anomaly detection. In particular, the unmanned aerial system (UAS) can use contextual anomaly detection algorithms to identify interesting objects of concern in applications like search and rescue (SAR), disaster management, public security etc. This thesis presents a novel multi-stage framework that supports detection and localization of unknown anomalies, as well as the validation of detected anomalies, for incremental learning, with the help of a human operator. The proposed architecture is tested on two new datasets collected for a UAV-based system. In order to detect and localize anomalies, it is important to both model the normal data distribution accurately and formulate powerful discriminant (anomaly scoring) techniques. To this end, we study the state-of-the-art generative adversarial networks (GAN)-based anomaly detection algorithms for modeling of normal (regular) behavior and formulate effective anomaly detection scores. We also propose techniques to incrementally learn the new normal data as well as anomalies, using the validation provided by a human operator. This framework is introduced with the aim to support temporally critical applications that involve human search and rescue, particularly in disaster management.
|
4 |
ACADIA: Efficient and Robust Adversarial Attacks Against Deep Reinforcement LearningAli, Haider 05 January 2023 (has links)
Existing adversarial algorithms for Deep Reinforcement Learning (DRL) have largely focused on identifying an optimal time to attack a DRL agent. However, little work has been explored in injecting efficient adversarial perturbations in DRL environments. We propose a suite of novel DRL adversarial attacks, called ACADIA, representing AttaCks Against Deep reInforcement leArning. ACADIA provides a set of efficient and robust perturbation-based adversarial attacks to disturb the DRL agent's decision-making based on novel combinations of techniques utilizing momentum, ADAM optimizer (i.e., Root Mean Square Propagation or RMSProp), and initial randomization. These kinds of DRL attacks with novel integration of such techniques have not been studied in the existing Deep Neural Networks (DNNs) and DRL research. We consider two well-known DRL algorithms, Deep-Q Learning Network (DQN) and Proximal Policy Optimization (PPO), under Atari games and MuJoCo where both targeted and non-targeted attacks are considered with or without the state-of-the-art defenses in DRL (i.e., RADIAL and ATLA). Our results demonstrate that the proposed ACADIA outperforms existing gradient-based counterparts under a wide range of experimental settings. ACADIA is nine times faster than the state-of-the-art Carlini and Wagner (CW) method with better performance under defenses of DRL. / Master of Science / Artificial Intelligence (AI) techniques such as Deep Neural Networks (DNN) and Deep Reinforcement Learning (DRL) are prone to adversarial attacks. For example, a perturbed stop sign can force a self-driving car's AI algorithm to increase the speed rather than stop the vehicle. There has been little work developing attacks and defenses against DRL. In DRL, a DNN-based policy decides to take an action based on the observation of the environment and gets the reward in feedback for its improvements. We perturb that observation to attack the DRL agent. There are two main aspects to developing an attack on DRL. One aspect is to identify an optimal time to attack (when-to-attack?). The second aspect is to identify an efficient method to attack (how-to-attack?). To answer the second aspect, we propose a suite of novel DRL adversarial attacks, called ACADIA, representing AttaCks Against Deep reInforcement leArning. We consider two well-known DRL algorithms, Deep-Q Learning Network (DQN) and Proximal Policy Optimization (PPO), under DRL environments of Atari games and MuJoCo where both targeted and non-targeted attacks are considered with or without state-of-the-art defenses. Our results demonstrate that the proposed ACADIA outperforms state-of-the-art perturbation methods under a wide range of experimental settings. ACADIA is nine times faster than the state-of-the-art Carlini and Wagner (CW) method with better performance under the defenses of DRL.
|
5 |
Prediction games : machine learning in the presence of an adversaryBrückner, Michael January 2012 (has links)
In many applications one is faced with the problem of inferring some functional relation between input and output variables from given data. Consider, for instance, the task of email spam filtering where one seeks to find a model which automatically assigns new, previously unseen emails to class spam or non-spam. Building such a predictive model based on observed training inputs (e.g., emails) with corresponding outputs (e.g., spam labels) is a major goal of machine learning.
Many learning methods assume that these training data are governed by the same distribution as the test data which the predictive model will be exposed to at application time. That assumption is violated when the test data are generated in response to the presence of a predictive model. This becomes apparent, for instance, in the above example of email spam filtering. Here, email service providers employ spam filters and spam senders engineer campaign templates such as to achieve a high rate of successful deliveries despite any filters.
Most of the existing work casts such situations as learning robust models which are unsusceptible against small changes of the data generation process. The models are constructed under the worst-case assumption that these changes are performed such to produce the highest possible adverse effect on the performance of the predictive model. However, this approach is not capable to realistically model the true dependency between the model-building process and the process of generating future data. We therefore establish the concept of prediction games: We model the interaction between a learner, who builds the predictive model, and a data generator, who controls the process of data generation, as an one-shot game. The game-theoretic framework enables us to explicitly model the players' interests, their possible actions, their level of knowledge about each other, and the order at which they decide for an action.
We model the players' interests as minimizing their own cost function which both depend on both players' actions. The learner's action is to choose the model parameters and the data generator's action is to perturbate the training data which reflects the modification of the data generation process with respect to the past data.
We extensively study three instances of prediction games which differ regarding the order in which the players decide for their action. We first assume that both player choose their actions simultaneously, that is, without the knowledge of their opponent's decision. We identify conditions under which this Nash prediction game has a meaningful solution, that is, a unique Nash equilibrium, and derive algorithms that find the equilibrial prediction model. As a second case, we consider a data generator who is potentially fully informed about the move of the learner. This setting establishes a Stackelberg competition. We derive a relaxed optimization criterion to determine the solution of this game and show that this Stackelberg prediction game generalizes existing prediction models. Finally, we study the setting where the learner observes the data generator's action, that is, the (unlabeled) test data, before building the predictive model. As the test data and the training data may be governed by differing probability distributions, this scenario reduces to learning under covariate shift. We derive a new integrated as well as a two-stage method to account for this data set shift.
In case studies on email spam filtering we empirically explore properties of all derived models as well as several existing baseline methods. We show that spam filters resulting from the Nash prediction game as well as the Stackelberg prediction game in the majority of cases outperform other existing baseline methods. / Eine der Aufgabenstellungen des Maschinellen Lernens ist die Konstruktion von Vorhersagemodellen basierend auf gegebenen Trainingsdaten. Ein solches Modell beschreibt den Zusammenhang zwischen einem Eingabedatum, wie beispielsweise einer E-Mail, und einer Zielgröße; zum Beispiel, ob die E-Mail durch den Empfänger als erwünscht oder unerwünscht empfunden wird. Dabei ist entscheidend, dass ein gelerntes Vorhersagemodell auch die Zielgrößen zuvor unbeobachteter Testdaten korrekt vorhersagt.
Die Mehrzahl existierender Lernverfahren wurde unter der Annahme entwickelt, dass Trainings- und Testdaten derselben Wahrscheinlichkeitsverteilung unterliegen. Insbesondere in Fällen in welchen zukünftige Daten von der Wahl des Vorhersagemodells abhängen, ist diese Annahme jedoch verletzt. Ein Beispiel hierfür ist das automatische Filtern von Spam-E-Mails durch E-Mail-Anbieter. Diese konstruieren Spam-Filter basierend auf zuvor empfangenen E-Mails. Die Spam-Sender verändern daraufhin den Inhalt und die Gestaltung der zukünftigen Spam-E-Mails mit dem Ziel, dass diese durch die Filter möglichst nicht erkannt werden.
Bisherige Arbeiten zu diesem Thema beschränken sich auf das Lernen robuster Vorhersagemodelle welche unempfindlich gegenüber geringen Veränderungen des datengenerierenden Prozesses sind. Die Modelle werden dabei unter der Worst-Case-Annahme konstruiert, dass diese Veränderungen einen maximal negativen Effekt auf die Vorhersagequalität des Modells haben. Diese Modellierung beschreibt die tatsächliche Wechselwirkung zwischen der Modellbildung und der Generierung zukünftiger Daten nur ungenügend. Aus diesem Grund führen wir in dieser Arbeit das Konzept der Prädiktionsspiele ein. Die Modellbildung wird dabei als mathematisches Spiel zwischen einer lernenden und einer datengenerierenden Instanz beschrieben. Die spieltheoretische Modellierung ermöglicht es uns, die Interaktion der beiden Parteien exakt zu beschreiben. Dies umfasst die jeweils verfolgten Ziele, ihre Handlungsmöglichkeiten, ihr Wissen übereinander und die zeitliche Reihenfolge, in der sie agieren.
Insbesondere die Reihenfolge der Spielzüge hat einen entscheidenden Einfluss auf die spieltheoretisch optimale Lösung. Wir betrachten zunächst den Fall gleichzeitig agierender Spieler, in welchem sowohl der Lerner als auch der Datengenerierer keine Kenntnis über die Aktion des jeweils anderen Spielers haben. Wir leiten hinreichende Bedingungen her, unter welchen dieses Spiel eine Lösung in Form eines eindeutigen Nash-Gleichgewichts besitzt. Im Anschluss diskutieren wir zwei verschiedene Verfahren zur effizienten Berechnung dieses Gleichgewichts. Als zweites betrachten wir den Fall eines Stackelberg-Duopols. In diesem Prädiktionsspiel wählt der Lerner zunächst das Vorhersagemodell, woraufhin der Datengenerierer in voller Kenntnis des Modells reagiert. Wir leiten ein relaxiertes Optimierungsproblem zur Bestimmung des Stackelberg-Gleichgewichts her und stellen ein mögliches Lösungsverfahren vor. Darüber hinaus diskutieren wir, inwieweit das Stackelberg-Modell bestehende robuste Lernverfahren verallgemeinert. Abschließend untersuchen wir einen Lerner, der auf die Aktion des Datengenerierers, d.h. der Wahl der Testdaten, reagiert. In diesem Fall sind die Testdaten dem Lerner zum Zeitpunkt der Modellbildung bekannt und können in den Lernprozess einfließen. Allerdings unterliegen die Trainings- und Testdaten nicht notwendigerweise der gleichen Verteilung. Wir leiten daher ein neues integriertes sowie ein zweistufiges Lernverfahren her, welche diese Verteilungsverschiebung bei der Modellbildung berücksichtigen.
In mehreren Fallstudien zur Klassifikation von Spam-E-Mails untersuchen wir alle hergeleiteten, sowie existierende Verfahren empirisch. Wir zeigen, dass die hergeleiteten spieltheoretisch-motivierten Lernverfahren in Summe signifikant bessere Spam-Filter erzeugen als alle betrachteten Referenzverfahren.
|
6 |
ADVERSARIAL LEARNING ON ROBUSTNESS AND GENERATIVE MODELSQingyi Gao (11211114) 03 August 2021 (has links)
<div>In this dissertation, we study two important problems in the area of modern deep learning: adversarial robustness and adversarial generative model. In the first part, we study the generalization performance of deep neural networks (DNNs) in adversarial learning. Recent studies have shown that many machine learning models are vulnerable to adversarial attacks, but much remains unknown concerning its generalization error in this scenario. We focus on the $\ell_\infty$ adversarial attacks produced under the fast gradient sign method (FGSM). We establish a tight bound for the adversarial Rademacher complexity of DNNs based on both spectral norms and ranks of weight matrices. The spectral norm and rank constraints imply that this class of networks can be realized as a subset of the class of a shallow network composed with a low dimensional Lipschitz continuous function. This crucial observation leads to a bound that improves the dependence on the network width compared to previous works and achieves depth independence. We show that adversarial Rademacher complexity is always larger than its natural counterpart, but the effect of adversarial perturbations can be limited under our weight normalization framework. </div><div></div><div>In the second part, we study deep generative models that receive great success in many fields. It is well-known that the complex data usually does not populate its ambient Euclidean space but resides in a lower-dimensional manifold instead. Thus, misspecifying the latent dimension in generative models will result in a mismatch of latent representations and poor generative qualities. To address these problems, we propose a novel framework called Latent Wasserstein GAN (LWGAN) to fuse the auto-encoder and WGAN such that the intrinsic dimension of data manifold can be adaptively learned by an informative latent distribution. In particular, we show that there exist an encoder network and a generator network in such a way that the intrinsic dimension of the learned encodes distribution is equal to the dimension of the data manifold. Theoretically, we prove the consistency of the estimation for the intrinsic dimension of the data manifold and derive a generalization error bound for LWGAN. Comprehensive empirical experiments verify our framework and show that LWGAN is able to identify the correct intrinsic dimension under several scenarios, and simultaneously generate high-quality synthetic data by samples from the learned latent distribution. </div><div><br></div>
|
7 |
NON-INTRUSIVE WIRELESS SENSING WITH MACHINE LEARNINGYUCHENG XIE (16558152) 30 August 2023 (has links)
<p>This dissertation explores the world of non-intrusive wireless sensing for diet and fitness activity monitoring, in addition to assessing security risks in human activity recognition (HAR). It delves into the use of WiFi and millimeter wave (mmWave) signals for monitoring eating behaviors, discerning intricate eating activities, and observing fitness movements. The proposed systems harness variations in wireless signal propagation to record human behavior while providing exhaustive details on dietary and exercise habits. Significant contributions encompass unsupervised learning methodologies for detecting dietary and fitness activities, implementing soft-decision and deep neural networks for assorted activity recognition, constructing tiny motion mechanisms for subtle mouth muscle movement recovery, employing space-time-velocity features for multi-person tracking, as well as utilizing generative adversarial networks and domain adaptation structures to enable less cumbersome training efforts and cross-domain deployments. A series of comprehensive tests validate the efficacy and precision of the proposed non-intrusive wireless sensing systems. Additionally, the dissertation probes the security vulnerabilities in mmWave-based HAR systems and puts forth various sophisticated adversarial attacks - targeted, untargeted, universal, and black-box. It designs adversarial perturbations aiming to deceive the HAR models whilst striving to minimize detectability. The research offers powerful insights into issues and efficient solutions relative to non-intrusive sensing tasks and security challenges linked with wireless sensing technologies.</p>
|
8 |
Distribution-Based Adversarial Multiple-Instance LearningChen, Sherry 27 January 2023 (has links)
No description available.
|
9 |
Motor Imagery Signal Classification using Adversarial Learning - A Systematic Literature ReviewMahmudi, Osama, Mishra, Shubhra January 2023 (has links)
Context: Motor Imagery (MI) signal classification is a crucial task for developing Brain-Computer Interfaces (BCIs) that allow people to control devices using their thoughts. However, traditional machine learning approaches often suffer from limited performance due to inter-subject variability and limited data availability. In response, adversarial learning has emerged as a promising solution to enhance the resilience and accuracy of BCI systems. However, to the best of our knowledge, there has not been a review of the literature on adversarial learning specifically focusing on MI classification. Objective: The objective of this thesis is to perform a Systematic Literature Review (SLR) focusing on the latest techniques of adversarial learning used to classify motor imagery signals. It aims to analyze the publication trends of the reviewed studies, investigate their use-cases, and identify the challenges in the field. Additionally, this research recognizes the datasets used in previous studies and their associated use-cases. It also identifies the pre-processing and adversarial learning techniques, and compare their performance. Additionally, it could aid in evaluating the replicability of the studies included. The outcomes of this study will assist future researchers in selecting appropriate datasets, pre-processing, and adversarial learning techniques to advance their research objectives. The comparison of models will also provide practical insights, enabling researchers to make informed decisions when designing models for motor imagery classification. Furthermore, assessing reproducibility might help in validating the research outcomes and hence elevate the overall quality of future research. Method: A thorough and systematic search following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines is undertaken to gather primary research articles from several databases such as Scopus, Web of Science, IEEEXplore, PubMed, and ScienceDirect. Two independent reviewers evaluated the articles obtained based on predetermined eligibility criteria at the title-abstract level, and their agreement was measured using Cohen's Kappa. The articles that fulfill the criteria are then scrutinized at the full-text level by the same reviewers. Any discrepancies are resolved by the judge – played by the supervisor. Critical appraisal was employed to choose appropriate studies for data extraction, which was subsequently examined using bibliometric and descriptive analyses to answer the research questions. Result: The study's findings indicate substantial growth within the domain over the past six years, notably propelled by contributions from the Asian region. However, the need for augmented collaboration becomes evident as evidenced by the prevalence of insular co-author networks. Four principal use-cases for adversarial learning are identified, spanning data augmentation, domain adaptation, feature extraction, and artifact removal. The favored datasets are BCI Competition IV's 2a and 2b, often accompanied by band-pass filtering and exponential moving standardization preprocessing. This study identifies two primary adversarial learning techniques: GAN and Adversarial Training. GAN is mainly used for data augmentation and artifact removal, while adversarial training is employed for domain adaptation and feature extraction. Based on the results reported in the chosen papers, the accuracy achieved for data augmentation and domain adaptation use cases is nearly identical at 95.3%, while the highest accuracy for the feature extraction use case is 86.91%. However, for artifact removal, both correlation and root mean square methods have been referenced. Furthermore, a reproducibility table has been established which may help in evaluating the replicability of the selected studies . Conclusion: The outcomes provide researchers with valuable perspectives on less-explored areas that hold room for additional enhancement. Ultimately, these perspectives hold the promise of improving the practical applications intended to support individuals dealing with motor impairments.
|
10 |
A Deep Learning Approach to Predict Full-Field Stress Distribution in Composite MaterialsSepasdar, Reza 17 May 2021 (has links)
This thesis proposes a deep learning approach to predict stress at various stages of mechanical loading in 2-D representations of fiber-reinforced composites. More specifically, the full-field stress distribution at elastic and at an early stage of damage initiation is predicted based on the microstructural geometry. The required data set for the purposes of training and validation are generated via high-fidelity simulations of several randomly generated microstructural representations with complex geometries. Two deep learning approaches are employed and their performances are compared: fully convolutional generator and Pix2Pix translation. It is shown that both the utilized approaches can well predict the stress distributions at the designated loading stages with high accuracy. / M.S. / Fiber-reinforced composites are material types with excellent mechanical performance. They form the major material in the construction of space shuttles, aircraft, fancy cars, etc., the structures that are designed to be lightweight and at the same time extremely stiff and strong. Due to the broad application, especially in the sensitives industries, fiber-reinforced composites have always been a subject of meticulous research studies. The research studies to better understand the mechanical behavior of these composites has to be conducted on the micro-scale. Since the experimental studies on micro-scale are expensive and extremely limited, numerical simulations are normally adopted. Numerical simulations, however, are complex, time-consuming, and highly computationally expensive even when run on powerful supercomputers. Hence, this research aims to leverage artificial intelligence to reduce the complexity and computational cost associated with the existing high-fidelity simulation techniques. We propose a robust deep learning framework that can be used as a replacement for the conventional numerical simulations to predict important mechanical attributes of the fiber-reinforced composite materials on the micro-scale. The proposed framework is shown to have high accuracy in predicting complex phenomena including stress distributions at various stages of mechanical loading.
|
Page generated in 0.1202 seconds