Global ETD Search

51	Improving Image Realism by Traversing the GAN Latent Space Wen, Jeffrey 25 July 2022 No description available. Electrical Engineering Generative Adversarial Networks GAN Latent Space Image Generation
52	On Transferability of Adversarial Examples on Machine-Learning-Based Malware Classifiers Hu, Yang 12 May 2022 The use of machine learning for malware detection is essential, compared with traditional signature-based detection, to counter the massive growth in malware types. However, machine learning models can also be extremely vulnerable and sensitive to transferable adversarial example (AE) attacks. A transfer AE attack does not require extra information from the victim model, such as gradient information. Researchers mainly explore two lines of transfer-based adversarial example attacks: ensemble models and ensemble samples. Although comprehensive innovations and progress have been achieved in transfer AE attacks, few works have investigated how these techniques perform on malware data. Moreover, generating adversarial examples for an Android APK file is not as easy and convenient as it is for image data, since the generated malware AE must retain its functionality and executability after perturbation. Therefore, it is urgent to validate whether previous methodologies remain effective on malware, given its differences from image data. In this thesis, we first conduct a thorough literature review of AE attacks on malware data and of general transfer AE attacks. Then we design our algorithm for the transfer AE attack. We formulate the optimization problem based on the intuition that the evenness of feature contributions to the final prediction is highly correlated with AE transferability. We then solve the optimization problem by gradient descent and evaluate it through extensive experiments. By implementing and experimenting with state-of-the-art AE algorithms and transferability enhancement techniques, we analyze and summarize the weaknesses and strengths of each method.
/ Master of Science / Machine learning models have been widely applied to malware detection systems in recent years due to the massive growth in malware types. However, these models are vulnerable to adversarial attacks. Malicious attackers can add small, imperceptible perturbations to the original test samples and mislead the classification results at very low cost. Research on adversarial attacks helps us better understand the attacker's side and inspires defenses against them. Among all adversarial attacks, the transfer-based adversarial example attack is one of the most devastating, since it does not require extra information from the targeted victim model, such as gradient information or queries to the model. Although many researchers have explored the transfer AE attack lately, few works focus on malware (e.g., Android) data. Compared with image data, perturbing malware is more complicated and challenging, since the generated adversarial examples must retain functionality and executability. To validate how transfer AE attack methods perform on malware, we implement the state-of-the-art (SOTA) works in this thesis and experiment with them on real Android data. In addition, we develop a new transfer-based AE attack method that generates AEs based on the contribution of each feature. We then conduct comprehensive evaluations and compare SOTA works with our proposed method. malware detection adversarial example attack transferability machine learning
53	Accelerating Conceptual Design Analysis of Marine Vehicles through Deep Learning Jones, Matthew Cecil 02 May 2019 Evaluation of the flow field imparted by a marine vehicle reveals the underlying efficiency and performance. However, the relationship between precise design features and their impact on the flow field is not well characterized. The goal of this work is, first, to investigate the thermally stratified near field of a self-propelled marine vehicle to identify the significance of propulsion and hull-form design decisions and, second, to develop a functional mapping between an arbitrary vehicle design and its associated flow field to accelerate the design analysis process. The unsteady Reynolds-Averaged Navier-Stokes equations are solved to compute near-field wake profiles, showing good agreement with experimental data and providing a balance between simulation fidelity and numerical cost, given the database of cases considered. Machine learning through convolutional networks is employed to discover the relationship between vehicle geometries and their associated flow fields with two distinct deep-learning networks. The first network directly maps explicitly specified geometric design parameters to their corresponding flow fields. The second network considers the vehicle geometries themselves as tensors of geometric volume fractions to implicitly learn the underlying parameter space. Once trained, both networks effectively generate realistic flow fields, accelerating the design analysis from a process that takes days to one that takes a fraction of a second. The implicit-parameter network successfully learns the underlying parameter space for geometries within the scope of the training data, showing comparable performance to the explicit-parameter network.
With additions to the size and variability of the training database, this network has the potential to abstractly generalize the design space for arbitrary geometric inputs, even those beyond the scope of the training data. / Doctor of Philosophy / Evaluation of the flow field of a marine vehicle reveals the underlying performance; however, the exact relationship between design features and their impact on the flow field is not well established. The goal of this work is, first, to investigate the flow surrounding a self-propelled marine vehicle to identify the significance of various design decisions and, second, to develop a functional relationship between an arbitrary vehicle design and its flow field, thereby accelerating the design analysis process. Near-field wake profiles are computed through simulation, showing good agreement with experimental data. Machine learning is employed to discover the relationship between vehicle geometries and their associated flow fields with two distinct approaches. The first approach directly maps explicitly specified geometric design parameters to their corresponding flow fields. The second approach considers the vehicle geometries themselves to implicitly learn the underlying relationships. Once trained, both approaches generate a realistic flow field corresponding to a user-provided vehicle geometry, accelerating the design analysis from a multi-day process to one that takes a fraction of a second. The implicit-parameter approach successfully learns from the underlying geometric features, showing comparable performance to the explicit-parameter approach. With a larger and more diverse training database, this network has the potential to abstractly learn the design space relationships for arbitrary marine vehicle geometries, even those beyond the scope of the training database. near wake Machine learning deep learning adversarial network OpenFOAM
54	Synthetic Electronic Medical Record Generation using Generative Adversarial Networks Beyki, Mohammad Reza 13 August 2021 Computers replaced our record books some time ago, and medical records are no exception. Electronic Health Records (EHRs) are a digital version of a patient's medical records. EHRs are available to authorized users, and they contain the medical records of the patient, which should help doctors understand a patient's condition quickly. In recent years, Deep Learning models have proved their value and have become state-of-the-art in computer vision, natural language processing, speech, and other areas. The private nature of EHR data has prevented public access to EHR datasets. There are many obstacles to creating a deep learning model with EHR data. Because EHR data consist primarily of huge sparse matrices, these challenges are mostly unique to this field. As a result, research in this area is limited, and existing research can be improved substantially. In this study, we focus on high-performance synthetic data generation for EHR datasets. Artificial data generation can help reduce privacy leakage for dataset owners, as de-identification methods have been shown to be prone to re-identification attacks. We propose a novel approach, which we call Improved Correlation Capturing Wasserstein Generative Adversarial Network (SCorGAN), to create EHR data. This work leverages Deep Convolutional Neural Networks to extract and understand spatial dependencies in EHR data. To improve our model's performance, we focus on our Deep Convolutional AutoEncoder to better map our real EHR data to the latent space where we train the Generator. To assess our model's performance, we demonstrate that our generative model can create data that are statistically close to the input dataset. Additionally, we evaluate our synthetic dataset against the original data using our previous work that focused on GAN Performance Evaluation.
This work is publicly available at https://github.com/mohibeyki/SCorGAN / Master of Science / Artificial Intelligence (AI) systems have improved greatly in recent years. They are being used to understand all kinds of data. A practical use case for AI systems is to leverage their power to identify illnesses and find correlations between different conditions. To train AI and Machine Learning systems, we need to feed them huge datasets, and in the training process, we need to guide them so that they learn different features in our data. The more data an intelligent system has seen, the better it performs. However, health records are private, and we cannot share real people's health records with the public, researchers included. This study provides a novel approach to synthetic data generation that others can use with intelligent systems. These systems can then work with actual health records and give us accurate feedback on people's health conditions. We then show that our synthetic dataset is a good substitute for real datasets to train intelligent systems. Lastly, we present an intelligent system that we have trained using synthetic datasets to identify illnesses in a real dataset with high accuracy and precision. Deep learning (Machine learning) Healthcare Generative Adversarial Networks
55	Privacy-Preserving Synthetic Medical Data Generation with Deep Learning Torfi, Amirsina 26 August 2020 Deep learning models have demonstrated good performance in various domains such as Computer Vision and Natural Language Processing. However, the utilization of data-driven methods in healthcare raises privacy concerns, which creates limitations for collaborative research. A remedy to this problem is to generate and employ synthetic data to address privacy concerns. Existing methods for artificial data generation suffer from different limitations, such as being bound to particular use cases. Furthermore, their generalizability to real-world problems is controversial regarding the uncertainties in defining and measuring key realistic characteristics. Hence, there is a need to establish insightful metrics (and to measure the validity of synthetic data), as well as quantitative criteria regarding privacy restrictions. We propose the use of Generative Adversarial Networks to help satisfy requirements for realistic characteristics and acceptable values of privacy metrics simultaneously. The present study makes several unique contributions to synthetic data generation in the healthcare domain. First, we propose a novel domain-agnostic metric to evaluate the quality of synthetic data. Second, by utilizing 1-D Convolutional Neural Networks, we devise a new approach to capturing the correlation between adjacent diagnosis records. Third, we employ Convolutional Autoencoders for creating a robust and compact feature space to handle the mixture of discrete and continuous data. Finally, we devise a privacy-preserving framework that enforces Rényi differential privacy as a new notion of differential privacy. / Doctor of Philosophy / Computer programs have been widely used for clinical diagnosis but are often designed with assumptions limiting their scalability and interoperability.
The recent proliferation of abundant health data, significant increases in computer processing power, and superior performance of data-driven methods enable a trending paradigm shift in healthcare technology. This involves the adoption of artificial intelligence methods, such as deep learning, to improve healthcare knowledge and practice. Despite the success in using deep learning in many different domains, in the healthcare field, privacy challenges make collaborative research difficult, as working with data-driven methods may jeopardize patients' privacy. To overcome these challenges, researchers propose to generate and utilize realistic synthetic data that can be used instead of real private data. Existing methods for artificial data generation are limited by being bound to special use cases. Furthermore, their generalizability to real-world problems is questionable. There is a need to establish valid synthetic data that overcomes privacy restrictions and functions as a real-world analog for healthcare deep learning data training. We propose the use of Generative Adversarial Networks to simultaneously overcome the realism and privacy challenges associated with healthcare data. Deep learning healthcare synthetic data generation generative adversarial networks privacy.
56	Latent Walking Techniques for Conditioning GAN-Generated Music Eisenbeiser, Logan Ryan 21 September 2020 Artificial music generation is a rapidly developing field focused on the complex task of creating neural networks that can produce realistic-sounding music. Generating music is very difficult; components like long- and short-term structure present time complexity, which can be difficult for neural networks to capture. Additionally, the acoustics of musical features like harmonies and chords, as well as timbre and instrumentation, require complex representations for a network to accurately generate them. Various techniques for both music representation and network architecture have been used in the past decade to address these challenges in music generation. The focus of this thesis extends beyond generating music to the challenge of controlling and/or conditioning that generation. Conditional generation involves an additional piece or pieces of information which are input to the generator and constrain aspects of the results. Conditioning can be used to specify a tempo for the generated song, increase the density of notes, or even change the genre. Latent walking is one of the most popular techniques in conditional image generation, but its effectiveness on music-domain generation is largely unexplored. This paper focuses on latent walking techniques for conditioning the music generation network MuseGAN and examines the impact of this conditioning on the generated music. / Master of Science / Artificial music generation is a rapidly developing field focused on the complex task of creating neural networks that can produce realistic-sounding music. Beyond simply generating music lies the challenge of controlling or conditioning that generation. Conditional generation can be used to specify a tempo for the generated song, increase the density of notes, or even change the genre.
Latent walking is one of the most popular techniques in conditional image generation, but its effectiveness on music-domain generation is largely unexplored, especially for generative adversarial networks (GANs). This paper focuses on latent walking techniques for conditioning the music generation network MuseGAN and examines the impact and effectiveness of this conditioning on the generated music. Music Generation Latent Walking Conditional Generation Generative Adversarial Network
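As a rough illustration of what "latent walking" means in the entry above, the sketch below moves through a latent space in two common ways: interpolating between two latent vectors, and stepping along a fixed direction. It is not taken from the thesis. The 128-dimensional latent size, the function names, and the random vectors are all assumptions made for illustration, and no generator such as MuseGAN is called.

```python
import numpy as np


def latent_walk(z_start, z_end, steps):
    """Return `steps` latent vectors linearly interpolated from z_start to z_end."""
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - a) * z_start + a * z_end for a in alphas])


def walk_along_direction(z, direction, magnitudes):
    """Shift z along a unit-normalized direction by each magnitude in turn."""
    unit = direction / np.linalg.norm(direction)
    return np.stack([z + m * unit for m in magnitudes])


# Hypothetical 128-dimensional latent vectors (the real latent size is an assumption).
rng = np.random.default_rng(0)
z_a = rng.standard_normal(128)
z_b = rng.standard_normal(128)

# Five points on the straight line between z_a and z_b.
path = latent_walk(z_a, z_b, steps=5)

# Three points moving away from z_a along the direction (z_b - z_a).
shifted = walk_along_direction(z_a, z_b - z_a, magnitudes=[0.0, 0.5, 1.0])
```

Each row of `path` or `shifted` would be fed to a trained generator, and the outputs compared to see which musical attribute changes along the walk.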
57	Image-based Process Monitoring via Generative Adversarial Autoencoder with Applications to Rolling Defect Detection January 2019 Image-based process monitoring has recently attracted increasing attention due to the advancement of sensing technologies. However, existing process monitoring methods fail to fully utilize the spatial information of images due to their complex characteristics, including high dimensionality and complex spatial structures. Recent advances in unsupervised deep models such as the generative adversarial network (GAN) and the generative adversarial autoencoder (AAE) have made it possible to learn these complex spatial structures automatically. Inspired by these advances, we propose an AAE-based framework for unsupervised anomaly detection in images. AAE combines the power of GAN with the variational autoencoder, which serves as a nonlinear dimension reduction technique with regularization from the discriminator. Based on this, we propose a monitoring statistic that efficiently captures changes in the image data. The performance of the proposed AAE-based anomaly detection algorithm is validated through a simulation study and a real case study for rolling defect detection. / Dissertation/Thesis / Masters Thesis Industrial Engineering 2019 Industrial engineering Information technology Computer science adversarial autoencoder anomaly detection generative adversarial networks machine learning statistic unsupervised learning
58	Leveraging Synthetic Images with Domain-Adversarial Neural Networks for Fine-Grained Car Model Classification Smith, Dayyan January 2021 Supervised learning methods require vast amounts of annotated images to successfully train an image classifier. Acquiring the necessary annotated images is costly. The increased availability of photorealistic computer-generated images that are annotated automatically raises the question of under which conditions it is possible to leverage this synthetic data during training. We investigate the conditions that make it possible to leverage computer-generated renders of car models for fine-grained car model classification. / Supervised learning methods require large amounts of annotated images to successfully train an image classifier. Obtaining the necessary annotated images is costly. The increased availability of photorealistic computer-generated images that are annotated automatically raises the question of under which conditions it is possible to exploit this synthetic data during training. We investigate which conditions make it possible to exploit computer-generated renderings of car models for fine-grained classification of car models. domain generalization synthetic images domain adversarial neural networks Computer and Information Sciences
59	Scenario Generation for Stress Testing Using Generative Adversarial Networks : Deep Learning Approach to Generate Extreme but Plausible Scenarios Gustafsson, Jonas, Jonsson, Conrad January 2023 Central Clearing Counterparties play a crucial role in financial markets, requiring robust risk management practices to ensure operational stability. A growing emphasis on risk analysis and stress testing from regulators has led to the need for sophisticated tools that can model extreme but plausible market scenarios. This thesis presents a method leveraging Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) to construct an independent scenario generator capable of modeling and generating return distributions for financial markets. The developed method utilizes two primary components: the WGAN-GP model and a novel scenario selection strategy. The WGAN-GP model approximates the multivariate return distribution of stocks, generating plausible return scenarios. The scenario selection strategy employs lower and upper bounds on Euclidean distance calculated from the return vector to identify, and select, extreme scenarios suitable for stress testing clearing members' portfolios. This approach enables the extraction of extreme yet plausible returns. This method was evaluated using 25 years of historical stock return data from the S&P 500. Results demonstrate that the WGAN-GP model effectively approximates the multivariate return distribution of several stocks, facilitating the generation of new plausible returns. However, the model requires extensive training to fully capture the tails of the distribution. The Euclidean distance-based scenario selection strategy shows promise in identifying extreme scenarios, with the generated scenarios demonstrating comparable portfolio impact to historical scenarios. These results suggest that the proposed method offers valuable tools for Central Clearing Counterparties to enhance their risk management.
/ Central counterparties play a crucial role in today's financial market, which means that robust risk management routines are necessary to ensure operational stability. Increasing regulatory pressure for risk analysis and stress testing from supervisory authorities has led to the need for advanced tools that can model extreme but plausible market scenarios. This thesis presents a method that uses Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) to create an independent scenario generator that can model and generate return distributions for financial markets. The developed method consists of two main components: the WGAN-GP model and a scenario selection strategy. The WGAN-GP model approximates the multivariate return distribution of stocks and generates possible return scenarios. The scenario selection strategy uses lower and upper bounds on Euclidean distance, calculated from the return vector, to identify and select extreme scenarios that can be used to stress test clearing members' portfolios. This strategy makes it possible to obtain new extreme but plausible returns. The method is evaluated using 25 years of historical stock return data from the S&P 500. The results show that the WGAN-GP model can effectively approximate the multivariate return distribution of several stocks and thereby generate new possible returns. However, the model may require a large number of training epochs to fully capture the tails of the distribution. Scenario selection based on Euclidean distance showed promising results as a selection criterion for extreme scenarios. The generated scenarios show an impact on portfolios comparable to that of the historical scenarios. These results suggest that the proposed method can offer valuable tools for central counterparties to improve their risk management.
Machine Learning Generative Adversarial Network (GAN) Scenario Generation Stress Testing Central Counterparty Clearing Mathematics
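The scenario selection step described in the entry above (keeping generated return vectors whose Euclidean distance falls between a lower and an upper bound) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the distance is measured from the origin, that is, the norm of the return vector. The function name, the random stand-in for WGAN-GP output, and the quantile-based bounds are all hypothetical.

```python
import numpy as np


def select_extreme_scenarios(returns, lower, upper):
    """Keep scenarios (rows) whose Euclidean norm lies in [lower, upper]."""
    norms = np.linalg.norm(returns, axis=1)
    mask = (norms >= lower) & (norms <= upper)
    return returns[mask], norms[mask]


# Stand-in for generator output: 10,000 scenarios of daily returns for 5 stocks.
# Random Gaussian noise is used here only so the sketch runs on its own.
rng = np.random.default_rng(42)
generated = rng.standard_normal((10_000, 5)) * 0.02

# Hypothetical choice of bounds: between the 99th and 99.9th percentile of norms,
# so that selected scenarios are extreme but not the most implausible outliers.
all_norms = np.linalg.norm(generated, axis=1)
lower, upper = np.quantile(all_norms, [0.99, 0.999])

selected, selected_norms = select_extreme_scenarios(generated, lower, upper)
```

The lower bound discards ordinary market days, while the upper bound is one way to express the "plausible" half of "extreme but plausible". How the thesis actually sets the two bounds is not stated in the abstract.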
60	A Graybox Defense Through Bootstrapping Deep Neural Network Kirsen L Sullivan (14105763) 11 November 2022 Building a robust deep neural network (DNN) framework turns out to be a very difficult task, as adaptive attacks are developed that break a robust DNN strategy. In this work we first study the bootstrap distribution of DNN weights and biases. We bootstrap three DNN models: a simple three-layer convolutional neural network (CNN), VGG16 with 13 convolutional layers and 3 fully connected layers, and Inception v3 with 42 layers. Both VGG16 and Inception v3 are trained on CIFAR10 in order for the bootstrapped networks to converge. We then compare the bootstrap DNN parameter distributions with those obtained from training DNNs with different random initial seeds. We discover that the bootstrap DNN parameter distributions change as the DNN model size increases, and that they are very close to those obtained from training with different random initial seeds. The bootstrap DNN parameter distributions are used to create a graybox defense strategy. We randomize a certain percentage of the weights of the first convolutional layers of a DNN model and create a random ensemble of DNNs. Based on one trained DNN, we have infinitely many random DNN ensembles. The adaptive attacks lose their target. A random DNN ensemble is resilient to adversarial attacks and maintains performance on clean data. Proteomics and metabolomics Adversarial machine learning Computational statistics Bootstrapping neural networks convolutional neural networks adversarial machine learning CNN metabolomics
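The defense in the entry above randomizes a percentage of first-layer convolutional weights to build a random ensemble from one trained network. A minimal NumPy sketch of that idea follows. It is an assumption-laden illustration, not the thesis code: the perturbation is drawn from a Gaussian centered on each trained weight, with a fixed standard deviation standing in for the spread of the bootstrap distribution. The kernel shape, the 10% fraction, and the function name are hypothetical.

```python
import numpy as np


def randomize_weights(weights, std, fraction, rng):
    """Return a copy of `weights` with a random `fraction` of entries resampled.

    Each selected entry is redrawn from a normal distribution centered on its
    trained value; `std` is an assumed stand-in for the bootstrap spread.
    """
    flat = weights.ravel().copy()  # copy so the trained weights stay untouched
    n_random = int(round(fraction * flat.size))
    idx = rng.choice(flat.size, size=n_random, replace=False)
    flat[idx] = rng.normal(loc=flat[idx], scale=std)
    return flat.reshape(weights.shape)


rng = np.random.default_rng(0)

# Hypothetical trained first-layer kernel: 16 filters, 3 input channels, 3x3.
first_conv = rng.standard_normal((16, 3, 3, 3))

# Four ensemble members, each with a different random 10% of weights perturbed.
ensemble = [
    randomize_weights(first_conv, std=0.05, fraction=0.1, rng=rng)
    for _ in range(4)
]
```

Because every call draws a fresh subset and fresh values, an attacker who adapts to one member has no guarantee of facing the same weights at inference time, which is the intuition behind the abstract's claim that adaptive attacks lose their target.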
