About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Sample efficient reinforcement learning for biological sequence design

Nouri, Padideh 08 1900 (has links)
L’apprentissage par renforcement profond a mené à de nombreux résultats prometteurs dans l’apprentissage des jeux vidéo à partir de pixels, dans la robotique pour l’apprentissage de compétences généralisables et dans les soins de santé pour l’apprentissage de traitements dynamiques. Un obstacle demeure toutefois : celui du manque d’efficacité dans le nombre d’échantillons nécessaires pour obtenir de bons résultats. Pour résoudre ce problème, notre objectif est d’améliorer l’efficacité de l’apprentissage en améliorant les capacités d’acquisition de nouvelles données, un problème d’exploration. L’approche proposée consiste à : (1) apprendre un ensemble diversifié d’environnements (donnant lieu à un changement de dynamique) ; (2) apprendre une politique capable de mieux s’adapter aux changements dans l’environnement, à l’aide du méta-apprentissage. Cette méthode peut avoir des impacts bénéfiques dans de nombreux problèmes du monde réel tels que la découverte de médicaments, dans laquelle nous sommes confrontés à un espace d’actions très grand. De plus, la conception de nouvelles substances thérapeutiques qui sont fonctionnellement intéressantes nécessite une exploration efficace du paysage de la recherche. / Deep reinforcement learning has led to promising results in learning video games from pixels, in robotics for learning generalizable skills, and in healthcare for learning dynamic treatments. However, one obstacle remains: the lack of efficiency in the number of samples required to achieve good results. To address this problem, our goal is to improve sample efficiency by improving the ability to acquire new data, an exploration problem. The proposed approach is to: (1) learn a diverse set of environments (resulting in a change of dynamics); (2) learn a policy that can better adapt to changes in the environment using meta-learning. This method can benefit many real-world problems, such as drug discovery, where we face a very large action space. Furthermore, designing new therapeutic substances that are functionally interesting requires efficient exploration of the research landscape.
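For a rough sense of the adapt-to-changing-dynamics idea described above, the following is a minimal first-order meta-learning (Reptile-style) sketch on a toy regression task standing in for an environment whose dynamics change between tasks. It is illustrative only, not the thesis' reinforcement-learning setup, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Each task is y = a*sin(x + b); (a, b) play the role of changed dynamics."""
    return rng.uniform(0.5, 2.0), rng.uniform(0.0, np.pi)

def loss_grad(w, x, y):
    """Gradient of squared error for a linear model on fixed features."""
    feats = np.stack([np.sin(x), np.cos(x), x, np.ones_like(x)], axis=1)
    err = feats @ w - y
    return feats.T @ err / len(x)

w_meta = np.zeros(4)                    # meta-parameters shared across environments
inner_lr, outer_lr, inner_steps = 0.1, 0.05, 10

for it in range(2000):
    a, b = sample_task()                # (1) draw one environment from the diverse set
    x = rng.uniform(-3, 3, size=32)
    y = a * np.sin(x + b)
    w = w_meta.copy()
    for _ in range(inner_steps):        # (2) fast adaptation inside that environment
        w -= inner_lr * loss_grad(w, x, y)
    w_meta += outer_lr * (w - w_meta)   # Reptile outer update: move toward adapted weights
```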
32

Application of Saliency Maps for Optimizing Camera Positioning in Deep Learning Applications

Wecke, Leonard-Riccardo Hans 05 January 2024 (has links)
In the fields of process control engineering and robotics, especially in automatic control, optimization challenges frequently manifest as complex problems with expensive evaluations. This thesis zeroes in on one such problem: the optimization of camera positions for Convolutional Neural Networks (CNNs). CNNs have specific attention points in images that are often not intuitive to human perception, making camera placement critical for performance. The research is guided by two primary questions. The first investigates the role of Explainable Artificial Intelligence (XAI), specifically GradCAM++ visual explanations, in Computer Vision for aiding in the evaluation of different camera positions. Building on this, the second question assesses a novel algorithm that leverages these XAI features against traditional black-box optimization methods. To answer these questions, the study employs a robotic auto-positioning system for data collection, CNN model training, and performance evaluation. A case study focused on classifying flow regimes in industrial-grade bioreactors validates the method. The proposed approach shows improvements over established techniques like Grid Search, Random Search, Bayesian optimization, and Simulated Annealing. Future work will focus on gathering more data and including noise for generalized conclusions.

Contents: 1 Introduction 1.1 Motivation 1.2 Problem Analysis 1.3 Research Question 1.4 Structure of the Thesis 2 State of the Art 2.1 Literature Research Methodology 2.1.1 Search Strategy 2.1.2 Inclusion and Exclusion Criteria 2.2 Blackbox Optimization 2.3 Mathematical Notation 2.4 Bayesian Optimization 2.5 Simulated Annealing 2.6 Random Search 2.7 Gridsearch 2.8 Explainable A.I. and Saliency Maps 2.9 Flowregime Classification in Stirred Vessels 2.10 Performance Metrics 2.10.1 R2 Score and Polynomial Regression for Experiment Data Analysis 2.10.2 Blackbox Optimization Performance Metrics 2.10.3 CNN Performance Metrics 3 Methodology 3.1 Requirement Analysis and Research Hypothesis 3.2 Research Approach: Case Study 3.3 Data Collection 3.4 Evaluation and Justification 4 Concept 4.1 System Overview 4.2 Data Flow 4.3 Experimental Setup 4.4 Optimization Challenges and Approaches 5 Data Collection and Experimental Setup 5.1 Hardware Components 5.2 Data Recording and Design of Experiments 5.3 Data Collection 5.4 Post-Experiment 6 Implementation 6.1 Simulation Unit 6.2 Recommendation Scalar from Saliency Maps 6.3 Saliency Map Features as Guidance Mechanism 6.4 GradCam++ Enhanced Bayesian Optimization 6.5 Benchmarking Unit 6.6 Benchmarking 7 Results and Evaluation 7.1 Experiment Data Analysis 7.2 Recommendation Scalar 7.3 Benchmarking Results and Quantitative Analysis 7.3.1 Accuracy Results from the Benchmarking Process 7.3.2 Cumulative Results Interpretation 7.3.3 Analysis of Variability 7.4 Answering the Research Questions 7.5 Summary 8 Discussion 8.1 Critical Examination of Limitations 8.2 Discussion of Solutions to Limitations 8.3 Practice-Oriented Discussion of Findings 9 Summary and Outlook

/ Im Bereich der Prozessleittechnik und Robotik, speziell bei der automatischen Steuerung, treten oft komplexe Optimierungsprobleme auf. Diese Arbeit konzentriert sich auf die Optimierung der Kameraplatzierung in Anwendungen, die Convolutional Neural Networks (CNNs) verwenden. Da CNNs spezifische, für den Menschen nicht immer ersichtliche Merkmale in Bildern hervorheben, ist die intuitive Platzierung der Kamera oft nicht optimal.
Zwei Forschungsfragen leiten diese Arbeit: Die erste Frage untersucht die Rolle von Erklärbarer Künstlicher Intelligenz (XAI) in der Computer Vision zur Bereitstellung von Merkmalen für die Bewertung von Kamerapositionen. Die zweite Frage vergleicht einen darauf basierenden Algorithmus mit anderen Blackbox-Optimierungstechniken. Ein robotisches Auto-Positionierungssystem wird zur Datenerfassung und für Experimente eingesetzt. Als Lösungsansatz wird eine Methode vorgestellt, die XAI-Merkmale, insbesondere solche aus GradCAM++ Erkenntnissen, mit einem Bayesschen Optimierungsalgorithmus kombiniert. Diese Methode wird in einer Fallstudie zur Klassifizierung von Strömungsregimen in industriellen Bioreaktoren angewendet und zeigt eine gesteigerte Performance im Vergleich zu etablierten Methoden. Zukünftige Forschung wird sich auf die Sammlung weiterer Daten, die Inklusion von verrauschten Daten und die Konsultation von Experten für eine kostengünstigere Implementierung konzentrieren.
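The abstract does not spell out how the Grad-CAM++ signal enters the optimization, so the following is a hedged sketch of one plausible reading: a Gaussian-process expected-improvement loop over a one-dimensional camera position, with the acquisition weighted by a saliency-derived recommendation scalar. The objective, the saliency score, and all names are toy stand-ins, not the thesis' implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(1)

def cnn_accuracy(pos):
    """Stand-in for training/evaluating the CNN at camera position `pos`."""
    return np.exp(-((pos - 0.6) ** 2) / 0.05) + 0.01 * rng.normal()

def saliency_score(pos):
    """Stand-in for a Grad-CAM++ recommendation scalar at `pos`
    (e.g. how much saliency mass falls on the region of interest)."""
    return np.exp(-((pos - 0.55) ** 2) / 0.2)

X = list(rng.uniform(0, 1, 3))
Y = [cnn_accuracy(x) for x in X]
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(15):
    gp.fit(np.array(X)[:, None], Y)
    cand = rng.uniform(0, 1, 256)
    mu, sd = gp.predict(cand[:, None], return_std=True)
    best = max(Y)
    z = (mu - best) / np.maximum(sd, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)           # expected improvement
    acq = ei * np.array([saliency_score(c) for c in cand])       # weight EI by the XAI signal
    x_next = cand[np.argmax(acq)]
    X.append(x_next)
    Y.append(cnn_accuracy(x_next))

print("best position:", X[int(np.argmax(Y))], "accuracy:", max(Y))
```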
33

Probing Human Category Structures with Synthetic Photorealistic Stimuli

Chang Cheng, Jorge 08 September 2022 (has links)
No description available.
34

Model Based Transfer Learning Across Nanomanufacturing Processes and Bayesian Optimization for Advanced Modeling of Mixture Data

Yueyun Zhang (18183583) 24 June 2024 (has links)
<p dir="ltr">Broadly, the focus of this work is on efficient statistical estimation and optimization of data arising from experimental data, particularly motivated by nanomanufacturing experiments on the material tellurene. Tellurene is a novel material for transistors with reliable attributes that enhance the performance of electronics (e.g., nanochip). As a solution-grown product, two-dimensional (2D) tellurene can be manufactured through a scalable process at a low cost. There are three main throughlines to this work, data augmentation, optimization, and equality constraint, and three distinct methodological projects, each of which addresses a subset of these throughlines. For the first project, I apply transfer learning in the analysis of data from a new tellurene experiment (process B) using the established linear regression model from a prior experiment (process A) from a similar study to combine the information from both experiments. The key of this approach is to incorporate the total equivalent amounts (TEA) of a lurking variable (experimental process changes) in terms of an observed (base) factor that appears in both experimental designs into the prespecified linear regression model. The results of the experimental data are presented including the optimal PVP chain length for scaling up production through a larger autoclave size. For the second project, I develop a multi-armed bandit Bayesian optimization (BO) approach to incorporate the equality constraint that comes from a mixture experiment on tellurium nanoproduct and account for factors with categorical levels. A more complex optimization approach was necessitated by the experimenters’ use of a neural network regression model to estimate the response surface. Results are presented on synthetic data to validate the ability of BO to recover the optimal response and its efficiency is compared to Monte Carlo random sampling to understand the level of experimental design complexity at which BO begins to pay off. The third project examines the potential enhancement of parameter estimation by utilizing synthetic data generated through Generative Adversarial Networks (GANs) to augment experimental data coming from a mixture experiment with a small to moderate number of runs. Transfer learning shows high promise for aiding in tellurene experiments, BO’s value increases with the complexity of the experiment, and GANs performed poorly on smaller experiments introducing bias to parameter estimates.</p>
35

Optimization of convolutional neural networks for image classification using genetic algorithms and Bayesian optimization

Rawat, Waseem 01 1900 (has links)
Notwithstanding the recent successes of deep convolutional neural networks for classification tasks, they are sensitive to the selection of their hyperparameters, which impose an exponentially large search space on modern convolutional models. Traditional hyperparameter selection methods include manual, grid, or random search, but these require expert knowledge or are computationally burdensome. In contrast, Bayesian optimization and evolutionary-inspired techniques have surfaced as viable alternatives to the hyperparameter problem. Thus, an alternative hybrid approach that combines the advantages of these techniques is proposed. Specifically, the search space is partitioned into a discrete architectural subspace and a continuous and categorical hyperparameter subspace, which are respectively traversed by a stochastic genetic search, followed by a genetic-Bayesian search. Simulations on a prominent image classification task reveal that the proposed method results in an overall classification accuracy improvement of 0.87% over unoptimized baselines, and a greater than 97% reduction in computational costs compared to a commonly employed brute force approach. / Electrical and Mining Engineering / M. Tech. (Electrical Engineering)
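To make the two-stage idea concrete, below is a minimal sketch (not the thesis' algorithm): a small genetic search over a discrete architectural subspace, followed by Gaussian-process Bayesian optimization over the continuous hyperparameters, with a synthetic evaluate() standing in for actually training a CNN. All names and ranges are hypothetical.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(3)

def evaluate(n_layers, n_filters, lr=1e-3, dropout=0.3):
    """Stand-in for training a CNN and returning validation accuracy."""
    arch = -0.02 * abs(n_layers - 4) - 0.0005 * abs(n_filters - 96)
    cont = -0.05 * (np.log10(lr) + 3) ** 2 - 0.1 * (dropout - 0.3) ** 2
    return 0.9 + arch + cont + 0.005 * rng.normal()

# Stage 1: genetic search over the discrete architectural subspace (layers, filters).
pop = [(int(rng.integers(2, 9)), int(rng.choice([32, 64, 96, 128]))) for _ in range(8)]
for _ in range(5):
    parents = sorted(pop, key=lambda a: evaluate(*a), reverse=True)[:4]
    children = []
    for _ in range(4):
        i, j = rng.choice(len(parents), 2, replace=False)
        child = [parents[i][0], parents[j][1]]                 # crossover
        if rng.random() < 0.3:                                 # mutation on layer count
            child[0] = int(np.clip(child[0] + rng.integers(-1, 2), 2, 8))
        children.append(tuple(child))
    pop = parents + children
best_arch = max(pop, key=lambda a: evaluate(*a))

# Stage 2: GP-based Bayesian optimization over (log10 learning rate, dropout).
X = rng.uniform([-4, 0.0], [-2, 0.6], size=(4, 2))
Y = [evaluate(*best_arch, lr=10 ** x[0], dropout=x[1]) for x in X]
gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(12):
    gp.fit(X, Y)
    cand = rng.uniform([-4, 0.0], [-2, 0.6], size=(256, 2))
    mu, sd = gp.predict(cand, return_std=True)
    z = (mu - max(Y)) / np.maximum(sd, 1e-9)
    ei = (mu - max(Y)) * norm.cdf(z) + sd * norm.pdf(z)        # expected improvement
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    Y.append(evaluate(*best_arch, lr=10 ** x_next[0], dropout=x_next[1]))

print("best architecture:", best_arch, "best accuracy:", max(Y))
```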
36

Optimisation des paramètres de carbone de sol dans le modèle CLASSIC à l'aide d'optimisation bayésienne et d'observations

Gauthier, Charles 04 1900 (has links)
Le réservoir de carbone de sol est un élément clé du cycle global du carbone et donc du système climatique. Les sols et le carbone organique qu'ils contiennent constituent le plus grand réservoir de carbone des écosystèmes terrestres. Ce réservoir est également responsable du stockage d'une grande quantité de carbone prélevé de l'atmosphère par les plantes par la photosynthèse. C'est pourquoi les sols sont considérés comme une stratégie de mitigation viable pour réduire la concentration atmosphérique de CO2 due aux émissions globales de CO2 d'origine fossile. Malgré son importance, des incertitudes subsistent quant à la taille du réservoir global de carbone organique de sol et à ses dynamiques. Les modèles de biosphère terrestre sont des outils essentiels pour quantifier et étudier la dynamique du carbone organique de sol. Ces modèles simulent les processus biophysiques et biogéochimiques au sein des écosystèmes et peuvent également simuler le comportement futur du réservoir de carbone organique de sol en utilisant des forçages météorologiques appropriés. Cependant, de grandes incertitudes dans les projections faites par les modèles de biosphère terrestre sur les dynamiques du carbone organique de sol ont été observées, en partie dues au problème de l'équifinalité. Afin d'améliorer notre compréhension de la dynamique du carbone organique de sol, cette recherche visait à optimiser les paramètres du schéma de carbone de sol contenu dans le modèle de schéma canadien de surface terrestre incluant les cycles biogéochimiques (CLASSIC), afin de parvenir à une meilleure représentation de la dynamique du carbone organique de sol. Une analyse de sensibilité globale a été réalisée pour identifier lesquels parmi les 16 paramètres du schéma de carbone de sol n'affectaient pas la simulation du carbone organique de sol et de la respiration du sol. L'analyse de sensibilité a utilisé trois sites de covariance des turbulences afin de représenter différentes conditions climatiques simulées par le schéma de carbone de sol et de réduire le coût calculatoire de l'analyse. L'analyse de sensibilité a démontré que certains paramètres du schéma de carbone de sol ne contribuent pas à la variance des simulations du carbone organique de sol et de la respiration du sol. Ce résultat a permis de réduire la dimensionnalité du problème d'optimisation. Ensuite, quatre scénarios d'optimisation ont été élaborés sur la base de l'analyse de sensibilité, chacun utilisant un ensemble de paramètres différent. Deux fonctions coûts ont été utilisées pour l'optimisation de chacun des scénarios. L'optimisation a également démontré que la fonction coût utilisée avait un impact sur les ensembles de paramètres optimisés. Les ensembles de paramètres obtenus à partir des différents scénarios et fonctions coûts ont été comparés à des ensembles de données indépendants et à des estimations globales du carbone organique de sol à l'aide de métriques telles que la racine de l'erreur quadratique moyenne et le biais, afin d'évaluer l'effet des ensembles de paramètres sur les simulations effectuées par le schéma de carbone de sol. Un ensemble de paramètres a surpassé les autres ensembles de paramètres optimisés ainsi que le paramétrage par défaut du modèle.
Ce résultat a indiqué que la structure d'optimisation était en mesure de produire un ensemble de paramètres qui simulait des valeurs de carbone organique de sol et de respiration du sol qui étaient plus près des valeurs observées que le modèle CLASSIC par défaut, améliorant la représentation de la dynamique du carbone du sol. Cet ensemble de paramètres optimisés a ensuite été utilisé pour effectuer des simulations futures (2015-2100) de la dynamique du carbone organique de sol afin d'évaluer son impact sur les projections de CLASSIC. Les simulations futures ont montré que l'ensemble de paramètres optimisés simulait une quantité de carbone organique de sol 62 % plus élevée que l'ensemble de paramètres par défaut tout en simulant des flux de respiration du sol similaires. Les simulations futures ont également montré que les ensembles de paramètres optimisés et par défaut prévoyaient que le réservoir de carbone organique de sol demeurerait un puits de carbone net d'ici 2100 avec des sources nettes régionales. Cette étude a amélioré globalement la représentation de la dynamique du carbone organique de sol dans le schéma de carbone de sol de CLASSIC en fournissant un ensemble de paramètres optimisés. Cet ensemble de paramètres devrait permettre d'améliorer notre compréhension de la dynamique du carbone du sol. / The soil carbon pool is a vital component of the global carbon cycle and, therefore, the climate system. Soil organic carbon (SOC) is the largest carbon pool in terrestrial ecosystems. This pool stores a large quantity of carbon that plants have removed from the atmosphere through photosynthesis. Because of this, soils are considered a viable climate change mitigation strategy to lower the global atmospheric CO2 concentration that is presently being driven higher by anthropogenic fossil CO2 emissions. Despite its importance, there are still considerable uncertainties around the size of the global SOC pool and its response to changing climate. Terrestrial biosphere models (TBMs) simulate the biogeochemical processes within ecosystems and are critical tools to quantify and study SOC dynamics. These models can also simulate the future behavior of SOC if carefully applied and given the proper meteorological forcings. However, TBM predictions of SOC dynamics have high uncertainties due in part to equifinality. To improve our understanding of SOC dynamics, this research optimized the parameters of the soil carbon scheme contained within the Canadian Land Surface Scheme Including Biogeochemical Cycles (CLASSIC), to better represent SOC dynamics. A global sensitivity analysis was performed to identify which of the 16 parameters of the soil carbon scheme did not affect simulated SOC stocks and soil respiration (Rsoil). The sensitivity analysis used observations from three eddy covariance sites for computational efficiency and to encapsulate the climate represented by the global soil carbon scheme. The sensitivity analysis revealed that some parameters of the soil carbon scheme did not contribute to the variance of simulated SOC and Rsoil. These parameters were excluded from the optimization, which helped reduce the dimensionality of the optimization problem. Then, four optimization scenarios were created based on the sensitivity analysis, each using a different set of parameters to assess the impact that the number of included parameters had on the optimization. Two different loss functions were used in the optimization to assess the impact of accounting for observational error.
Comparing the optimal parameters between the optimizations performed using the different loss functions showed that the loss functions impacted the optimized parameter sets. To determine which optimized parameter set obtained by each loss function was most skillful, they were compared to independent data sets and global estimates of SOC that were not used in the optimization, using comparison metrics based on root-mean-square deviation and bias. This study generated an optimal parameter set that outperformed the default parameterization of the model. This optimal parameter set was then applied in future simulations of SOC dynamics to assess its impact on CLASSIC's future projections. These future simulations showed that the optimal parameter set simulated future global SOC content 62 % higher than the default parameter set while simulating similar Rsoil fluxes. The future simulations also showed that both the optimized and default parameter sets projected that the SOC pool would remain a net sink by 2100, with regional net sources, notably in tropical regions.
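The abstract does not name the sensitivity-analysis method or software, so the following is a hedged sketch of one standard choice for a variance-based global sensitivity analysis: Sobol indices computed with the SALib package on a toy stand-in for the soil carbon scheme. The model, parameter names, and bounds are hypothetical; the point is only the screening step, where parameters with near-zero total-order indices are frozen at defaults and dropped from the optimization.

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

def soil_model(p):
    """Toy stand-in for the soil carbon scheme: 'simulated SOC' depends on the
    first two parameters and not at all on the last two."""
    k_decomp, q10, depth_scale, unused = p
    return 100.0 / k_decomp * np.exp(-0.1 * q10) + 0.0 * depth_scale + 0.0 * unused

problem = {
    "num_vars": 4,
    "names": ["k_decomp", "q10", "depth_scale", "unused"],
    "bounds": [[0.5, 2.0], [1.5, 3.0], [0.1, 1.0], [0.0, 1.0]],
}

X = saltelli.sample(problem, 1024)             # Saltelli design for Sobol indices
Y = np.array([soil_model(x) for x in X])
Si = sobol.analyze(problem, Y)
for name, s1, st in zip(problem["names"], Si["S1"], Si["ST"]):
    print(f"{name:12s}  first-order={s1:5.2f}  total={st:5.2f}")
# Parameters whose total-order index is near zero (here depth_scale and unused)
# would be excluded from the subsequent optimization.
```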
37

Far Field EM Side-Channel Attack Based on Deep Learning with Automated Hyperparameter Tuning

Liu, Keyi January 2021 (has links)
Side-channel attacks have become a realistic threat to the implementations of cryptographic algorithms. By analyzing the unintentional, side-channel leakage, the attacker is able to recover the secret of the target. Recently, a new type of side-channel leakage has been discovered, called far field EM emissions. Unlike attacks based on near field EM emissions or power consumption, the attack based on far field EM emissions is able to extract the secret key from the victim device at a distance of several meters. However, existing deep-learning attacks based on far field EM commonly use a random or grid search method to optimize neural networks' hyperparameters. Recently, an automated approach to deep-learning hyperparameter tuning based on the Auto-Keras library, called the AutoSCA framework, was applied to near-field EM attacks. In this work, we investigate whether AutoSCA could help far field EM side-channel attacks. In our experiments, the target is a Bluetooth-5-supported Nordic Semiconductor nRF52832 development kit implementation of the Advanced Encryption Standard (AES). Our experiments show that, by using a deep-learning model generated by the AutoSCA framework, we need 485 traces on average to recover a subkey from traces captured at a distance of 15 meters from the victim device without repeating each encryption. For the same conditions, the state-of-the-art method uses 510 traces. Furthermore, our model contains only 667,433 trainable parameters in total, implying that it requires roughly 9 times fewer training resources compared to the larger models in the previous work. / Angrepp på sidokanaler har blivit ett realistiskt hot mot implementeringen av kryptografiska algoritmer. Genom att analysera det oavsiktliga läckaget kan angriparen hitta hemligheten bakom målet. Nyligen har en ny typ av sidokanalläckage upptäckts, kallad fjärrfälts-EM-utsläpp. Till skillnad från attacker baserade på närfälts-EM-utsläpp eller energiförbrukning kan attacken baserad på fjärrfälts-EM-utsläpp extrahera den hemliga nyckeln från den utsatta anordningen på flera meters avstånd. Men befintliga djupinlärningsattacker baserade på fjärrfälts-EM använder ofta en slumpmässig sökmetod eller rutnätssökning för att optimera nervnätens hyperparametrar. Nyligen tillämpades ett automatiserat sätt för inställning av djupinlärningens hyperparametrar baserat på Auto-Keras-biblioteket, kallat AutoSCA-ramverket, vid EM-angrepp i närfältet. I det här arbetet undersöker vi om AutoSCA kan hjälpa till med fjärrfälts-EM-angrepp. I våra experiment är målet en Bluetooth-5-stödd Nordic Semiconductor nRF52832-utvecklingsutrustning som implementerar Advanced Encryption Standard (AES). Våra experiment visar att genom att använda en djupinlärningsmodell skapad av AutoSCA-ramverket behöver vi i genomsnitt 485 spår för att hämta en subnyckel från spår tagna på 15 meters avstånd från offrets apparat utan att upprepa varje kryptering. Under samma förhållanden använder den senaste metoden 510 spår. Dessutom innehåller vår modell totalt bara 667 433 träningsbara parametrar, vilket innebär att det krävs ungefär nio gånger mindre utbildningsresurser jämfört med de större modellerna i det tidigare arbetet.
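The AutoSCA framework itself builds on Auto-Keras; as a stand-in illustration of automated hyperparameter search for a profiling model, here is a minimal sketch using the keras-tuner package's Bayesian search on synthetic traces. The trace length, class count (one key byte, 256 values), search ranges, and all names are assumptions for illustration, not the framework's actual configuration.

```python
import numpy as np
import tensorflow as tf
import keras_tuner as kt

# Synthetic stand-in data: 2000 traces of 500 samples each, labelled with a
# key-byte-dependent intermediate value (256 classes), as in a profiling attack.
rng = np.random.default_rng(4)
x = rng.normal(size=(2000, 500)).astype("float32")
y = rng.integers(0, 256, size=2000)

def build_model(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(500,)))
    for i in range(hp.Int("num_layers", 1, 3)):                       # searched depth
        model.add(tf.keras.layers.Dense(hp.Int(f"units_{i}", 64, 512, step=64),
                                        activation="relu"))
    model.add(tf.keras.layers.Dense(256, activation="softmax"))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.BayesianOptimization(build_model, objective="val_accuracy",
                                max_trials=10, overwrite=True)
tuner.search(x, y, validation_split=0.2, epochs=5, verbose=0)
best_model = tuner.get_best_models(1)[0]
```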
38

Bayesian Off-policy Sim-to-Real Transfer for Antenna Tilt Optimization

Larsson Forsberg, Albin January 2021 (has links)
Choosing the correct angle of electrical tilt in a radio base station is essential when optimizing for coverage and capacity. A reinforcement learning agent can be trained to make this choice. If the training of the agent in the real world is restricted or even impossible, alternative methods can be used. Training in simulation combined with an approximation of the real world is one option, which comes with a set of challenges associated with the reality gap. In this thesis, a method based on Bayesian optimization is implemented to tune the environment in which domain randomization is performed, in order to improve the quality of the simulation training. The results show that using Bayesian optimization to find a good subset of parameters works even when access to the real world is constrained. Two off-policy estimators based on inverse propensity scoring and direct method evaluation, in combination with an offline dataset of previously collected cell traces, were tested. The method manages to find an isolated subspace of the whole domain that optimizes the randomization while still giving good performance in the target domain. / Rätt val av elektrisk antennvinkel för en radiobasstation är avgörande vid optimering av täckning och kapacitet (eng. coverage and capacity optimization). En förstärkningsinlärningsagent kan tränas att göra detta val. Om träning av agenten i verkligheten är besvärlig eller till och med omöjlig att genomföra kan olika alternativa metoder användas. Simuleringsträning kombinerad med en skattningsmodell av verkligheten är ett alternativ som har olika utmaningar kopplade till klyftan mellan simulering och verklighet (eng. reality gap). I denna avhandling implementeras en lösning baserad på Bayesiansk optimering med syftet att anpassa miljön som domänrandomisering sker i, för att förbättra kvaliteten på simuleringsträningen. Resultatet visar att Bayesiansk optimering kan användas för att hitta ett urval av fungerande parametrar även när tillgången till den faktiska verkligheten är begränsad. Två skattningsmodeller baserade på invers propensitetsviktning och direktmetodutvärdering, i kombination med ett tidigare insamlat dataset av nätverksdata, testades. Den tillämpade metoden lyckas hitta ett isolerat delrum av parameterrymden som optimerar randomiseringen samtidigt som prestationen i verkligheten hålls på en god nivå.
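To illustrate the two off-policy estimators mentioned above (inverse propensity scoring and the direct method), here is a minimal sketch on synthetic logged data; the tilt actions, reward signal, and policies are toy stand-ins, not the thesis' cell traces or agents.

```python
import numpy as np

rng = np.random.default_rng(5)

# Logged data from the behaviour policy: context, chosen tilt action, its logging
# probability, and the observed reward (e.g. a coverage/capacity KPI).
n, n_actions = 5000, 5
contexts = rng.normal(size=(n, 3))
logged_probs = np.full(n, 1.0 / n_actions)                 # behaviour policy: uniform
actions = rng.integers(0, n_actions, size=n)
rewards = 1.0 / (1.0 + np.abs(actions - 2)) + 0.05 * rng.normal(size=n)

def target_policy_prob(context, action):
    """Probability the evaluated (target) policy picks `action`; toy rule that
    prefers the middle tilt regardless of context."""
    prefs = np.exp(-np.abs(np.arange(n_actions) - 2))
    return prefs[action] / prefs.sum()

# Inverse propensity scoring (IPS): re-weight logged rewards by the ratio of
# target to logging probabilities.
w = np.array([target_policy_prob(c, a) for c, a in zip(contexts, actions)]) / logged_probs
v_ips = np.mean(w * rewards)

# Direct method (DM): fit a reward model on the logged data, then average its
# predictions under the target policy.
reward_model = np.array([rewards[actions == a].mean() for a in range(n_actions)])
policy_probs = np.array([[target_policy_prob(c, a) for a in range(n_actions)] for c in contexts])
v_dm = np.mean(policy_probs @ reward_model)

print(f"IPS estimate: {v_ips:.3f}   DM estimate: {v_dm:.3f}")
```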
39

Efficient Sequential Sampling for Neural Network-based Surrogate Modeling

Pavankumar Channabasa Koratikere (15353788) 27 April 2023 (has links)
Gaussian Process Regression (GPR) is a widely used surrogate model in efficient global optimization (EGO) due to its capability to provide uncertainty estimates in the prediction. The cost of creating a GPR model for large data sets is high. On the other hand, neural network (NN) models scale better compared to GPR as the number of samples increases. Unfortunately, uncertainty estimates for NN predictions are not readily available. In this work, a scalable algorithm is developed for EGO using NN-based prediction and uncertainty (EGONN). Initially, two different NNs are created using two different data sets. The first NN models the output based on the input values in the first data set, while the second NN models the prediction error of the first NN using the second data set. The next infill point is added to the first data set based on criteria like expected improvement or prediction uncertainty. EGONN is demonstrated on the optimization of the Forrester function and a constrained Branin function and is compared with EGO. The convergence criterion is based on the maximum number of infill points in both cases. The algorithm is able to reach the optimum point within the given budget. EGONN is extended to handle constraints explicitly and is utilized for aerodynamic shape optimization of the RAE 2822 airfoil in transonic viscous flow at a free-stream Mach number of 0.734 and a Reynolds number of 6.5 million. The results obtained from EGONN are compared with the results from gradient-based optimization (GBO) using adjoints. The optimum shape obtained from EGONN is comparable to the shape obtained from GBO and is able to eliminate the shock. The drag coefficient is reduced from 200 drag counts to 114, which is close to the 110 drag counts obtained from GBO. EGONN is also extended to handle uncertainty quantification (uqEGONN) using prediction uncertainty as an infill method. The convergence criterion is based on the relative change of summary statistics such as the mean and standard deviation of an uncertain quantity. uqEGONN is tested on the Ishigami function with an initial sample size of 100 samples, and the algorithm terminates after 70 infill points. The statistics obtained from uqEGONN (using only 170 function evaluations) are close to the values obtained from directly evaluating the function one million times. uqEGONN is then demonstrated by quantifying the uncertainty in airfoil performance due to geometric variations. The algorithm terminates within 100 computational fluid dynamics (CFD) analyses, and the statistics obtained from the algorithm are close to those obtained from 1000 direct CFD-based evaluations.
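The description above is enough to sketch the core EGONN loop: one network predicts the objective from the first data set, a second network predicts the first network's error on the second data set, and that error estimate plays the role of uncertainty in an expected-improvement infill criterion. The sketch below uses scikit-learn MLPs on the Forrester function; network sizes, sample counts, and the exact infill rule are assumptions, not the thesis' implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)

def forrester(x):                          # 1-D test function mentioned in the abstract
    return (6 * x - 2) ** 2 * np.sin(12 * x - 4)

# Two separate data sets, as in the EGONN description.
x1 = rng.uniform(0, 1, 15); y1 = forrester(x1)     # trains the prediction NN
x2 = rng.uniform(0, 1, 15); y2 = forrester(x2)     # trains the error NN

for _ in range(15):
    f_net = MLPRegressor((64, 64), max_iter=2000).fit(x1[:, None], y1)
    # Second NN models the absolute prediction error of the first on data set 2,
    # standing in for a GP's uncertainty estimate.
    err = np.abs(f_net.predict(x2[:, None]) - y2)
    e_net = MLPRegressor((64, 64), max_iter=2000).fit(x2[:, None], err)

    cand = np.linspace(0, 1, 501)
    mu = f_net.predict(cand[:, None])
    sd = np.maximum(e_net.predict(cand[:, None]), 1e-9)
    best = y1.min()
    z = (best - mu) / sd
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement (minimisation)
    x_new = cand[np.argmax(ei)]
    x1 = np.append(x1, x_new); y1 = np.append(y1, forrester(x_new))   # infill into data set 1

print("best x:", x1[np.argmin(y1)], "f:", y1.min())
```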
40

Computationally Efficient Explainable AI: Bayesian Optimization for Computing Multiple Counterfactual Explanations / Beräkningsmässigt Effektiv Förklarbar AI: Bayesiansk Optimering för Beräkning av Flera Motfaktiska Förklaringar

Sacchi, Giorgio January 2023 (has links)
In recent years, advanced machine learning (ML) models have revolutionized industries ranging from the healthcare sector to retail and e-commerce. However, these models have become increasingly complex, making it difficult for even domain experts to understand and retrace the model's decision-making process. To address this challenge, several frameworks for explainable AI have been proposed and developed. This thesis focuses on counterfactual explanations (CFEs), which provide actionable insights by informing users how to modify inputs to achieve desired outputs. However, computing CFEs for a general black-box ML model is computationally expensive since it hinges on solving a challenging optimization problem. To efficiently solve this optimization problem, we propose using Bayesian optimization (BO), and introduce the novel algorithm Separated Bayesian Optimization (SBO). SBO exploits the formulation of the counterfactual function as a composite function. Additionally, we propose warm-starting SBO, which addresses the computational challenges associated with computing multiple CFEs. By decoupling the generation of a surrogate model for the black-box model and the computation of specific CFEs, warm-starting SBO allows us to reuse previous data and computations, resulting in computational savings and improved efficiency for large-scale applications. Through numerical experiments, we demonstrate that BO is a viable optimization scheme for computing CFEs for black-box ML models. BO achieves computational efficiency while maintaining good accuracy. SBO improves upon this by requiring fewer evaluations while achieving accuracies comparable to the best conventional optimizer tested. Both BO and SBO exhibit improved capabilities in handling various classes of ML decision models compared to the tested baseline optimizers. Finally, warm-starting SBO significantly enhances the performance of SBO, reducing function evaluations and errors when computing multiple sequential CFEs. The results indicate a strong potential for large-scale industry applications. / Avancerade maskininlärningsmodeller (ML-modeller) har under de senaste åren haft stora framgångar inom flera delar av näringslivet, med allt ifrån hälso- och sjukvårdssektorn till detaljhandel och e-handel. I jämn takt med denna utveckling har det dock även kommit en ökad komplexitet hos dessa ML-modeller, vilket nu lett till att även domänexperter har svårigheter med att förstå och tolka modellernas beslutsprocesser. För att bemöta detta problem har flertalet ramverk för förklarbar AI utvecklats. Denna avhandling fokuserar på kontrafaktuella förklaringar (CFEs). Detta är en förklaringstyp som anger för användaren hur denne bör modifiera sin indata för att uppnå ett visst modellbeslut. För en generell svarta-låda ML-modell är det dock beräkningsmässigt kostsamt att beräkna CFEs, då det krävs att man löser ett utmanande optimeringsproblem. För att lösa optimeringsproblemet föreslår vi användningen av Bayesiansk Optimering (BO), samt presenterar den nya algoritmen Separated Bayesian Optimization (SBO). SBO utnyttjar kompositionsformuleringen av den kontrafaktuella funktionen. Vidare utforskar vi beräkningen av flera sekventiella CFEs, för vilket vi presenterar varm-startad SBO. Varm-startad SBO lyckas återanvända data samt beräkningar från tidigare CFEs tack vare en separation av surrogat-modellen för svarta-låda ML-modellen och beräkningen av enskilda CFEs. Denna egenskap leder till en minskad beräkningskostnad samt ökad effektivitet för storskaliga tillämpningar.
I de genomförda experimenten visar vi att BO är en lämplig optimeringsmetod för att beräkna CFEs för svarta-låda ML-modeller tack vare en god beräkningseffektivitet kombinerat med hög noggrannhet. SBO presterade ännu bättre med i snitt färre funktionsutvärderingar och med felnivåer jämförbara med den bästa testade konventionella optimeringsmetoden. Både BO och SBO visade på bättre kapacitet att hantera olika klasser av ML-modeller än de andra testade metoderna. Slutligen observerade vi att varm-startad SBO gav ytterligare prestandaökningar med både minskade funktionsutvärderingar och fel när flera CFEs beräknades. Dessa resultat pekar på stor potential för storskaliga tillämpningar inom näringslivet.
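As a rough illustration of the separation and warm-starting ideas: the surrogate models the black-box classifier itself, the known counterfactual loss is applied on top of the surrogate's predictions, and the surrogate's evaluations are kept and reused across successive counterfactual queries. The black box, loss weights, acquisition rule, and all names below are hypothetical, not the thesis' SBO implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(7)

def black_box(x):
    """Stand-in for an ML decision model's score; only point evaluations allowed."""
    return 1.0 / (1.0 + np.exp(-(2 * x[..., 0] - x[..., 1])))

def cfe_loss(fx, x, x0, target=0.5, lam=0.5):
    """Composite counterfactual objective: hit the target score, stay close to x0."""
    return (fx - target) ** 2 + lam * np.sum((x - x0) ** 2, axis=-1)

# Surrogate of the *black box itself* (the "separated" part): it is shared and
# warm-started across all counterfactual queries.
X = rng.uniform(-2, 2, size=(10, 2))
F = black_box(X)
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

def compute_cfe(x0, n_iter=15):
    global X, F
    for _ in range(n_iter):
        gp.fit(X, F)
        cand = rng.uniform(-2, 2, size=(512, 2))
        mu, sd = gp.predict(cand, return_std=True)
        # Push the surrogate's prediction through the known outer loss and
        # subtract uncertainty (a simple lower-confidence-bound style acquisition).
        acq = cfe_loss(mu, cand, x0) - 1.0 * sd
        x_next = cand[np.argmin(acq)]
        X = np.vstack([X, x_next])
        F = np.append(F, black_box(x_next))
    losses = cfe_loss(F, X, x0)
    return X[np.argmin(losses)]

print("counterfactual 1:", compute_cfe(np.array([1.5, 1.0])))
print("counterfactual 2 (warm-started):", compute_cfe(np.array([-1.0, 0.5])))
```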
