1 |
Budget-constrained experimental optimization. Roshandelpoor, Athar. 27 May 2021.
Many problems of design and operation in science and engineering can be formulated as optimization of a properly defined performance/objective function over a design space. This thesis considers optimization problems where information about the performance function can be obtained only through experimentation/function evaluation, in other words, optimization of black-box functions. Furthermore, it is assumed that the optimization is performed with a limited budget, that is, where only a limited number of function evaluations are feasible.
Two classes of optimization approaches are considered. The first, consisting of Design of Experiments (DOE) and Response Surface Methodology (RSM), explores the design space locally by identifying directions of improvement and incrementally moving towards the optimum. The second, referred to as Bayesian Optimization (BO), corresponds to a global search of the design space based on a stochastic model of the function that is updated after each experimentation/function evaluation.
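As a concrete illustration of the BO loop just described, here is a minimal sketch of the model-sample-update cycle; the toy objective, the GP surrogate settings, and the expected-improvement acquisition are illustrative assumptions, not this thesis's method:

```python
# A minimal Bayesian optimization loop: refit a GP surrogate after each
# evaluation and let expected improvement (EI) pick the next point.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def f(x):                                    # stand-in for an expensive black box
    return np.sin(3 * x) + 0.5 * x ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))          # small initial design
y = f(X).ravel()
grid = np.linspace(-2, 2, 401).reshape(-1, 1)

for _ in range(10):                          # limited evaluation budget
    gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-6, normalize_y=True).fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    best = y.min()                           # minimizing f
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)
    x_next = grid[np.argmax(ei)].reshape(1, 1)
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

print("best design found:", X[np.argmin(y)], "value:", y.min())
```

Each pass updates the stochastic model with the newest observation, which is exactly the global-search behavior contrasted here with DOE/RSM's local, incremental moves.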
Two independent projects related to the above optimization approaches are reported in the thesis. The first, the result of a collaborative effort with experimental and computational materials scientists, involves adaptations of the above approaches to two specific new-materials development projects. The goal of the first project was to develop an integrated computational-statistical-experimental methodology for calibration of an activated carbon adsorption bed. The second project consisted of the application and modification of existing DOE approaches to a highly data-limited environment.
The second part consists of a new contribution to the methodology of Bayesian Optimization (BO), significantly generalizing a non-myopic approach to BO. Different BO algorithms vary in their choice of stochastic model of the unknown objective function, referred to as the surrogate model, and of the so-called acquisition function, which often represents an expected utility of sampling at various points of the design space. Various myopic BO approaches, which evaluate the benefit of taking only a single sample from the objective function, have been considered in the literature. More recently, a number of non-myopic approaches have been proposed that go beyond evaluating the benefit of a single sample. In this thesis, a non-myopic approach/algorithm, referred to as the z* policy, is considered that takes a different approach to evaluating the benefits of sampling. The resulting search approach is motivated by a non-myopic index policy in a sequential sampling problem that is shown to be optimal in a non-adaptive setting. An analysis of the z* policy is presented, and it is placed within the broader context of non-myopic policies. Finally, empirical evaluations show that in some instances the z* policy outperforms a number of other commonly used myopic and non-myopic policies.
|
2 |
Worlds Collide through Gaussian Processes: Statistics, Geoscience and Mathematical Programming. Christianson, Ryan Beck. 04 May 2023.
Gaussian process (GP) regression is the canonical method for nonlinear spatial modeling among the statistics and machine learning communities. Geostatisticians use a subtly different technique known as kriging. I shall highlight key similarities and differences between GPs and kriging through the use of large-scale gold mining data. Most importantly, GPs are largely hands-off, automatically learning from the data, whereas kriging requires an expert human in the loop to guide analysis. To emphasize this, I show an imputation method for left-censored values frequently seen in mining data. Oftentimes geologists ignore censored values due to the difficulty of imputing with kriging, but GPs execute imputation with relative ease, leading to better estimates of the gold surface. My hope is that this research can serve as a springboard to encourage the mining community to consider using GPs over kriging for the diverse uses available after GP model fitting. Another common use of GPs that would be inefficient for kriging is Bayesian Optimization (BO). Traditionally, BO is designed to find a global optimum by sequentially sampling from a function of interest using an acquisition function. When two or more local or global optima of the function of interest have similar objective values, it often makes sense to target the more "robust" solution with a wider domain of attraction. However, traditional BO weighs these solutions the same, favoring whichever has a slightly better objective value. By combining the idea of expected improvement (EI) from the BO community with mathematical programming's concept of an adversary, I introduce a novel algorithm to target robust solutions called robust expected improvement (REI). The adversary penalizes "peaked" areas of the objective function, making those values appear less desirable. REI performs acquisitions using EI on the adversarial space, yielding data sets focused on the robust solution that exhibit EI's already proven excellent balance of exploration and exploitation. / Doctor of Philosophy / Since its origins in the 1940s, spatial statistics modeling has adapted to fit different communities. The geostatistics community developed with an emphasis on modeling mining operations and has further evolved to cover a slew of different applications, largely focused on two or three physical dimensions. The computer experiments community developed later, when these physical experiments started moving into the virtual realm with advances in computer technology. While birthed from the same foundation, computer experimenters often look at ten- or even higher-dimensional problems. Due to these differences among others, each community tailored its methods to best fit its common problems. My research compares the modern instantiations of the differing methodologies on two sets of real gold mining data. Ultimately, I prefer the computer experiments methods for their ease of adaptation to downstream tasks at no cost to model performance. A statistical model is almost never a standalone development; it is created with a specific goal in mind. The first case I show of this is "imputation" of mining data. Mining data often have a detection threshold such that any observation with a very small mineral concentration is recorded at the threshold. Frequently, geostatisticians simply throw out these observations because they cause problems in modeling.
Statisticians try to use the information that there is a low concentration, combined with the rest of the fully observed data, to derive a best guess at the concentration of thresholded locations. Under the geostatistics framework this is cumbersome, but the computer experiments community considers imputation an easy extension (a toy illustration follows below). Another common model task is creating an experiment to best learn a surface. The surface may be a gold deposit on Earth, an unknown virtual function, or anything measurable. To do this, computer experimenters often use "active learning": sampling one point at a time, using that point to generate a better-informed model which suggests a new point to sample, and repeating until a satisfactory number of points are sampled. Geostatisticians often prefer "one-shot" experiments, deciding all samples prior to collecting any; thus the geostatistics framework is not appropriate for active learning. Active learning typically tries to find the "best" location of the surface, the one with either the maximum or minimum response. I adapt this problem by redefining "best" as a "robust" location, where the response does not change much even if the location is not perfectly specified. As an example, consider setting operating conditions for a factory. If two settings produce a similar amount of product, but one requires an exact pressure setting or else it blows up the factory, the other is certainly preferred. To design experiments to find robust locations, I borrow ideas from the mathematical programming community to develop a novel method for robust active learning.
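A toy illustration of the censored-value imputation idea above, emphatically not the dissertation's actual method; the detection limit and the GP predictive mean/standard deviation are invented numbers:

```python
# Impute a left-censored assay: the true value is known only to lie below the
# detection limit, so sample from the predictive normal truncated to
# [0, detection_limit] rather than recording the limit itself.
import numpy as np
from scipy.stats import truncnorm

detection_limit = 0.05          # hypothetical units, e.g. grams per tonne
mu, sd = 0.03, 0.02             # hypothetical GP predictive mean / std there

a = (0.0 - mu) / sd             # concentrations cannot be negative
b = (detection_limit - mu) / sd
draws = truncnorm.rvs(a, b, loc=mu, scale=sd, size=1000, random_state=0)
print("imputed concentration (posterior mean):", draws.mean())
```

Averaging such draws gives a better plug-in value than either discarding the observation or recording the detection limit itself.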
|
3 |
Gaussian Processes for Power System Monitoring, Optimization, and Planning. Jalali, Mana. 26 July 2022.
The proliferation of renewables, electric vehicles, and power electronic devices calls for innovative approaches to learn, optimize, and plan the power system. The uncertain and volatile nature of the integrated components necessitates using swift and probabilistic solutions.
Gaussian process regression is a machine learning paradigm that provides closed-form predictions with quantified uncertainties. A key property of Gaussian processes is their natural ability to integrate the sensitivity of the labels with respect to features, yielding improved accuracy. This dissertation tailors Gaussian process regression to three applications in power systems. First, a physics-informed approach is introduced to infer the grid dynamics using synchrophasor data with minimal network information. The suggested method is useful for a wide range of applications, including prediction, extrapolation, and anomaly detection. Further, the proposed framework accommodates heterogeneous noisy measurements with missing entries. Second, a learn-to-optimize scheme is presented in which Gaussian process regression predicts the optimal power flow minimizers given grid conditions.
The main contribution is leveraging sensitivities to expedite learning and achieve data efficiency without compromising computational efficiency. Third, Bayesian optimization is applied to solve a bi-level minimization used for strategic investment in electricity markets.
This method relies on modeling the cost of the outer problem as a Gaussian process and is applicable to non-convex and hard-to-evaluate objective functions. The designed algorithm shows significant improvement in speed while attaining a lower cost than existing methods. / Doctor of Philosophy / The proliferation of renewables, electric vehicles, and power electronic devices calls for innovative approaches to learn, optimize, and plan the power system. The uncertain and volatile nature of the integrated components necessitates using swift and probabilistic solutions.
This dissertation focuses on three practically important problems stemming from power system modernization. First, a novel approach is proposed that improves power system monitoring, which is the first and necessary step for the stable operation of the network.
The suggested method applies to a wide range of applications and accommodates heterogeneous, noisy measurements with missing entries. The second problem focuses on predicting the minimizers of an optimization task, and a computationally efficient framework is put forth to expedite this process. The third part of this dissertation identifies investment portfolios for electricity markets that yield maximum revenue and minimum cost.
|
4 |
Bayesian optimization with empirical constraints. Azimi, Javad. 05 September 2012.
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(·) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are then added to the set of observations and the procedure is repeated until a stopping criterion is met. The goal is to optimize the function f(·) with a small number of experiment evaluations. While this problem has been extensively studied, most existing approaches ignore some real-world constraints frequently encountered in practical applications. In this thesis, we extend the BO framework in a number of important directions to incorporate some of these constraints.
First, we introduce a constrained BO framework where instead of selecting a precise point at each iteration, we request a constrained experiment that is characterized by a hyper-rectangle in the input space. We introduce efficient sequential and non-sequential algorithms to select a set of constrained experiments that best optimize f(·) within a given budget. Second, we introduce one of the first attempts in batch BO where instead of selecting one experiment at each iteration, a set of k > 1 experiments is selected. This can significantly speed up the overall running time of BO. Third, we introduce scheduling algorithms for the BO framework when: 1) it is possible to run concurrent experiments; 2) the durations of experiments are stochastic, but with a known distribution; and 3) there is a limited number of experiments to run in a fixed amount of time. We propose both online and offline scheduling algorithms that effectively handle these constraints. Finally, we introduce a hybrid BO approach which switches between the sequential and batch modes. The proposed hybrid approach provides a substantial speedup over sequential policies without significant performance loss. / Graduation date: 2013
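For context on the batch setting, here is a hedged sketch of greedy batch selection via the well-known "constant liar" heuristic (Ginsbourger et al.); it illustrates the general batch-BO idea only and is not the algorithm proposed in this thesis:

```python
# Greedy batch selection with the "constant liar" heuristic: pick each of the
# k points by maximizing EI, then append a fake observation ("lie") so the
# refit surrogate pushes the remaining picks apart.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expected_improvement(gp, cand, best):
    mu, sd = gp.predict(cand, return_std=True)
    z = (best - mu) / np.maximum(sd, 1e-9)
    return (best - mu) * norm.cdf(z) + sd * norm.pdf(z)

def constant_liar_batch(X, y, cand, k):
    Xb, yb = X.copy(), y.copy()
    batch = []
    for _ in range(k):
        gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-6, normalize_y=True).fit(Xb, yb)
        x_new = cand[np.argmax(expected_improvement(gp, cand, yb.min()))]
        batch.append(x_new)
        Xb = np.vstack([Xb, x_new])   # pretend the experiment already ran...
        yb = np.append(yb, yb.min())  # ...and "lie" that it matched the best value
    return np.array(batch)

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (5, 2))
y = np.sum((X - 0.5) ** 2, axis=1)    # toy objective values
print(constant_liar_batch(X, y, rng.uniform(0, 1, (200, 2)), k=3))
```

The fake observations spread successive EI maximizers apart, so the k experiments can run concurrently.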
|
5 |
Bayesian optimization for selecting training and validation data for supervised machine learning: using Gaussian processes both to learn the relationship between sets of training data and model performance, and to estimate model performance over the entire problem domain. Bergström, David. January 2019.
Validation and verification in machine learning is an open problem which becomes increasingly important as its applications become more critical. Among the applications are autonomous vehicles and medical diagnostics. These systems all need to be validated before being put into use, or else the consequences might be fatal. This master's thesis focuses on improving both learning and validating machine learning models in cases where data can either be generated or collected based on a chosen position. This can, for example, be taking and labeling photos at the position, or running some simulation which generates data from the chosen positions. The approach is twofold. The first part concerns modeling the relationship between any fixed-size set of positions and some real-valued performance measure. The second part involves calculating such a performance measure by estimating the performance over a region of positions. The result is two different algorithms, both variations of Bayesian optimization. The first algorithm models the relationship between a set of points and some performance measure while also optimizing the function, thus finding the set of points which yields the highest performance. The second algorithm uses Bayesian optimization to approximate the integral of performance over the region of interest. The resulting algorithms are validated in two different simulated environments. They are applicable not only to machine learning but to any function which takes a set of positions and returns a value, though they are most suitable when the function is expensive to evaluate.
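A minimal sketch of the region-integration idea behind the second algorithm, under the assumptions that the performance model is a fitted GP and the region of interest is the unit box; the data and performance values are invented stand-ins:

```python
# Estimate average model performance over a region by Monte Carlo integration
# of the GP posterior mean over uniformly sampled positions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (15, 2))                   # positions with measured performance
y = np.sin(4 * X[:, 0]) * np.cos(3 * X[:, 1])    # stand-in performance values

gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-6, normalize_y=True).fit(X, y)
samples = rng.uniform(0, 1, (5000, 2))           # uniform samples over the region
print("estimated average performance:", gp.predict(samples).mean())
```

Roughly speaking, the thesis's second algorithm goes further by choosing where to evaluate next so that an estimate like this one improves as quickly as possible.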
|
6 |
Bayesian Optimization and Semiparametric Models with Applications to Assistive Technology. Snoek, Jasper Roland. 14 January 2014.
Advances in machine learning are having a profound impact on disciplines spanning the sciences. Assistive technology and health informatics are fields for which minor improvements achieved through leveraging more advanced machine learning algorithms can translate to major real world impact. However, successful application of machine learning currently requires broad domain knowledge to determine which model is appropriate for a given task, and model specific expertise to configure a model to a problem of interest. A major motivation for this thesis was: How can we make machine learning more accessible to assistive technology and health informatics researchers? Naturally, a complementary goal is to make machine learning more accessible in general. Specifically, in this thesis we explore how to automate the role of a machine learning expert through automatically adapting models and adjusting parameters to a given task of interest. This thesis consists of a number of contributions towards solving this challenging open problem in machine learning and these are empirically validated on four real-world applications.
Through an interesting theoretical link between two seemingly disparate latent variable models, we create a hybrid model that allows one to flexibly interpolate over a parametric unsupervised neural network, a classification neural network and a non-parametric Gaussian process. We demonstrate empirically that this non-parametrically guided autoencoder allows one to learn a latent representation that is more useful for a given task of interest.
We establish methods for automatically configuring machine learning model hyperparameters using Bayesian optimization. We develop Bayesian methods for integrating over parameters, explore the use of different priors over functions, and develop methods to run experiments in parallel. We demonstrate empirically that these methods find better hyperparameters on recent benchmark problems spanning machine learning in significantly fewer experiments than the methods employed by the problems' authors. We further establish methods for incorporating parameter-dependent variable cost into the optimization procedure. These methods find better hyperparameters at lower cost, such as time, or within a bounded cost, such as before a deadline. Additionally, we develop a constrained Bayesian optimization variant and demonstrate its superiority over the standard procedure in the presence of unknown constraints.
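For reference, in the published form of this line of work (Snoek, Larochelle, and Adams, 2012) the cost-aware criterion is "expected improvement per second"; a hedged LaTeX summary of that idea, where c(x) denotes a model of the evaluation cost at x:

```latex
% Expected improvement per unit cost: favor candidates promising the most
% improvement per unit of modeled evaluation cost c(x).
\[
  \mathrm{EI}_{\mathrm{cost}}(x) = \frac{\mathrm{EI}(x)}{c(x)},
  \qquad
  \mathrm{EI}(x) = \mathbb{E}\left[\max\left(f^{*} - f(x),\, 0\right)\right],
\]
% with f^{*} the best objective value observed so far (minimization).
```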
|
7 |
Information Exploration and Exploitation for Machine Learning with Small Data. Hayashi, Shogo. 23 March 2021.
Kyoto University / New-system doctoral course / Doctor of Informatics / Degree No. Ko-23313 / Informatics Report No. 749 / 新制||情||128 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / Examiners: Prof. Hisashi Kashima (chair), Prof. Akihiro Yamamoto, Prof. Masatoshi Yoshikawa / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
|
8 |
Automated Machine Learning for Time Series Forecasting. Rosenberger, Daniel. 26 April 2022.
Time series forecasting has become a common problem in day-to-day applications, and various machine learning algorithms have been developed to tackle this task. Finding the model that performs the best forecasting on a given dataset can be time-consuming, as multiple algorithms and hyperparameter configurations must be examined to find the best model. This problem can be solved using automated machine learning, an approach that automates all steps required for developing a machine learning algorithm, including finding the best algorithm and hyperparameter configuration. This study develops an automated machine learning pipeline focused on finding the best forecasting model for a given dataset. This includes choosing different forecasting algorithms to cover a wide range of tasks and identifying the best method to find the best model among these algorithms. Lastly, the final pipeline is tested on a variety of datasets to evaluate its performance on time series data with different characteristics (see the TPE sketch after the outline below).
Abstract
List of Figures
List of Tables
List of Abbreviations
List of Symbols
1. Introduction
2. Theoretical Background
2.1. Machine Learning
2.2. Automated Machine Learning
2.3. Hyperparameter Optimization
2.3.1. Model-Free Methods
2.3.2. Bayesian Optimization
3. Time Series Forecasting Algorithms
3.1. Time Series Data
3.2. Baselines
3.2.1. Naive Forecast
3.2.2. Moving Average
3.3. Linear Regression
3.4. Autoregression
3.5. SARIMAX
3.6. XGBoost
3.7. LSTM Neural Network
4. Automated Machine Learning Pipeline
4.1. Data Preparation
4.2. Model Selection
4.3. Hyperparameter Optimization Method
4.3.1. Sequential Model-Based Algorithm Configuration
4.3.2. Tree-structured Parzen Estimator
4.3.3. Comparison of Bayesian Optimization Hyperparameter Optimization Methods
4.4. Pipeline Structure
5. Testing on external Datasets
5.1. Beijing PM2.5 Pollution
5.2. Perrin Freres Monthly Champagne Sales
6. Testing on internal Datasets
6.1. Deutsche Telekom Call Count
6.1.1. Comparison of Bayesian Optimization and Random Search
6.2. Deutsche Telekom Call Setup Time
7. Conclusion
Bibliography
A. Details Search Space
B. Pipeline Results - Predictions
C. Pipeline Results - Configurations
D. Pipeline Results - Experiment Details
E. Deutsche Telekom Data Usage Permissions
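The outline above lists the Tree-structured Parzen Estimator (Section 4.3.2); below is a toy sketch of its core selection criterion as published by Bergstra et al. (2011), not this thesis's implementation. The trial data, the quantile gamma, and the search range are invented:

```python
# Tree-structured Parzen Estimator (TPE) in miniature: split past trials into
# "good" and "bad" by a loss quantile, fit a density to each group, and pick
# the candidate maximizing the ratio l(x) / g(x).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
trials = rng.uniform(0, 10, 30)                        # hyperparameter values tried
losses = (trials - 3.0) ** 2 + rng.normal(0, 1, 30)    # observed losses

gamma = 0.25                                           # fraction treated as "good"
cut = np.quantile(losses, gamma)
good, bad = trials[losses <= cut], trials[losses > cut]

l, g = gaussian_kde(good), gaussian_kde(bad)
cand = rng.uniform(0, 10, 500)
best = cand[np.argmax(l(cand) / np.maximum(g(cand), 1e-12))]
print("next hyperparameter to try:", best)
```

Real TPE draws candidates from the "good" density l rather than uniformly, but the ratio l(x)/g(x) is the heart of the method.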
|
9 |
Bayesian Optimization for Engineering Design and Quality Control of Manufacturing Systems. AlBahar, Areej Ahmad. 14 April 2022.
Manufacturing systems are usually nonlinear, nonstationary, highly corrupted with outliers, and oftentimes constrained by physical laws. Modeling and approximation of their underlying response surface functions are extremely challenging. Bayesian optimization is a powerful statistical tool, based on Bayes' rule, used to optimize and model these expensive-to-evaluate functions. Bayesian optimization comprises two important components: a surrogate model, often a Gaussian process, and an acquisition function, often expected improvement. The Gaussian process, known for its outstanding modeling and uncertainty quantification capabilities, is used to represent the underlying response surface function, while the expected improvement is used to select the next point to be evaluated by trading off exploitation and exploration.
Although Bayesian optimization has been extensively used in optimizing unknown and expensive-to-evaluate functions and in hyperparameter tuning of deep learning models, modeling highly outlier-corrupted, nonstationary, and stress-induced response surface functions hinders the use of conventional Bayesian optimization models in manufacturing systems. To overcome these limitations, we propose a series of systematic methodologies to improve Bayesian optimization for engineering design and quality control of manufacturing systems. Specifically, the contributions of this dissertation can be summarized as follows.
1. A novel asymmetric robust kernel function, called AEN-RBF, is proposed to model highly outlier-corrupted functions. Two new hyperparameters are introduced to improve the flexibility and robustness of the Gaussian process model.
2. A nonstationary surrogate model that utilizes deep multi-layer Gaussian processes, called MGP-CBO, is developed to improve the modeling of complex anisotropic constrained nonstationary functions.
3. A Stress-Aware Optimal Actuator Placement framework is designed to model and optimize stress-induced nonlinear constrained functions.
Through extensive evaluations, the proposed methodologies have shown significant improvements over state-of-the-art models. Although these proposed methodologies have been applied to certain manufacturing systems, they can be easily adapted to other broad ranges of problems. / Doctor of Philosophy / Modeling advanced manufacturing systems, such as engineering design and quality monitoring and control, is extremely challenging. The underlying response surface functions of these manufacturing systems are often nonlinear, nonstationary, and expensive-to-evaluate. Bayesian optimization, a statistical modeling approach based on Bayes' rule, is used to represent and model those complex (i.e., black-box) objective functions. A Bayesian optimization model consists of a surrogate model, often a Gaussian process, and an acquisition function, often expected improvement. Conventional Bayesian optimization models do not accurately represent nonstationary and outlier-corrupted functions. To overcome these limitations, we propose a new asymmetric robust kernel function to improve the modeling capabilities of the Gaussian process model in process quality control through improved defect detection and classification. We also propose a nonstationary surrogate model to improve the performance of Bayesian optimization in aerospace process design problems. Finally, we develop a new optimization framework that correctly models and optimizes stress-induced constrained aerospace manufacturing systems. Our extensive experiments show significant improvements from these three proposed models when compared to state-of-the-art methodologies.
|