1

Explanation and Downscalability of Google's Dependency Parser Parsey McParseface

Endreß, Hannes 10 January 2023
Using the data collected during hyperparameter tuning of Google's dependency parser Parsey McParseface, feedforward neural networks and the correlations between their hyperparameters during training are explained and analysed in depth. Table of contents:

1 Introduction to Neural Networks
 1.1 History of AI
 1.2 The Role of Neural Networks in AI Research
  1.2.1 Artificial Intelligence
  1.2.2 Machine Learning
  1.2.3 Neural Network
 1.3 Structure of Neural Networks
  1.3.1 Biology Analogy of Artificial Neural Networks
  1.3.2 Architecture of Artificial Neural Networks
  1.3.3 Biological Model of Nodes – Neurons
  1.3.4 Structure of Artificial Neurons
 1.4 Training a Neural Network
  1.4.1 Data
  1.4.2 Hyperparameters
  1.4.3 Training Process
  1.4.4 Overfitting
2 Natural Language Processing (NLP)
 2.1 Data Preparation
  2.1.1 Text Preprocessing
  2.1.2 Part-of-Speech Tagging
 2.2 Dependency Parsing
  2.2.1 Dependency Grammar
  2.2.2 Dependency Parsing: Rule-Based & Data-Driven Approaches
  2.2.3 Syntactic Parser
 2.3 Parsey McParseface
  2.3.1 SyntaxNet
  2.3.2 Corpus
  2.3.3 Architecture
  2.3.4 Improvements to the Feed-Forward Neural Network
3 Training of Parsey's Cousins
 3.1 Training a Model
  3.1.1 Building the Framework
  3.1.2 Corpus
  3.1.3 Training Process
  3.1.4 Settings for the Training
 3.2 Results and Analysis
  3.2.1 Results from Google's Models
  3.2.2 Effect of Hyperparameters
4 Conclusion
5 Bibliography
6 Appendix
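
To make concrete which knobs such a tuning study turns, here is a minimal sketch of a feedforward classifier whose main hyperparameters are set explicitly; scikit-learn's MLPClassifier and the digits dataset are stand-ins for illustration, not the thesis's SyntaxNet setup.

```python
# Minimal sketch: the hyperparameters a feedforward-network tuning study
# typically varies, shown on a small stand-in task (not the thesis code).
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
clf = MLPClassifier(
    hidden_layer_sizes=(128, 64),   # architecture: layer widths
    learning_rate_init=1e-3,        # step size of the optimizer
    batch_size=64,                  # minibatch size
    alpha=1e-4,                     # L2 regularization strength
    max_iter=200,                   # training epochs (upper bound)
).fit(X, y)
print(clf.score(X, y))
```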
2

On Bayesian optimization and its application to hyperparameter tuning

Matosevic, Antonio January 2018
This thesis introduces the concept of Bayesian optimization, primarily used in optimizing costly black-box functions. Besides a theoretical treatment of the topic, the focus of the thesis is on two numerical experiments. Firstly, different types of acquisition functions, the key components responsible for performance, are tested and compared. Special emphasis is placed on the analysis of the so-called exploration-exploitation trade-off. Secondly, one of the most recent applications of Bayesian optimization concerns hyperparameter tuning in machine learning algorithms, where the objective function is expensive to evaluate and not given analytically. However, some results indicate that much simpler methods can give similar results. Our contribution is therefore a statistical comparison of simple random search and Bayesian optimization in the context of finding the optimal set of hyperparameters in support vector regression. No significant difference in the performance of these two methods was found.
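
As an illustration of the acquisition-function machinery the abstract describes, here is a minimal sketch of Bayesian optimization with an expected-improvement acquisition function on a 1-D toy objective; the objective, kernel choice and evaluation budget are assumptions for illustration, not the thesis's setup.

```python
# Minimal Bayesian-optimization loop with expected improvement (EI),
# maximizing a toy 1-D black-box function on [0, 2].
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Hypothetical expensive black-box function.
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X = rng.uniform(0, 2, size=(3, 1))              # initial design points
y = objective(X).ravel()
candidates = np.linspace(0, 2, 500).reshape(-1, 1)

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.max()
    # EI balances exploitation (high mu) against exploration (high sigma),
    # the trade-off the thesis analyses.
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, [x_next]])
    y = np.append(y, objective(x_next))

print("best x:", X[np.argmax(y)], "best y:", y.max())
```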
3

Verktyg för hyperparameteroptimering / Hyperparameter Tuning Tools

Lundberg, Patrick January 2021
Hyperparameter optimization is an important task for using a machine learning model effectively. Performing it manually can be time-consuming, with no guarantee that the resulting hyperparameters are of good quality. Using tools for this purpose is preferable, but there is a large number of tools employing different algorithms, and how effective these tools are relative to one another is a less explored area. This study contributes a simple analysis of how two hyperparameter search tools, Scikit and Ray Tune, perform in comparison with each other.
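
For context, a minimal sketch of the scikit-learn side of such a comparison follows: randomized hyperparameter search with cross-validation. The estimator, parameter ranges and budget are arbitrary assumptions; the corresponding Ray Tune driver would wrap the same training routine in its own search API and is omitted here.

```python
# Minimal sketch: random search over SVC hyperparameters with
# cross-validation, as one side of a Scikit vs. Ray Tune comparison.
from scipy.stats import loguniform
from sklearn.datasets import load_digits
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
search = RandomizedSearchCV(
    SVC(),
    param_distributions={"C": loguniform(1e-2, 1e3),
                         "gamma": loguniform(1e-4, 1e-1)},
    n_iter=20,      # budget: 20 sampled configurations
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```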
4

A Research on Automatic Hyperparameter Recommendation via Meta-Learning

Deng, Liping 01 May 2023
The performance of classification algorithms is mainly governed by the hyperparameter configuration deployed. Traditional search-based algorithms tend to require extensive hyperparameter evaluations to select desirable configurations, and they are often very inefficient on large-scale tasks. In this dissertation, we solve the problem of hyperparameter selection via meta-learning, which provides a mechanism that automatically recommends promising configurations without any expensive evaluations. In this approach, a meta-learner is constructed on metadata extracted from historical classification problems, and it directly determines the success of recommendations. Designing good meta-learners that recommend effective hyperparameter configurations efficiently is therefore of practical importance. The dissertation is divided into six chapters: the first chapter presents the research background and related work, the second to fifth chapters detail our main work and contributions, and the sixth chapter concludes the dissertation and outlines possible future work. In the second and third chapters, we propose two (kernel) multivariate sparse-group Lasso (SGLasso) approaches for automatic meta-feature selection. Previously, meta-features were usually picked manually by researchers based on their preferences and experience, or by a wrapper method, which is either less effective or time-consuming. SGLasso, as an embedded feature selection model, can select the most effective meta-features during meta-learner training and thus guarantee the optimality of both the meta-features and the meta-learner, which are essential for successful recommendations. In the fourth chapter, we formulate hyperparameter recommendation as a problem of low-rank tensor completion. The hyperparameter search space has often been flattened into a one-dimensional vector, which removes the spatial structure of the search space and ignores the correlations between adjacent hyperparameters, characteristics that are crucial in meta-learning. Our contributions are to instantiate the search space of hyperparameters as a multi-dimensional tensor and to develop a novel kernel tensor completion algorithm for estimating the performance of hyperparameter configurations. In the fifth chapter, we propose to learn the latent features of the performance space via denoising autoencoders. Although the search space is usually high-dimensional, the performances of hyperparameter configurations are correlated with each other to a certain degree, and their main structure lies in a much lower-dimensional manifold that describes the performance distribution over the search space. Denoising autoencoders are applied to extract these latent features, on which two effective recommendation strategies are built. Extensive experiments verify the effectiveness of the proposed approaches, and the empirical outcomes show that they can recommend promising hyperparameters for real problems and significantly outperform state-of-the-art meta-learning-based methods as well as search algorithms such as random search, Bayesian optimization, and Hyperband.
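
The core recommendation mechanism can be sketched independently of the dissertation's specific models: describe each historical dataset by meta-features, then recommend the best-known configuration of the most similar one. The meta-features, distance measure and stored configurations below are illustrative assumptions; the SGLasso and tensor-completion methods themselves are not shown.

```python
# Minimal sketch of meta-learning-based hyperparameter recommendation:
# nearest-neighbour lookup over dataset meta-features.
import numpy as np

def meta_features(X, y):
    # Hypothetical meta-features: log #examples, #features, class balance.
    _, counts = np.unique(y, return_counts=True)
    return np.array([np.log(len(X)), X.shape[1], counts.min() / counts.max()])

# Historical metadata: (meta-feature vector, best-known config) pairs.
history = [
    (np.array([6.9, 20.0, 0.95]), {"C": 10.0, "gamma": 0.01}),
    (np.array([9.2, 64.0, 0.50]), {"C": 1.0, "gamma": 0.001}),
]

def recommend(X, y):
    mf = meta_features(X, y)
    dists = [np.linalg.norm(mf - h_mf) for h_mf, _ in history]
    return history[int(np.argmin(dists))][1]

# Usage on a new (synthetic) task:
X_new = np.zeros((1000, 20))
y_new = np.array([0] * 500 + [1] * 500)
print(recommend(X_new, y_new))
```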
5

Evaluating the effects of hyperparameter optimization in VizDoom

Olsson, Markus, Malm, Simon, Witt, Kasper January 2022
Reinforcement learning is a machine learning technique in which an artificial intelligence agent is guided by positive and negative rewards to learn strategies. In addition to the reward, the agent's behavior is shaped by its hyperparameters, the values that control how the agent learns. These hyperparameters are rarely disclosed in contemporary research, making it hard to estimate the value of optimizing them. This study compares three popular reinforcement learning algorithms, Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C) and Deep Q-Network (DQN), and investigates the effects of optimizing several hyperparameters of each algorithm. All three algorithms performed significantly better after hyperparameter optimization. A2C showed the largest performance increase, and PPO performed best of the three algorithms both with default and with optimized hyperparameters.
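
A minimal sketch of this kind of sweep, assuming the stable-baselines3 implementations of the algorithms and substituting a lightweight CartPole task for VizDoom; the hyperparameter grid and budget are illustrative assumptions, not the study's protocol.

```python
# Minimal sketch: grid sweep over two PPO hyperparameters, scored by
# mean evaluation reward (CartPole stands in for VizDoom).
import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

results = {}
for lr in (3e-4, 1e-3):            # learning rate varied, others default
    for gamma in (0.95, 0.99):     # discount factor varied
        model = PPO("MlpPolicy", gym.make("CartPole-v1"),
                    learning_rate=lr, gamma=gamma, verbose=0)
        model.learn(total_timesteps=20_000)
        mean_reward, _ = evaluate_policy(model, gym.make("CartPole-v1"),
                                         n_eval_episodes=10)
        results[(lr, gamma)] = mean_reward

print("best (lr, gamma):", max(results, key=results.get))
```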
6

Learning Hyperparameters for Inverse Problems by Deep Neural Networks

McDonald, Ashlyn Grace 08 May 2023
Inverse problems arise in a wide variety of applications including biomedicine, environmental sciences, astronomy, and more. Computing reliable solutions to these problems requires the inclusion of prior knowledge in a process that is often referred to as regularization. Most regularization techniques require suitable choices of regularization parameters. In this work, we will describe new approaches that use deep neural networks (DNN) to estimate these regularization parameters. We will train multiple networks to approximate mappings from observation data to individual regularization parameters in a supervised learning approach. Once the networks are trained, we can efficiently compute regularization parameters for newly-obtained data by forward propagation through the DNNs. The network-obtained regularization parameters can be computed more efficiently and may even lead to more accurate solutions compared to existing regularization parameter selection methods. Numerical results for tomography demonstrate the potential benefits of using DNNs to learn regularization parameters. / Master of Science / Inverse problems arise in a wide variety of applications including biomedicine, environmental sciences, astronomy, and more. With these types of problems, the goal is to reconstruct an approximation of the original input when we can only observe the output. However, the output often includes some sort of noise or error, which means that computing reliable solutions to these problems is difficult. In order to combat this problem, we can include prior knowledge about the solution in a process that is often referred to as regularization. Most regularization techniques require suitable choices of regularization parameters. In this work, we will describe new approaches that use deep neural networks (DNN) to obtain these parameters. We will train multiple networks to approximate mappings from observation data to individual regularization parameters in a supervised learning approach. Once the networks are trained, we can efficiently compute regularization parameters for newly-obtained data by forward propagation through the DNNs. The network-obtained regularization parameters can be computed more efficiently and may even lead to more accurate solutions compared to existing regularization parameter selection methods. Numerical results for tomography demonstrate the potential of using DNNs to learn regularization parameters.
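
A minimal sketch of the supervised setup, assuming Tikhonov regularization as the regularization technique, a toy forward operator in place of the tomography problem, and a small scikit-learn network instead of the thesis's DNNs.

```python
# Minimal sketch: learn the map from observations b to a good Tikhonov
# parameter lam, where x(lam) = argmin ||Ax - b||^2 + lam^2 ||x||^2.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 30
A = np.tril(np.ones((n, n))) / n   # toy ill-conditioned forward operator

def tikhonov(b, lam):
    return np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)

# Training pairs: noisy observation b and the lam minimizing true error.
lams = np.logspace(-3, 0, 30)
B, best_lams = [], []
for _ in range(200):
    x_true = rng.standard_normal(n)
    b = A @ x_true + 0.01 * rng.standard_normal(n)
    errs = [np.linalg.norm(tikhonov(b, lam) - x_true) for lam in lams]
    B.append(b)
    best_lams.append(lams[int(np.argmin(errs))])

net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
net.fit(np.array(B), np.log10(best_lams))   # predict log10(lam) from b

# At inference, one forward pass yields a regularization parameter.
lam_hat = 10 ** net.predict(np.array(B[:1]))[0]
print(lam_hat)
```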
7

Multimodal Affective Computing Using Temporal Convolutional Neural Network and Deep Convolutional Neural Networks

Ayoub, Issa 24 June 2019
Affective computing has gained significant attention from researchers in the last decade due to the wide variety of applications that can benefit from this technology. Researchers often describe affect using emotional dimensions such as arousal and valence: valence refers to the spectrum from negative to positive emotions, while arousal determines the level of excitement. Describing emotions through continuous dimensions allows us to encode subtle and complex affects, as opposed to discrete categories such as the six basic emotions: happiness, anger, fear, disgust, sadness and neutral. Recognizing spontaneous and subtle emotions remains a challenging problem for computers. In our work, we employ two modalities of information, video and audio, and extract visual and audio features using deep neural network models. Given that emotions are time-dependent, we apply the Temporal Convolutional Neural Network (TCN) to model the variations in emotions. Additionally, we investigate an alternative model that combines a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). Because the latter deep model does not fit into main memory, we divide the RNN into smaller segments and propose a scheme to back-propagate gradients across all segments. We configure the hyperparameters of all models using Gaussian processes to obtain a fair comparison between the proposed models. Our experimental results show that the TCN outperforms all RNN-based models for recognizing the arousal and valence emotional dimensions, yielding a concordance correlation coefficient of 0.7895 (vs. 0.7544) on valence and 0.8207 (vs. 0.7357) on arousal on the validation set of the SEWA dataset. We therefore propose the adoption of the TCN for emotion detection problems as a baseline method for future work.
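
For reference, the concordance correlation coefficient used to score the valence and arousal predictions can be computed directly from its standard definition; a minimal sketch with made-up inputs:

```python
# Concordance correlation coefficient (CCC):
# 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2)
import numpy as np

def ccc(y_true, y_pred):
    mx, my = y_true.mean(), y_pred.mean()
    vx, vy = y_true.var(), y_pred.var()
    cov = ((y_true - mx) * (y_pred - my)).mean()
    return 2 * cov / (vx + vy + (mx - my) ** 2)

y_true = np.array([0.1, 0.4, 0.3, -0.2])
y_pred = np.array([0.2, 0.5, 0.2, -0.1])
print(ccc(y_true, y_pred))
```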
8

Use of meta-learning for hyperparameter tuning of classification problems / Uso de meta-aprendizado para o ajuste de hiper-parâmetros em problemas de classificação

Mantovani, Rafael Gomes 17 May 2018
Machine learning solutions have been successfully used to solve many simple and complex problems. However, their development process still relies on human experts to perform tasks such as data preprocessing, feature engineering and model selection. As the complexity of these tasks increases, so does the demand for automated solutions, namely Automated Machine Learning (AutoML). Most algorithms employed in these systems have hyperparameters whose configuration may directly affect their predictive performance; hyperparameter tuning is therefore a recurring task in AutoML systems. This thesis investigated how to efficiently automate hyperparameter tuning by means of meta-learning. To this end, large-scale experiments were performed tuning the hyperparameters of different classification algorithms, and an enhanced experimental methodology was adopted throughout the thesis to explore and learn the hyperparameter profiles of different classification algorithms. The results showed that in many cases the default hyperparameter settings induced models on par with those obtained by tuning. Hence, a new meta-learning recommender system was proposed to identify, for each new dataset, when it is better to use default values and when to tune a classification algorithm. The proposed system generalizes several learning processes into a single modular framework and allows different algorithms to be plugged in. Furthermore, a descriptive analysis of model predictions is used to identify which data characteristics affect the necessity of tuning for each of the algorithms investigated. Experimental results also demonstrated that the proposed recommender system reduced the time spent on optimization without reducing the predictive performance of the induced models; depending on the target algorithm, it can statistically outperform the baselines. The significance of these results opens a number of new avenues for future work.
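
The tune-versus-default decision can be viewed as a binary meta-classification problem. Below is a minimal sketch under that framing; the meta-features, labels and random-forest meta-learner are illustrative assumptions, not the thesis's exact system.

```python
# Minimal sketch: predict whether tuning will beat defaults for a new
# dataset, from meta-features of historical experiments.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Rows: historical datasets. Columns: hypothetical meta-features
# (log #examples, #features, class imbalance ratio).
meta_X = np.array([[6.9, 20, 1.0],
                   [9.2, 64, 2.1],
                   [7.5,  8, 1.3]])
meta_y = np.array([0, 1, 0])   # 1 = tuning beat defaults on that dataset

recommender = RandomForestClassifier(random_state=0).fit(meta_X, meta_y)
new_dataset = np.array([[8.0, 32, 1.5]])
print("tune" if recommender.predict(new_dataset)[0] else "use defaults")
```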
9

A Reward-based Algorithm for Hyperparameter Optimization of Neural Networks / En Belöningsbaserad Algoritm för Hyperparameteroptimering av Neurala Nätverk

Larsson, Olov January 2020
Machine learning and its wide range of applications is becoming increasingly prevalent in both academia and industry. This thesis focuses on two machine learning methods: convolutional neural networks and reinforcement learning. Convolutional neural networks have seen great success in various classification and regression applications across a diverse range of fields, e.g. vision for self-driving cars or facial recognition. These networks are built on a set of trainable weights, optimized on data, and a set of hyperparameters, set by the designer of the network and kept constant; for the network to perform well, the hyperparameters have to be optimized separately. The goal of this thesis is to investigate the use of reinforcement learning as a method for optimizing hyperparameters in convolutional neural networks built for classification problems. The reinforcement learning methods used are tabular Q-learning and a new Q-learning-inspired algorithm called max-table. These algorithms have been tested with different exploration policies based on each hyperparameter value's covariance, precision or relevance to the performance metric. The algorithms were tested mostly on the CIFAR-10 and Fashion-MNIST datasets against a baseline set by random search. While the Q-learning algorithm was not able to perform better than random search, max-table performed better than random search 50% of the time on both datasets. Exploration policies based on covariance or relevance were shown to decrease the optimizers' performance, and no significant difference was found between an exploration policy based on precision and a uniformly distributed exploration policy.
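
A minimal sketch of tabular Q-learning over a discrete hyperparameter grid with an epsilon-greedy exploration policy follows; the single-state (bandit-style) update, the grid and the stubbed reward are simplifying assumptions for illustration, not the thesis's max-table algorithm.

```python
# Minimal sketch: epsilon-greedy tabular Q-learning over a hyperparameter
# grid, where the reward would be a trained CNN's validation accuracy.
import numpy as np

rng = np.random.default_rng(0)
learning_rates = [1e-4, 1e-3, 1e-2]
batch_sizes = [32, 64, 128]
actions = [(lr, bs) for lr in learning_rates for bs in batch_sizes]
Q = np.zeros(len(actions))
alpha, eps = 0.1, 0.2   # learning rate and exploration rate of the optimizer

def evaluate(lr, bs):
    # Stand-in for training a CNN and returning validation accuracy.
    return (0.9 - abs(np.log10(lr) + 3) * 0.05
            - abs(bs - 64) / 2000 + rng.normal(0, 0.01))

for _ in range(100):
    a = rng.integers(len(actions)) if rng.random() < eps else int(np.argmax(Q))
    reward = evaluate(*actions[a])
    Q[a] += alpha * (reward - Q[a])   # single-state (bandit) Q-update

print("best config:", actions[int(np.argmax(Q))])
```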
