  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Feed forward neural network entities

Hadjiprocopis, Andreas January 2000 (has links)
No description available.
2

Automatic step-size adaptation in incremental supervised learning

Mahmood, Ashique Unknown Date
No description available.
3

Optimisation for non-linear channel equalisation

Sweeney, Fergal Jon January 1999 (has links)
No description available.
4

Automatic step-size adaptation in incremental supervised learning

Mahmood, Ashique 11 1900 (has links)
The performance and stability of many iterative algorithms, such as stochastic gradient descent, depend largely on a fixed, scalar step-size parameter. Using a fixed, scalar step-size value can limit performance in many problems. We study several existing step-size adaptation algorithms on nonstationary, supervised learning problems using simulated and real-world data. We find that the effectiveness of the existing step-size adaptation algorithms requires tuning a meta-parameter across problems. We introduce a new algorithm, Autostep, by combining several new techniques with an existing algorithm, and demonstrate that it can effectively adapt a vector step-size parameter on all of our training and test problems without tuning its meta-parameter across them. Autostep is the first step-size adaptation algorithm that can be used on widely different problems with the same setting of all of its parameters.
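
The abstract does not reproduce the update equations, but the general idea of a per-weight (vector) step-size can be illustrated with IDBD, the classic meta-gradient rule that Autostep builds on. The sketch below is an illustration of that family for linear prediction with squared error, not the thesis's exact Autostep normalization; the variable names and constants are ours.

```python
import numpy as np

def idbd_update(w, beta, h, x, target, theta=0.01):
    """One IDBD step: adapt a per-weight (vector) step-size for linear
    least-mean-squares learning. Autostep adds a normalization on top of
    this rule so the meta step-size theta needs no per-problem tuning
    (see the thesis for those details)."""
    delta = target - w @ x                 # prediction error
    beta += theta * delta * x * h          # meta-gradient step on log step-sizes
    alpha = np.exp(beta)                   # per-weight step-sizes (always positive)
    w = w + alpha * delta * x              # LMS step with a vector step-size
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x  # gradient trace
    return w, beta, h

# Tiny usage example on a drifting linear target (a nonstationary problem).
rng = np.random.default_rng(0)
d = 10
w_true = rng.normal(size=d)
w, beta, h = np.zeros(d), np.full(d, np.log(0.05)), np.zeros(d)
for t in range(5000):
    if t % 1000 == 0:
        w_true[rng.integers(d)] *= -1.0    # occasional drift in the target weights
    x = rng.normal(size=d)
    y = w_true @ x + 0.1 * rng.normal()
    w, beta, h = idbd_update(w, beta, h, x, y)
```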
5

Stochastic, distributed and federated optimization for machine learning

Konečný, Jakub January 2017 (has links)
We study optimization algorithms for the finite-sum problems frequently arising in machine learning applications. First, we propose novel variants of stochastic gradient descent with a variance-reduction property that enables linear convergence for strongly convex objectives. Second, we study the distributed setting, in which the data describing the optimization problem does not fit into a single computing node. In this case, traditional methods are inefficient, as the communication costs inherent in distributed optimization become the bottleneck. We propose a communication-efficient framework which iteratively forms local subproblems that can be solved with arbitrary local optimization algorithms. Finally, we introduce the concept of Federated Optimization/Learning, where we try to solve machine learning problems without having the data stored in any centralized manner. The main motivation comes from industry practice in handling user-generated data: companies currently collect vast amounts of user data and store them in datacenters. The alternative we propose is not to collect the data in the first place, and instead to occasionally use the computational power of users' devices to solve the very same optimization problems, while alleviating privacy concerns at the same time. In such a setting, minimizing the number of communication rounds is the primary goal, and we demonstrate that solving the optimization problems in these circumstances is conceptually tractable.
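
For context, the "variance reduction" mentioned above can be illustrated with the standard SVRG-style update, in which a periodically recomputed full gradient serves as a control variate for the stochastic gradient. The sketch below is a generic illustration of that idea on a least-squares finite sum, not the specific variants proposed in the thesis; all names and constants are ours.

```python
import numpy as np

def svrg_least_squares(A, b, step=0.01, epochs=20, inner=None, seed=0):
    """Variance-reduced SGD (SVRG-style) for min_w (1/n) * sum_i 0.5*(a_i.w - b_i)^2.
    A periodically recomputed full gradient acts as a control variate, so the
    per-step stochastic gradient has vanishing variance near the optimum."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    inner = inner or n
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = A.T @ (A @ w_snap - b) / n      # full gradient at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            gi = A[i] * (A[i] @ w - b[i])           # stochastic gradient at w
            gi_snap = A[i] * (A[i] @ w_snap - b[i]) # same component at the snapshot
            w = w - step * (gi - gi_snap + full_grad)
    return w

# Usage on a small synthetic problem.
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 10))
b = A @ rng.normal(size=10) + 0.01 * rng.normal(size=200)
w_hat = svrg_least_squares(A, b)
```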
6

Projected Wirtinger gradient descent for spectral compressed sensing

Liu, Suhui 01 August 2017 (has links)
In modern data and signal acquisition, one main challenge arises from the growing scale of data. The data acquisition devices, however, are often limited by physical and hardware constraints, precluding sampling with the desired rate and precision. It is thus of great interest to reduce the sensing complexity while retaining recovery resolution, which is why we are interested in reconstructing a signal from a small number of randomly observed time-domain samples. The main contributions of this thesis are as follows. First, we consider reconstructing a one-dimensional (1-D) spectrally sparse signal from a small number of randomly observed time-domain samples. The signal of interest is a linear combination of complex sinusoids at R distinct frequencies. The frequencies can assume any continuous values in the normalized frequency domain [0, 1). After converting the recovery of the spectrally sparse signal into a low-rank Hankel-structured matrix completion problem, we propose an efficient feasible-point approach, the projected Wirtinger gradient descent (PWGD) algorithm, to solve this structured matrix completion problem. We give the convergence analysis of our proposed algorithms. We then apply this algorithm to a different formulation of structured matrix recovery: the Hankel and Toeplitz mosaic structured matrix. The algorithms provide better recovery performance and faster signal recovery than existing algorithms, including atomic norm minimization (ANM) and Enhanced Matrix Completion (EMaC). We further accelerate our proposed algorithm with a scheme inspired by FISTA. Extensive numerical experiments are provided to illustrate the efficiency of our proposed algorithms. Unlike earlier approaches, our algorithm can efficiently solve problems of very large dimension. Moreover, we extend our algorithms to signal recovery from noisy samples. Finally, we reconstruct a two-dimensional (2-D) spectrally sparse signal from a small number of randomly observed time-domain samples, extending our algorithms to high-dimensional signal recovery from noisy samples and multivariate frequencies.
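
As background for the Hankel-structured formulation above, the sketch below shows a simple alternating-projection (Cadzow-style) baseline for recovering a spectrally sparse signal from a few observed samples: lift the signal to a Hankel matrix, project onto rank R, project back onto Hankel structure, and re-impose the observed samples. This is only an illustration of the problem setup, not the thesis's PWGD algorithm (which works with Wirtinger gradients on a factored parameterization); the names and constants are ours.

```python
import numpy as np

def hankel_lift(x, p):
    """Arrange a length-n signal as a p x (n-p+1) Hankel matrix H[i, j] = x[i+j]."""
    idx = np.arange(p)[:, None] + np.arange(len(x) - p + 1)[None, :]
    return x[idx]

def hankel_unlift(H, n):
    """Average each anti-diagonal of H back into a length-n signal."""
    p, q = H.shape
    idx = np.arange(p)[:, None] + np.arange(q)[None, :]
    sums = np.zeros(n, dtype=complex)
    counts = np.zeros(n)
    np.add.at(sums, idx, H)
    np.add.at(counts, idx, 1.0)
    return sums / counts

def hankel_completion(y_obs, omega, n, rank, iters=200):
    """Alternate low-rank projection, Hankel projection, and data consistency."""
    x = np.zeros(n, dtype=complex)
    x[omega] = y_obs
    p = n // 2 + 1
    for _ in range(iters):
        H = hankel_lift(x, p)
        U, s, Vh = np.linalg.svd(H, full_matrices=False)
        H_r = (U[:, :rank] * s[:rank]) @ Vh[:rank]   # best rank-R approximation
        x = hankel_unlift(H_r, n)
        x[omega] = y_obs                             # keep the observed samples
    return x

# Usage: R = 3 complex sinusoids, observing 40 of 128 samples at random.
rng = np.random.default_rng(0)
n, R = 128, 3
freqs = rng.uniform(0, 1, R)
t = np.arange(n)
x_true = sum(np.exp(2j * np.pi * f * t) for f in freqs)
omega = rng.choice(n, size=40, replace=False)
x_hat = hankel_completion(x_true[omega], omega, n, rank=R)
```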
7

Multiple Kernel Learning with Many Kernels

Afkanpour, Arash Unknown Date
No description available.
8

Multiple-gradient algorithm for multiobjective optimization in high-fidelity simulation: application to compressible aerodynamics

Zerbinati, Adrien 24 May 2013 (has links)
In multiobjective optimization, knowledge of the Pareto front and Pareto set is essential for solving a problem. Many evolutionary strategies have been proposed in the classical literature, and they have proven effective at identifying the Pareto front. To achieve such a result, however, these algorithms require a large number of evaluations. In engineering, numerical simulations are generally carried out with high-fidelity models, so each evaluation demands substantial computation time. As in single-objective algorithms, the gradients of the criteria, as well as higher-order derivatives, provide useful information about the decrease of the functions, and many numerical methods can supply these values at moderate cost. Building on theoretical results obtained at Inria, we propose an algorithm based on the use of descent gradients. This work summarizes the theoretical characterization of the method and its validation on analytical test cases. When gradients are not available, we propose a strategy based on the construction of Kriging metamodels: during the optimization, the criteria are evaluated on a response surface rather than by simulation. The computation time is considerably reduced, at the expense of accuracy, and the method is therefore coupled with a metamodel-improvement strategy. / In multiobjective optimization, the knowledge of the Pareto set provides valuable information on the reachable optimal performance. A number of evolutionary strategies have been proposed in the literature and proved successful at identifying the Pareto set. However, these derivative-free algorithms are very demanding in computational time. Today, in many areas of computational sciences, codes are developed that include the calculation of the gradient, cautiously validated and calibrated. Thus, an alternative method, applicable when the gradients are known, is introduced here. Using a clever combination of the gradients, a descent direction common to all criteria is identified. As a natural outcome, the Multiple Gradient Descent Algorithm (MGDA) is defined as a generalization of the steepest-descent method and compared with PAES in numerical experiments. Using MGDA on a multiobjective optimization problem requires the evaluation of a large number of points with regard to the criteria, and their gradients. In the particular case of CFD problems, each point evaluation is very costly. Thus, we also propose to construct metamodels and to calculate approximate gradients by local finite differences.
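
To make the "descent direction common to all criteria" concrete, the sketch below computes, for two objectives, the minimum-norm element of the convex hull of the two gradients, which is the core construction behind MGDA (for m objectives this becomes a small quadratic program). It is an illustrative sketch on toy quadratic criteria, not the thesis's high-fidelity CFD setting; the names and constants are ours.

```python
import numpy as np

def common_descent_direction(g1, g2):
    """Minimum-norm element of the convex hull of {g1, g2}.
    If nonzero, its negative is a descent direction for both objectives."""
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:
        return g1                      # identical gradients: either one works
    t = np.clip((g2 @ diff) / denom, 0.0, 1.0)
    return t * g1 + (1.0 - t) * g2

# Toy bi-objective problem: two quadratics with different minimizers.
c1, c2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
f1 = lambda x: 0.5 * np.sum((x - c1) ** 2)
f2 = lambda x: 0.5 * np.sum((x - c2) ** 2)
grad1 = lambda x: x - c1
grad2 = lambda x: x - c2

x = np.array([2.0, 2.0])
for _ in range(100):
    d = common_descent_direction(grad1(x), grad2(x))
    if np.linalg.norm(d) < 1e-8:       # Pareto-stationary point reached
        break
    x = x - 0.5 * d                    # both f1 and f2 decrease along -d
print(x, f1(x), f2(x))
```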
9

Retrospective Approximation for Smooth Stochastic Optimization

David T Newton (15369535) 30 April 2023 (has links)
Stochastic Gradient Descent (SGD) is a widely used iterative algorithm for solving stochastic optimization problems with a smooth (and possibly non-convex) objective function via queries to a first-order stochastic oracle. In this dissertation, we critically examine SGD's choice of executing a single step, as opposed to multiple steps, between subsample updates. Our investigation leads naturally to generalizing SGD into Retrospective Approximation (RA) where, during each iteration, a deterministic solver executes possibly multiple steps on a subsampled deterministic problem and stops when further solving is deemed unnecessary from the standpoint of statistical efficiency. RA thus leverages what is appealing for implementation: during each iteration, a solver, e.g., L-BFGS with backtracking line search, is used as is, and the subsampled objective function is solved only to the extent necessary. We develop a complete theory using the relative error of the observed gradients as the principal object, demonstrating that almost-sure and L1 consistency of RA are preserved under especially weak conditions when sample sizes are increased at appropriate rates. We also characterize the iteration and oracle complexity (for linear and sub-linear solvers) of RA, and identify two practical termination criteria, one of which we show leads to optimal complexity rates. The message from extensive numerical experiments is that the ability of RA to incorporate existing second-order deterministic solvers in a strategic manner is useful both in terms of the algorithmic trajectory and in dispensing with hyper-parameter tuning.
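
The retrospective-approximation pattern described above — solve a subsampled deterministic problem to increasing accuracy, warm-starting each outer iteration from the previous solution — can be sketched as below. This is a generic illustration using a geometric sample-size schedule and L-BFGS as the deterministic solver, not the dissertation's precise sample-size or termination rules; the names, schedule, and tolerances are ours.

```python
import numpy as np
from scipy.optimize import minimize

def retrospective_approximation(grad_fn, loss_fn, data, x0,
                                n0=32, growth=2.0, outer_iters=8):
    """Outer loop: draw a larger subsample each iteration, solve the resulting
    deterministic problem with L-BFGS to a correspondingly tighter tolerance,
    and warm-start from the previous solution."""
    rng = np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    n_total = len(data)
    m = n0
    for _ in range(outer_iters):
        idx = rng.choice(n_total, size=min(int(m), n_total), replace=False)
        batch = data[idx]
        res = minimize(loss_fn, x, args=(batch,), jac=grad_fn,
                       method="L-BFGS-B",
                       options={"gtol": 1.0 / np.sqrt(m)})  # looser early, tighter later
        x = res.x
        m *= growth
    return x

# Usage: logistic regression on synthetic data (each row is [features..., label]).
def loss_fn(w, batch):
    X, y = batch[:, :-1], batch[:, -1]
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def grad_fn(w, batch):
    X, y = batch[:, :-1], batch[:, -1]
    s = -y / (1.0 + np.exp(y * (X @ w)))
    return X.T @ s / len(y)

rng = np.random.default_rng(1)
X = rng.normal(size=(4000, 5))
w_true = rng.normal(size=5)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=4000))
data = np.hstack([X, y[:, None]])
w_hat = retrospective_approximation(grad_fn, loss_fn, data, np.zeros(5))
```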
10

Hyperparameters for Dense Neural Networks

Hettinger, Christopher James 01 July 2019 (has links)
Neural networks can perform an incredible array of complex tasks, but successfully training a network is difficult because it requires us to minimize a function about which we know very little. In practice, developing a good model requires both intuition and a lot of guess-and-check. In this dissertation, we study a type of fully-connected neural network that improves on standard rectifier networks while retaining their useful properties. We then examine this type of network and its loss function from a probabilistic perspective. This analysis leads to a new rule for parameter initialization and a new method for predicting effective learning rates for gradient descent. Experiments confirm that the theory behind these developments translates well into practice.
