Optimal Learning Rates for Neural Networks

Neural networks have long been known as universal function approximators and have more recently been shown to be powerful and versatile in practice. But it can be extremely challenging to find the right set of parameters and hyperparameters. Model training is both expensive and difficult due to the large number of parameters and sensitivity to hyperparameters such as learning rate and architecture. Hyperparameter searches are notorious for requiring tremendous amounts of processing power and human resources. This thesis provides an analytic approach to estimating the optimal value of one of the key hyperparameters in neural networks, the learning rate. Where possible, the analysis is computed exactly, and where necessary, approximations and assumptions are used and justified. The result is a method that estimates the optimal learning rate for a certain type of network, a fully connected CReLU network.

neural network

learning rate

crelu

Physical Sciences and Mathematics

Identifier	oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-9662
Date	30 July 2020
Creators	Moncur, Tyler
Publisher	BYU ScholarsArchive
Source Sets	Brigham Young University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations
Rights	https://lib.byu.edu/about/copyright/
