
Hyperparameters for Dense Neural Networks

Neural networks can perform an incredible array of complex tasks, but successfully training a network is difficult because it requires us to minimize a function about which we know very little. In practice, developing a good model requires both intuition and a lot of guess-and-check. In this dissertation, we study a type of fully-connected neural network that improves on standard rectifier networks while retaining their useful properties. We then examine this type of network and its loss function from a probabilistic perspective. This analysis leads to a new rule for parameter initialization and a new method for predicting effective learning rates for gradient descent. Experiments confirm that the theory behind these developments translates well into practice.
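As background for the initialization and learning-rate questions the abstract raises, the sketch below shows a standard baseline, not the dissertation's new rule: a dense rectifier (ReLU) network with He-style fan-in initialization, which keeps activation scale roughly constant across layers. The function names and layer sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_dense_relu(layer_sizes):
    """He-style initialization: W ~ N(0, 2 / fan_in), biases zero.
    This is the common baseline for rectifier networks, not the
    initialization rule proposed in the dissertation."""
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def forward(params, x):
    """Forward pass through a fully-connected rectifier network."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)   # ReLU hidden layers
    W, b = params[-1]
    return h @ W + b                      # linear output layer

if __name__ == "__main__":
    sizes = [64, 256, 256, 256, 10]       # hypothetical dense architecture
    params = init_dense_relu(sizes)
    x = rng.normal(size=(128, 64))        # a batch of standardized inputs
    out = forward(params, x)
    # With fan-in scaling the output scale stays O(1) instead of
    # exploding or vanishing with depth.
    print("output std:", out.std())
```

The point of the fan-in scaling is that each layer's pre-activations have roughly unit variance at initialization, which is the kind of probabilistic argument the dissertation builds on to derive its own initialization and learning-rate predictions.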

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-8531
Date: 01 July 2019
Creators: Hettinger, Christopher James
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: http://lib.byu.edu/about/copyright/