Neural networks can perform an incredible array of complex tasks, but training one successfully is difficult because it requires minimizing a function about which we know very little. In practice, developing a good model takes both intuition and a great deal of trial and error. In this dissertation, we study a type of fully-connected neural network that improves on standard rectifier networks while retaining their useful properties. We then examine this type of network and its loss function from a probabilistic perspective. This analysis leads to a new rule for parameter initialization and a new method for predicting effective learning rates for gradient descent. Experiments confirm that the theory behind these developments translates well into practice.
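The record gives only the abstract, so the dissertation's actual initialization rule is not reproduced here. As context for what such a rule addresses, the sketch below shows the standard variance-preserving (He) initialization for fully-connected rectifier networks, which initialization rules of this kind typically refine; the layer sizes and function names are illustrative, not taken from the dissertation.

    import numpy as np

    def he_init(fan_in, fan_out, rng=np.random.default_rng(0)):
        # Variance-preserving (He) initialization for rectifier layers:
        # drawing weights from N(0, 2 / fan_in) keeps the variance of the
        # activations roughly constant from layer to layer at the start
        # of training.
        return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

    # Parameters for a small fully-connected rectifier network
    # (illustrative layer sizes).
    layer_sizes = [784, 256, 256, 10]
    weights = [he_init(m, n) for m, n in zip(layer_sizes, layer_sizes[1:])]
    biases = [np.zeros(n) for n in layer_sizes[1:]]

    def forward(x, weights, biases):
        # Hidden layers use the rectifier; the output layer is left linear.
        for W, b in zip(weights[:-1], biases[:-1]):
            x = np.maximum(0.0, x @ W + b)
        return x @ weights[-1] + biases[-1]

Keeping the activation variance stable in this way is what makes deep rectifier networks trainable at all with plain gradient descent; the abstract's probabilistic analysis is aimed at deriving such initialization and learning-rate choices rather than finding them by trial and error.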
Identifier | oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-8531
Date | 01 July 2019
Creators | Hettinger, Christopher James
Publisher | BYU ScholarsArchive
Source Sets | Brigham Young University
Detected Language | English
Type | text
Format | application/pdf
Source | Theses and Dissertations
Rights | http://lib.byu.edu/about/copyright/