1

An Investigation of the Interactions of Gradient Coherence and Network Pruning in Neural Networks

Yauney, Zachary 29 April 2024
We investigate the coherent gradient hypothesis and show that the coherence measurements differ on real and random data regardless of the network's initialization. We introduce "diffs," an attempt at an element-wise approximation of coherence, and investigate their properties. We study how coherence is affected by increasing the width of simple fully-connected networks. We then prune those fully-connected networks and find that sparse networks outperform dense networks with the same number of nonzero parameters. In addition, we show that the performance of a sparse network can be increased by scaling up the dense parent network it is derived from. Finally, we apply our pruning methods to ResNet50 and ViT and find that diff-based pruning can be competitive with other methods.
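The abstract does not define "diffs" or the exact pruning procedure, so the following is only a minimal sketch of the two ideas it mentions: a per-example gradient coherence ratio and magnitude-based pruning. The toy network, batch size, coherence formula, and 80% sparsity target are illustrative assumptions, not details taken from the thesis.

```python
# Illustrative sketch only: a standard per-example gradient coherence ratio and
# global magnitude pruning, used as generic stand-ins for the thesis's methods.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Small fully-connected network on synthetic data (assumed shapes).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
x = torch.randn(32, 20)
y = torch.randint(0, 2, (32,))

# Per-example gradients, collected one example at a time for clarity.
per_example_grads = []
for i in range(x.size(0)):
    model.zero_grad()
    loss = F.cross_entropy(model(x[i:i + 1]), y[i:i + 1])
    loss.backward()
    g = torch.cat([p.grad.flatten() for p in model.parameters()])
    per_example_grads.append(g)
G = torch.stack(per_example_grads)  # (batch, n_params)

# Coherence ratio: ||mean gradient||^2 / mean ||g_i||^2.
# Close to 1 when per-example gradients point the same way,
# close to 1/batch when they are uncorrelated.
coherence = G.mean(0).pow(2).sum() / G.pow(2).sum(1).mean()
print(f"gradient coherence: {coherence:.4f}")

# Global magnitude pruning to 80% sparsity (one simple pruning variant).
weights = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
threshold = torch.quantile(weights, 0.8)
with torch.no_grad():
    for p in model.parameters():
        p.mul_((p.abs() >= threshold).float())

nonzero = sum((p != 0).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"nonzero parameters after pruning: {nonzero}/{total}")
```

The coherence ratio above is one common way to quantify gradient alignment; comparing it on real labels versus shuffled labels is the kind of real-vs-random contrast the abstract describes.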
2

Gradient Conditioning in Deep Neural Networks

Nelson, Michael Vernon 04 August 2022
When using Stochastic Gradient Descent (SGD) to train artificial neural networks, gradient variance comes from two sources: differences in the network's weights at the time each batch gradient is estimated, and differences between the input values in each batch. Some architectural traits, such as skip-connections and batch-normalization, allow much deeper networks to be trained by reducing each type of variance and improving the conditioning of the network gradient with respect to both the weights and the inputs. It is still unclear to what degree each property is responsible for these dramatic stability improvements when training deep networks. This thesis summarizes previous findings on gradient conditioning in each case, demonstrates efficient methods by which each can be measured independently, and investigates the contribution each makes to the stability and speed of SGD in various architectures as network depth increases.
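The abstract names two variance sources but does not describe its measurement protocol. The sketch below separates them in a crude, assumed way: (a) gradient variance across minibatches at fixed weights, and (b) gradient drift across SGD iterates on a fixed minibatch. The toy model, data, learning rate, and step counts are all assumptions for illustration.

```python
# Illustrative sketch only: a simple way to probe the two gradient-variance
# sources mentioned in the abstract (PyTorch, synthetic data).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
data = torch.randn(512, 20)
labels = torch.randint(0, 2, (512,))

def flat_grad(net, xb, yb):
    """Flattened gradient of the loss on one batch at the net's current weights."""
    net.zero_grad()
    F.cross_entropy(net(xb), yb).backward()
    return torch.cat([p.grad.flatten() for p in net.parameters()])

# (a) Input-driven variance: same weights, different minibatches.
grads = []
for _ in range(20):
    idx = torch.randint(0, 512, (32,))
    grads.append(flat_grad(model, data[idx], labels[idx]))
G = torch.stack(grads)
input_var = G.var(0).sum().item()

# (b) Weight-driven variance: same minibatch, weights drifting over SGD steps.
net = copy.deepcopy(model)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
idx = torch.randint(0, 512, (32,))
xb, yb = data[idx], labels[idx]
step_grads = []
for _ in range(20):
    step_grads.append(flat_grad(net, xb, yb))
    opt.step()  # gradients were just populated by flat_grad
G = torch.stack(step_grads)
weight_var = G.var(0).sum().item()

print(f"variance across batches (fixed weights): {input_var:.4e}")
print(f"variance across iterates (fixed batch):  {weight_var:.4e}")
```

Repeating the same measurements with and without skip-connections or batch-normalization layers would give a rough picture of how each trait affects the two variance sources, in the spirit of the comparison the abstract describes.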
