
An Investigation of the Interactions of Gradient Coherence and Network Pruning in Neural Networks

We investigate the coherent gradient hypothesis and show that coherence measurements differ on real and random data regardless of the network's initialization. We introduce "diffs," an element-wise approximation of coherence, and investigate their properties. We study how coherence is affected by increasing the width of simple fully-connected networks. We then prune those fully-connected networks and find that sparse networks outperform dense networks with the same number of nonzero parameters. In addition, we show that it is possible to increase the performance of a sparse network by scaling up the size of the dense parent network from which it is derived. Finally, we apply our pruning methods to ResNet50 and ViT and find that diff-based pruning can be competitive with other methods.
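A minimal sketch of the two ingredients the abstract names, not the thesis's own code: gradient coherence is commonly operationalized as the average pairwise cosine similarity of per-example gradients, and magnitude pruning is a standard baseline against which diff-based pruning would be compared. The exact definition of "diffs" is given in the thesis itself; the function names and the similarity-based coherence measure below are illustrative assumptions.

```python
import numpy as np

def coherence(per_example_grads: np.ndarray) -> float:
    """One common coherence proxy: mean pairwise cosine similarity across rows,
    where each row is one example's flattened gradient."""
    g = per_example_grads / (np.linalg.norm(per_example_grads, axis=1, keepdims=True) + 1e-12)
    n = g.shape[0]
    sims = g @ g.T                       # all pairwise cosine similarities
    off_diag = sims.sum() - np.trace(sims)
    return off_diag / (n * (n - 1))      # average over ordered pairs, excluding self-pairs

def magnitude_prune_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Standard magnitude-pruning baseline (not the thesis's diff-based method):
    zero out roughly the `sparsity` fraction of smallest-magnitude weights."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold   # True = keep, False = prune

# Aligned (coherent) gradients score near 1; independent random gradients score near 0,
# mirroring the real-vs-random contrast described in the abstract.
rng = np.random.default_rng(0)
coherent_grads = rng.normal(1.0, 0.1, size=(32, 1000))
random_grads = rng.normal(0.0, 1.0, size=(32, 1000))
print(coherence(coherent_grads), coherence(random_grads))
```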

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-11354
Date: 29 April 2024
Creators: Yauney, Zachary
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: https://lib.byu.edu/about/copyright/
