Open System Neural Networks

Recent advances in self-supervised learning have made it possible to reuse information-rich models, pre-trained on massive amounts of data, for a range of downstream tasks. However, the pre-training process can differ drastically from the fine-tuning process, which can lead to inefficient learning. We address this disconnect in training dynamics by structuring the learning process like an open system in thermodynamics. Open systems can achieve a steady state when low-entropy inputs are converted to high-entropy outputs. We modify the model and the learning process to mimic this behavior, attending more to elements of the input sequence that exhibit greater changes in entropy. We call this architecture the Open System Neural Network (OSNN). We demonstrate the efficacy of the OSNN on multiple classification datasets with a variety of encoder-only Transformers, and find that the OSNN outperforms nearly all model-specific baselines and achieves new state-of-the-art results on two classification datasets.
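The abstract only describes the mechanism at a high level. Below is a minimal, hypothetical PyTorch sketch of one way attention could be biased toward tokens with larger entropy changes; the entropy estimate (softmax-normalized hidden features), the additive bias on the attention logits, and the names `token_entropy` and `entropy_weighted_attention` are all assumptions for illustration, not the thesis's actual formulation.

```python
# Hypothetical sketch: attention biased toward tokens whose entropy
# changes most between a layer's input and output representations.
# This is NOT the OSNN's published method; it illustrates the idea only.
import torch
import torch.nn.functional as F

def token_entropy(x: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of each token's softmax-normalized feature vector.
    x: (batch, seq_len, dim) -> (batch, seq_len).
    (Assumption: entropy is estimated over normalized hidden features.)"""
    p = F.softmax(x, dim=-1)
    return -(p * torch.log(p + 1e-9)).sum(dim=-1)

def entropy_weighted_attention(q, k, v, h_in, h_out):
    """Scaled dot-product attention with an additive logit bias that
    favors key tokens exhibiting larger entropy changes.
    q, k, v, h_in, h_out: (batch, seq_len, dim)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5          # (batch, seq, seq)
    delta_h = (token_entropy(h_out) - token_entropy(h_in)).abs()  # (batch, seq)
    scores = scores + delta_h.unsqueeze(1)               # bias each key column
    return F.softmax(scores, dim=-1) @ v                 # (batch, seq, dim)
```

Under these assumptions, tokens whose representations gained or lost the most entropy across a layer receive proportionally more attention mass, mimicking the low-entropy-in, high-entropy-out exchange the abstract describes.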

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-11243
Date: 12 January 2024
Creators: Hatch, Bradley
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: https://lib.byu.edu/about/copyright/