
Open System Neural Networks

Hatch, Bradley 12 January 2024
Recent advances in self-supervised learning have made it possible to reuse information-rich models, pre-trained on massive amounts of general data, for other downstream tasks. However, the pre-training process can differ drastically from the fine-tuning process, which can lead to inefficient learning. We address this disconnect in training dynamics by structuring the learning process like an open system in thermodynamics. Open systems can achieve a steady state when low-entropy inputs are converted to high-entropy outputs. We modify the model and the learning process to mimic this behavior, and attend more to elements of the input sequence that exhibit greater changes in entropy. We call this architecture the Open System Neural Network (OSNN). We show the efficacy of the OSNN on multiple classification datasets with a variety of encoder-only Transformers. We find that the OSNN outperforms nearly all model-specific baselines and achieves a new state-of-the-art result on two classification datasets.
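
The abstract does not give the OSNN formulation, so the following is only a minimal sketch of one way to "attend more to elements of the input sequence that exhibit greater changes in entropy". It is not the thesis's method. It assumes each sequence position has a discrete probability distribution before and after some transformation, and it turns the per-position change in Shannon entropy into softmax weights. The function names, the softmax weighting, and the toy distributions are all hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_change_weights(input_dists, output_dists):
    """Softmax over the per-position entropy change (output minus input).

    Assumption for illustration only: positions whose entropy rises the
    most receive the largest weight.
    """
    deltas = [
        shannon_entropy(q) - shannon_entropy(p)
        for p, q in zip(input_dists, output_dists)
    ]
    peak = max(deltas)  # subtract the maximum for numerical stability
    exps = [math.exp(d - peak) for d in deltas]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: three positions, each a distribution over 4 outcomes.
input_dists = [
    [0.97, 0.01, 0.01, 0.01],  # low entropy
    [0.70, 0.10, 0.10, 0.10],  # moderate entropy
    [0.25, 0.25, 0.25, 0.25],  # already at maximum entropy
]
output_dists = [
    [0.25, 0.25, 0.25, 0.25],  # large entropy increase
    [0.40, 0.20, 0.20, 0.20],  # moderate entropy increase
    [0.25, 0.25, 0.25, 0.25],  # no change
]

weights = entropy_change_weights(input_dists, output_dists)
```

In this toy setup the first position, which goes from a nearly deterministic distribution to a uniform one, receives the largest weight, and the unchanged third position receives the smallest.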
