Differential neural architecture search for tabular data: Efficient neural network design for tabular datasets

Medhage, Marcus (January 2024)
Artificial neural networks are among the most powerful machine learning models and have gained interest in the telecommunications domain, as well as other fields and applications, due to their strong performance and flexibility. Creating these models typically requires manually choosing their architecture, along with other hyperparameters that are crucial to their performance. Neural Architecture Search (NAS) seeks to automate architecture choice and has attracted increasing interest in recent years. In this thesis, we propose a new NAS method based on differentiable architecture search (DARTS) to find architectures of fully connected feedforward networks on tabular datasets. We train a gating mechanism on a validation dataset and compare four candidate gate functions as a tool to determine the number of hidden units per hidden layer for different tasks. Our findings show that our new method can reliably find architectures that are more compact than, and outperform, manually chosen architectures. Interestingly, we also found that extracting the weights learned during the search process could yield models with significantly higher and more stable performance than identical architectures retrained from scratch. Our method matched the performance of another NAS method while requiring only half an hour of training compared to 280 hours. The trained models also demonstrated competitive performance when benchmarked against other state-of-the-art machine learning models. The primary benefit of our method stems from the extraction and fine-tuning of certain weights. Our results indicate that the improvements from extracted weights may relate to the lottery ticket hypothesis of neural networks, which invites further study.
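To make the gating idea concrete, the following is a minimal sketch of how a DARTS-style width search over hidden units might look in PyTorch. It is an illustration under stated assumptions, not the thesis' implementation: the sigmoid gate stands in for one of the four candidate gate functions the thesis compares, and all names, layer sizes, and learning rates are hypothetical.

```python
# Minimal sketch (assumptions labeled): each candidate hidden unit gets a
# learnable gate; network weights train on training data, gate parameters
# on validation data, in the bilevel spirit of DARTS.
import torch
import torch.nn as nn

class GatedLinear(nn.Module):
    """Linear layer whose hidden units can be switched off by learned gates."""
    def __init__(self, in_features: int, max_units: int):
        super().__init__()
        self.linear = nn.Linear(in_features, max_units)
        # One architecture parameter per candidate hidden unit.
        self.alpha = nn.Parameter(torch.zeros(max_units))

    def gate(self) -> torch.Tensor:
        # Sigmoid gate (one hypothetical candidate): values near 0 mark
        # units that can be pruned after the search.
        return torch.sigmoid(self.alpha)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.linear(x)) * self.gate()

model = nn.Sequential(GatedLinear(20, 256), nn.Linear(256, 1))

# Separate optimizers: weights vs. architecture (gate) parameters.
weight_opt = torch.optim.Adam(
    [p for n, p in model.named_parameters() if not n.endswith("alpha")], lr=1e-3)
arch_opt = torch.optim.Adam(
    [p for n, p in model.named_parameters() if n.endswith("alpha")], lr=1e-2)

def search_step(x_train, y_train, x_val, y_val, loss_fn=nn.MSELoss()):
    # Update network weights on the training split.
    weight_opt.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    weight_opt.step()
    # Update gate parameters on the validation split.
    arch_opt.zero_grad()
    loss_fn(model(x_val), y_val).backward()
    arch_opt.step()
```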
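The weight-extraction finding could then correspond to a pruning step along these lines, again as a hypothetical sketch: surviving units keep the weights learned during the search and are fine-tuned rather than retrained from scratch. (The input weights of the following layer would also need to be sliced to the surviving units.)

```python
# Hypothetical extraction step: drop gated-off units but keep the searched
# weights for the survivors, then fine-tune the compact model.
def extract(layer: GatedLinear, threshold: float = 0.5) -> nn.Linear:
    keep = layer.gate() > threshold                    # surviving units
    pruned = nn.Linear(layer.linear.in_features, int(keep.sum()))
    with torch.no_grad():
        pruned.weight.copy_(layer.linear.weight[keep])
        pruned.bias.copy_(layer.linear.bias[keep])
    return pruned
```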
