1

Understanding The Effects of Incorporating Scientific Knowledge on Neural Network Outputs and Loss Landscapes

Elhamod, Mohannad. 06 June 2023.
While machine learning (ML) methods have achieved considerable success on several mainstream problems in vision and language modeling, they are still challenged by their lack of interpretable decision-making consistent with scientific knowledge, which limits their applicability to scientific discovery. Recently, a field of machine learning that infuses domain knowledge into data-driven ML approaches, termed Knowledge-Guided Machine Learning (KGML), has gained traction as a way to address the challenges of traditional ML. Nonetheless, the inner workings of KGML models and algorithms are still not fully understood, and a thorough account of their advantages and pitfalls across a suite of scientific applications is yet to be realized. In this thesis, I first tackle the task of understanding the role KGML plays in shaping the outputs of a neural network, including its latent space, and how that influence can be harnessed to achieve desirable properties, including robustness, generalizability beyond the training data, and the capture of knowledge priors that matter to experts. Second, I use and further develop loss landscape visualization tools to better understand ML model optimization at the level of network parameters. Such an understanding has proven effective for evaluating and diagnosing different model architectures and loss functions in the field of KGML, with potential applications to a broad class of ML problems.
/ Doctor of Philosophy / My research aims to address some of the major shortcomings of machine learning, namely its opaque decision-making process and the inadequate understanding of its inner workings when applied to scientific problems. In this thesis, I address these shortcomings by investigating the effect of supplementing traditionally data-centric methods with human knowledge, and by developing visualization tools that make this practice easier to understand and to advance further. This research is critical to the wider adoption of machine learning in scientific fields, as it builds the community's confidence not only in the accuracy of a framework's results but also in its ability to provide a satisfactory rationale for them.
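As a concrete illustration of the KGML idea the abstract describes (my sketch, not the author's method), the snippet below augments a standard data-fit loss with a penalty that grows when the network violates a known scientific prior. The prior chosen here, monotonicity of the output in the input, and the weight `lam` are illustrative assumptions.

```python
# A minimal sketch of a knowledge-guided loss (assumed prior: the learned
# function should be monotonically non-decreasing in its input).
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(-1, 1, 128).unsqueeze(1)
y = x**3 + 0.05 * torch.randn_like(x)    # noisy samples of a monotone law

lam = 1.0                                # knowledge-loss weight (illustrative)
for step in range(2000):
    opt.zero_grad()
    xr = x.clone().requires_grad_(True)  # collocation points for the prior
    pred = net(xr)
    # dy/dx via autograd; the prior says it should never be negative.
    dydx = torch.autograd.grad(pred.sum(), xr, create_graph=True)[0]
    knowledge = torch.relu(-dydx).mean() # penalize negative slopes only
    data = nn.functional.mse_loss(net(x), y)
    (data + lam * knowledge).backward()
    opt.step()
```

The same pattern generalizes: any differentiable measure of disagreement with domain knowledge (a physics residual, a conservation law, an ordering constraint) can take the place of the monotonicity term.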
2

On Linear Mode Connectivity up to Permutation of Hidden Neurons in Neural Network : When does Weight Averaging work?

Kalaivanan, Adhithyan. January 2023.
Neural networks trained with gradient-based optimization methods exhibit a surprising phenomenon known as mode connectivity: two independently trained sets of network weights are not isolated low-loss minima in parameter space, but can be connected by simple curves along which the loss remains low. In the case of linear mode connectivity up to permutation, even linear interpolations of the trained weights incur low loss when networks that differ only by a permutation of their hidden neurons are considered equivalent. While some recent research suggests that this implies the existence of a single near-convex loss basin to which the parameters converge, other work has empirically shown distinct basins corresponding to different strategies for solving the task. In some settings, naively averaging multiple network weights, without explicitly accounting for permutation invariance, still yields a network with improved generalization. In this thesis, linear mode connectivity among a set of neural networks independently trained on labelled datasets is studied, both naively and after reparameterization to account for permutation invariance. Specifically, the effect of hidden layer width on the connectivity is evaluated empirically. The experiments are conducted on a two-dimensional toy classification problem, and the insights are extended to deeper networks trained on handwritten digits and images. It is argued that accounting for the permutation of hidden neurons, either explicitly or implicitly, is necessary for weight averaging to improve test performance. Furthermore, the results indicate that the training dynamics induced by the optimization play a significant role, and that large model width alone may not be a sufficient condition for linear mode connectivity.
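The quantities in this abstract can be made concrete. Below is a minimal sketch under assumed details (not the thesis code): two small MLPs are trained independently on a toy 2-D task, the loss barrier along the linear path between their weights is measured, and the hidden neurons of one network are then re-aligned by a permutation (found here with the Hungarian algorithm on incoming-weight similarity, one common heuristic) before the barrier is re-evaluated.

```python
# A minimal sketch: linear mode connectivity, naive vs. up to permutation.
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

def make_mlp(width=64):
    return nn.Sequential(nn.Linear(2, width), nn.ReLU(), nn.Linear(width, 2))

def train(model, x, y, steps=500):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model

def permute_to_match(a, b):
    """Permute hidden units of model b to best match model a.

    Similarity of two hidden units is the dot product of their
    incoming-weight vectors; the Hungarian algorithm picks the
    permutation maximizing total similarity.
    """
    wa, wb = a[0].weight.data, b[0].weight.data        # (width, 2)
    cost = -(wa @ wb.t())                              # negate to maximize
    _, perm = linear_sum_assignment(cost.numpy())
    perm = torch.as_tensor(perm)
    b[0].weight.data = b[0].weight.data[perm]
    b[0].bias.data = b[0].bias.data[perm]
    b[2].weight.data = b[2].weight.data[:, perm]       # outgoing weights follow
    return b

def barrier(a, b, x, y, steps=11):
    """Max loss along the linear path (1 - t) * theta_a + t * theta_b."""
    loss_fn = nn.CrossEntropyLoss()
    sa, sb = a.state_dict(), b.state_dict()
    probe = make_mlp(width=sa["0.weight"].shape[0])
    losses = []
    for t in torch.linspace(0, 1, steps):
        probe.load_state_dict({k: (1 - t) * sa[k] + t * sb[k] for k in sa})
        with torch.no_grad():
            losses.append(loss_fn(probe(x), y).item())
    return max(losses)

# Toy 2-D classification data, a stand-in for the thesis experiments.
torch.manual_seed(0)
x = torch.randn(512, 2)
y = (x[:, 0] * x[:, 1] > 0).long()
m1, m2 = train(make_mlp(), x, y), train(make_mlp(), x, y)
print("barrier, naive:  ", barrier(m1, m2, x, y))
print("barrier, aligned:", barrier(m1, permute_to_match(m1, m2), x, y))
```

If linear mode connectivity up to permutation holds, the aligned barrier should be far lower than the naive one; at the small width used here that is not guaranteed, which is exactly the width effect the thesis investigates.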
3

A Study of the Loss Landscape and Metastability in Graph Convolutional Neural Networks

Larsson, Sofia. January 2020.
Many novel graph neural network models have reported impressive performance on benchmark datasets, but the theory behind these networks is still being developed. In this thesis, we study the trajectories of gradient descent (GD) and stochastic gradient descent (SGD) in the loss landscape of graph neural networks by replicating the study that Xing et al. [1] conducted for feed-forward networks. Furthermore, we empirically examine whether the training process can be accelerated by an optimization algorithm inspired by stochastic gradient Langevin dynamics, and what effect the topology of the graph has on the convergence of GD when its structure is perturbed. We find that the loss landscape is relatively flat and that SGD does not encounter any significant obstacles during its propagation. The noise induced in the gradient appears to aid SGD in finding a stationary point with desirable generalization capabilities when the learning rate is poorly tuned. Additionally, we observe that the topological structure of the graph plays a part in the convergence of GD, but further research is required to understand how.
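The abstract does not specify the Langevin-inspired algorithm, but the standard stochastic gradient Langevin dynamics update such algorithms start from is a plain SGD step plus Gaussian noise scaled by the learning rate. The sketch below is an assumed, illustrative form; the names and the temperature factor are not from the thesis.

```python
# A minimal SGLD-style step: SGD plus Gaussian noise with variance 2 * lr * temp.
import torch

def sgld_step(params, lr=1e-2, temp=1e-4):
    """Apply one SGLD update; call after loss.backward() has filled .grad."""
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            noise = torch.randn_like(p) * (2 * lr * temp) ** 0.5
            p.add_(-lr * p.grad + noise)
```

The injected noise is what lets the iterate escape shallow basins, which is consistent with the finding above that it helps SGD reach well-generalizing stationary points when the learning rate is poorly tuned.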
