  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

On the use of $\alpha$-stable random variables in Bayesian bridge regression, neural networks and kernel processes

Jorge E Loria (18423207) 23 April 2024
<p dir="ltr">The first chapter considers ℓ_α-regularized linear regression, also termed Bridge regression. For α ∈ (0, 1), Bridge regression enjoys several statistical properties of interest, such as sparsity and near-unbiasedness of the estimates (Fan & Li, 2001). However, the main difficulty lies in the non-convex nature of the penalty for these values of α, which makes optimization challenging; usually it is only possible to find a local optimum. To address this issue, Polson et al. (2013) took a sampling-based, fully Bayesian approach to this problem, using the correspondence between the Bridge penalty and a power exponential prior on the regression coefficients. However, their sampling procedure relies on Markov chain Monte Carlo (MCMC) techniques, which are inherently sequential and do not scale to large problem dimensions. Cross-validation approaches are similarly computation-intensive. To this end, our contribution is a novel non-iterative method to fit a Bridge regression model. The main contribution lies in an explicit formula for Stein’s unbiased risk estimate for the out-of-sample prediction risk of Bridge regression, which can then be optimized to select the desired tuning parameters, allowing us to completely bypass MCMC as well as computation-intensive cross-validation approaches. Our procedure yields results in a fraction of the computational time required by iterative schemes, without any appreciable loss in statistical performance.</p><p><br></p><p dir="ltr">Next, we build upon the classical and influential work of Neal (1996), who proved that the infinite-width scaling limit of a Bayesian neural network with one hidden layer is a Gaussian process when the network weights have bounded prior variance. Neal’s result has been extended to networks with multiple hidden layers and to convolutional neural networks, also with Gaussian process scaling limits. 
The tractable properties of Gaussian processes then allow straightforward posterior inference and uncertainty quantification, considerably simplifying the study of the limit process compared to a network of finite width. Neural network weights with unbounded variance, however, pose unique challenges. In this case, the classical central limit theorem breaks down and it is well known that the scaling limit is an α-stable process under suitable conditions. However, the current literature is primarily limited to forward simulations under these processes, and the problem of posterior inference under such a scaling limit remains largely unaddressed, unlike in the Gaussian process case. To this end, our contribution is an interpretable and computationally efficient procedure for posterior inference, using a conditionally Gaussian representation, that then allows full use of the Gaussian process machinery for tractable posterior inference and uncertainty quantification in the non-Gaussian regime.</p><p><br></p><p dir="ltr">Finally, we extend the previous chapter by considering a natural generalization to deep neural networks through kernel processes. Kernel processes (Aitchison et al., 2021) generalize the result proved by Neal (1996) to deeper networks by describing the non-linear transformation in each layer as a covariance matrix (kernel) of a Gaussian process. In this way, each successive layer transforms the covariance matrix of the previous layer by a covariance function. However, the covariance obtained by this process loses any possibility of representation learning, since the covariance matrix is deterministic. To address this, Aitchison et al. (2021) proposed deep kernel processes using Wishart and inverse Wishart matrices for each layer in deep neural networks. Nevertheless, the approach they propose requires using a process that does not emerge from the limit of a classic neural network structure. 
We introduce α-stable kernel processes (α-KP) for learning posterior stochastic covariances in each layer. Our results show that α-KP performs much better than the approach proposed by Aitchison et al. (2021) on both simulated data and the benchmark Boston dataset.</p>
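
For reference, the Bridge objective and the penalty/prior correspondence mentioned in the abstract above can be written as follows. This is the standard textbook form, not notation taken from the thesis itself; the symbols $y$ (response), $X$ (design matrix), $\beta$ (coefficients), $p$ (number of coefficients) and $\lambda$ (tuning parameter) are assumptions introduced here for illustration.

```latex
% Bridge (\ell_\alpha-penalized) least squares; non-convex for 0 < \alpha < 1
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}}
  \; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert^{\alpha},
  \qquad \alpha \in (0,1),\ \lambda > 0

% Power exponential prior whose negative log density is the Bridge penalty
\pi(\beta_j \mid \lambda) \;\propto\; \exp\!\bigl(-\lambda\,\lvert \beta_j \rvert^{\alpha}\bigr)
```

Under a Gaussian likelihood, the posterior mode with this prior coincides with the penalized estimate, up to a rescaling of $\lambda$ by the noise variance. That is the sense in which the penalty and the prior correspond.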
2

Modelling approach and avoidance behaviour : A deep learning approach to understand the human olfactory system / Modellering av beteende för närmande och frånstötning : En djupinlärningsapproach för att förstå det mänskliga luktsystemet

Nordén, Frans January 2021
In this thesis we examine whether it is possible to model approach and avoidance behaviour with probabilistic machine learning. The results from this project will primarily aid our collective understanding of human existence. Secondly, they will extend the knowledge of probabilistic machine learning in the neuroscience domain. We do this by building a Variational Recurrent Neural Network (VRNN) that is trained on electroencephalography (EEG) data from participants who are subjected to odours of varying pleasantness. The pleasantness of the odours is used to divide the participants into two classes based on their self-reported experience. This data is used to train the VRNN. The performance of the VRNN is evaluated by how well we are able to reconstruct the original data from a low-dimensional latent representation. In this task the model performs on a similar level to related works. We further investigate how changes in the latent space affect the reconstructed data. Despite being disentangled, the latent variables are hard to interpret. Furthermore, we try to classify and cluster the latent space as either approach or avoidance behaviour with a Support Vector Machine and Uniform Manifold Approximation. The classification results are only slightly better than random, indicating that the learned latent space is not suitable for the task. This is most likely because the patterns that make up approach and avoidance behaviour are seen as noise by the VRNN and are therefore not accurately modelled. This is shown by the fact that the frontal α-asymmetry that exists in the data is not reconstructed by the model. The conclusion is therefore that a VRNN is less suitable for modelling underlying behaviour from raw EEG data, due to the low signal-to-noise ratio. We instead suggest focusing on specific frequency ranges in specific regions when applying machine learning in this domain. 
/ This thesis addresses the question of whether it is possible to model approach and avoidance behaviour patterns using machine learning. The results of this project are primarily intended to further the understanding of human existence. They are also intended to broaden the understanding of how probabilistic machine learning can be used to explore such matters. We do this by building a Variational Recurrent Neural Network (VRNN) model that is trained on data from experiments in which participants are exposed to different odours while their electroencephalography (EEG) is recorded. The participants are divided into two classes based on their self-reported experience of the odour's pleasantness. The machine learning model is evaluated by analysing how well it reconstructs the data, which it does well. We further investigate how changes in the model's latent space affect the reconstruction of the data. The results of that experiment are not clear. We then attempt to classify and cluster the latent space with respect to approach and avoidance behaviour using a Support Vector Machine and Uniform Manifold Approximation. In these experiments we are unable to classify or cluster the latent space with respect to approach and avoidance behaviour better than chance. We argue that this is because the underlying patterns that give rise to these behaviours are seen as noise by the VRNN model and are therefore not modelled. This is shown by the fact that the frontal α-asymmetry that exists in the data is not reconstructed by the model. The conclusion is therefore that a VRNN is less suitable for modelling underlying behaviours from unprocessed EEG data, because of the low signal-to-noise ratio in the EEG data. We suggest focusing instead on specific frequency ranges in specific brain regions when applying machine learning to EEG.
