1. Video Prediction with Invertible Linear Embeddings. Pottorff, Robert Thomas, 01 June 2019 (has links)
Using recently popularized invertible neural networks, we predict future video frames from complex dynamic scenes. Our invertible linear embedding (ILE) demonstrates successful learning, prediction, and latent state inference. In contrast to other approaches, ILE does not use any explicit reconstruction loss or simplistic pixel-space assumptions. Instead, it leverages invertibility to optimize the likelihood of image sequences exactly, albeit indirectly. Experiments and comparisons against state-of-the-art methods over synthetic and natural image sequences demonstrate the robustness of our approach, and a discussion of future work explores the opportunities our method might provide to other fields in which the accurate analysis and forecasting of non-linear dynamic systems is essential.
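The exact-likelihood claim above rests on the change-of-variables formula: for an invertible embedding f with a tractable Jacobian, log p(x) = log p_z(f(x)) + log|det J_f(x)|, so frames can be scored without any reconstruction loss, and prediction amounts to advancing the latent state and inverting the embedding. The sketch below is only a toy illustration of that mechanism under assumed components (an additive coupling layer, a placeholder linear dynamics matrix A, and a tiny frame dimension); it is not the thesis's ILE implementation.

```python
# Conceptual sketch (not the thesis code): an invertible embedding with a
# tractable Jacobian gives exact log-likelihoods via change of variables, and
# prediction rolls latent dynamics forward before inverting the embedding.
import numpy as np

class Coupling:
    """Additive coupling layer: invertible, with unit Jacobian determinant."""
    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(dim // 2, dim - dim // 2))

    def forward(self, x):
        x1, x2 = x[: len(x) // 2], x[len(x) // 2 :]
        y2 = x2 + self.W.T @ x1          # shift second half by a function of the first
        return np.concatenate([x1, y2]), 0.0   # log|det J| = 0 for additive coupling

    def inverse(self, y):
        y1, y2 = y[: len(y) // 2], y[len(y) // 2 :]
        return np.concatenate([y1, y2 - self.W.T @ y1])

def log_likelihood(x, layers):
    """Exact log p(x) under a standard-normal latent, via change of variables."""
    z, logdet = x, 0.0
    for layer in layers:
        z, ld = layer.forward(z)
        logdet += ld
    log_pz = -0.5 * (z @ z + len(z) * np.log(2 * np.pi))
    return log_pz + logdet

def predict_next_frame(x_t, layers, A):
    """Embed, advance with linear latent dynamics z_{t+1} = A z_t, then invert."""
    z = x_t
    for layer in layers:
        z, _ = layer.forward(z)
    z_next = A @ z
    for layer in reversed(layers):
        z_next = layer.inverse(z_next)
    return z_next

dim = 8                                   # toy "frame" dimensionality
layers = [Coupling(dim, seed=s) for s in range(3)]
A = np.eye(dim)                           # placeholder linear dynamics
frame = np.random.default_rng(1).normal(size=dim)
print(log_likelihood(frame, layers), predict_next_frame(frame, layers, A)[:3])
```

Additive couplings keep the log-determinant at zero; ILE presumably uses richer invertible layers, but the likelihood bookkeeping is the same.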
2. Solving Forward and Inverse Problems for Seismic Imaging using Invertible Neural Networks. Gupta, Naveen, 11 July 2023 (has links)
Full Waveform Inversion (FWI) is a widely used optimization technique for subsurface imaging, where the goal is to estimate the seismic wave velocity beneath the Earth's surface from the seismic data observed at the surface. The problem is primarily governed by the wave equation, a non-linear second-order partial differential equation. A number of approaches have been developed for FWI, including physics-based iterative numerical solvers as well as data-driven machine learning (ML) methods. Existing numerical solutions to FWI suffer from three major challenges: (1) sensitivity to the initial velocity guess, (2) a non-convex loss landscape, and (3) sensitivity to noise. Additionally, they suffer from high computational cost, making them infeasible to apply in complex real-world applications. Existing ML solutions for FWI only solve the inverse problem and tend to yield non-unique solutions. In this work, we propose to solve the forward and inverse problems jointly to alleviate the issue of non-unique solutions to the inverse problem. We study the FWI problem from a new perspective and propose a novel approach based on Invertible Neural Networks. This type of neural network is designed to learn bijective mappings between the input and target distributions and hence presents a potential way to solve the forward and inverse problems jointly. In this thesis, we develop a data-driven framework that can be used to learn forward and inverse mappings between any arbitrary input and output space. Our model, Invertible X-net, can be used to solve FWI to obtain high-quality velocity images and also to predict the seismic waveform data. We compare our model with existing baseline models and show that it outperforms them in velocity reconstruction on the OpenFWI dataset. Additionally, we compare the predicted waveforms with a baseline and the ground truth and show that our model is capable of simultaneously predicting highly accurate seismic waveforms. / Master of Science / Recent advancements in deep learning have led to the development of sophisticated methods that can be used to solve scientific problems in many disciplines, including medical imaging, geophysics, and signal processing. For example, in geophysics, we study the internal structure of the Earth from indirect physical measurements. Often, these kinds of problems are challenging due to the existence of non-unique and unstable solutions. In this thesis, we look at one such problem, called Full Waveform Inversion, which aims to estimate the velocity of mechanical waves inside the Earth from wave amplitude observations at the surface. For this problem, we explore a special class of neural networks that allows us to uniquely map between the input and output spaces and thus alleviates the non-uniqueness and instability of performing Full Waveform Inversion for seismic imaging.
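To make the bijective-mapping idea concrete, the following is a minimal sketch of how a single stack of coupling layers yields both directions of a mapping from one set of weights, so the same parameters answer the forward problem (velocity to waveform) and the inverse problem (waveform to velocity). It is not the Invertible X-net architecture; the layer design, depth, vector sizes, and the simple reversal used as a permutation are assumptions made for illustration.

```python
# Minimal sketch (not the thesis's Invertible X-net): one set of weights
# defines a bijection, so the same model covers both directions exactly.
import numpy as np

class AffineCoupling:
    """RealNVP-style coupling: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1)."""
    def __init__(self, dim, seed):
        rng = np.random.default_rng(seed)
        h = dim - dim // 2
        self.Ws = rng.normal(scale=0.1, size=(dim // 2, h))
        self.Wt = rng.normal(scale=0.1, size=(dim // 2, h))

    def forward(self, x):
        x1, x2 = x[: len(x) // 2], x[len(x) // 2 :]
        s, t = np.tanh(self.Ws.T @ x1), self.Wt.T @ x1
        return np.concatenate([x1, x2 * np.exp(s) + t])

    def inverse(self, y):
        y1, y2 = y[: len(y) // 2], y[len(y) // 2 :]
        s, t = np.tanh(self.Ws.T @ y1), self.Wt.T @ y1
        return np.concatenate([y1, (y2 - t) * np.exp(-s)])

class InvertibleMapper:
    """Stack of couplings; reversing the vector between layers mixes halves."""
    def __init__(self, dim, depth=4):
        self.layers = [AffineCoupling(dim, seed=i) for i in range(depth)]

    def velocity_to_waveform(self, v):      # "forward problem"
        for layer in self.layers:
            v = layer.forward(v)[::-1]      # reversal acts as a cheap permutation
        return v

    def waveform_to_velocity(self, d):      # "inverse problem", same weights
        for layer in reversed(self.layers):
            d = layer.inverse(d[::-1])
        return d

dim = 16
model = InvertibleMapper(dim)
v = np.random.default_rng(0).normal(size=dim)          # toy velocity vector
d = model.velocity_to_waveform(v)
print(np.allclose(model.waveform_to_velocity(d), v))   # True: exact round trip
```

In practice such a model would be trained with losses in both directions on paired velocity and waveform data; the exact training setup used in the thesis is not reproduced here.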
3. Exploring Normalizing Flow Modifications for Improved Model Expressivity / Undersökning av normalizing flow-modifikationer för förbättrad modelluttrycksfullhet. Juschak, Marcel, January 2023 (has links)
Normalizing flows represent a class of generative models that exhibit a number of attractive properties, but do not always achieve state-of-the-art performance when it comes to the perceived naturalness of generated samples. To improve the quality of generated samples, this thesis examines methods to enhance the expressivity of discrete-time normalizing flow models and thus their ability to capture different aspects of the data. In the first part of the thesis, we propose an invertible neural network architecture as an alternative to popular architectures like Glow that require an individual neural network per flow step. Although our proposal greatly reduces the number of parameters, such an architecture has not been tried before, as it is believed not to be powerful enough. For this reason, we define two optional extensions that could greatly increase the expressivity of the architecture. We use augmentation to add Gaussian noise variables to the input to achieve arbitrary hidden-layer widths that are no longer dictated by the dimensionality of the data. Moreover, we implement Piecewise Affine Activation Functions, which represent a generalization of Leaky ReLU activations and allow for more powerful transformations in every individual step. The resulting three models are evaluated on two simple synthetic datasets: the two moons dataset and one generated from a mixture of eight Gaussians. Our findings indicate that the proposed architectures cannot adequately model these simple datasets and thus do not represent alternatives to current state-of-the-art models. The Piecewise Affine Activation Function significantly improved the expressivity of the invertible neural network, but could not make use of its full potential due to inappropriate assumptions about the function's input distribution. Further research is needed to ensure that the input to this function is always standard normally distributed. We conducted further experiments with augmentation using the Glow model and could show minor improvements on the synthetic datasets when only a few flow steps (two, three, or four) were used. However, in a more realistic scenario, the model would encompass many more flow steps. Lastly, we generalized the transformation in the coupling layers of modern flow architectures from an elementwise affine transformation to a matrix-based affine transformation and studied the effect this had on MoGlow, a flow-based model of motion.
We could show that McMoGlow, our modified version of MoGlow, consistently achieved a better training likelihood than the original MoGlow on human locomotion data. However, a subjective user study found no statistically significant difference in the perceived naturalness of the generated samples. As a possible reason for this, we hypothesize that the improvements are subtle and more visible in samples that exhibit slower movements, or in edge cases that may have been underrepresented in the user study. / Normalizing flows represent a class of generative models that possess a number of desirable properties, but that do not always achieve state-of-the-art performance in terms of the perceived naturalness of generated data. To improve the quality of these models' output, this thesis investigates methods for improving the expressivity of discrete-time normalizing flow models, and thereby their ability to capture different aspects of the data. In the first part of the thesis, we propose an architecture built from an invertible neural network. Our proposal is an alternative to popular architectures such as Glow, which require individual neural networks for every flow step. Although our proposal greatly reduces the number of parameters, this has not been done before, as such architectures have not been considered powerful enough. For this reason, we define two independent extensions to the architecture that could increase its expressivity considerably. We use so-called augmentation, which concatenates Gaussian noise variables to the observation vectors in order to achieve arbitrary widths in the hidden layers, so that their width is no longer limited by the data dimensionality. In addition, we implement Piecewise Affine Activation Functions (PAAF), which generalize Leaky ReLU activations by enabling more powerful transformations in every individual step. The resulting three models are evaluated using two simple synthetic datasets: the two moons dataset and one generated from a mixture of eight Gaussian distributions. Our results show that the proposed architectures cannot model these simple datasets satisfactorily, and thus do not constitute competitive alternatives to current state-of-the-art models. The piecewise activation function considerably improved the expressivity of the invertible neural network, but could not exploit its full potential because of incorrect assumptions about the function's input distribution. Further research is needed to address this problem. We carried out further experiments with augmentation of the Glow model and could show some improvements on the synthetic datasets when only a few flow steps (two, three, or four) were used. However, in more realistic scenarios models comprise many more flow steps. Finally, we generalized the transformation in the coupling layers of modern flow architectures from an elementwise affine transformation to a matrix-based affine transformation, and studied the effect this had on MoGlow, a flow-based model of 3D motion. We could show that McMoGlow, our modified version of MoGlow, consistently achieved a better training likelihood than the original MoGlow on human locomotion data. However, a subjective user study of example motions generated from MoGlow and McMoGlow showed no statistically significant difference in how natural users perceived the motions to be. As a possible reason for this, we hypothesize that the improvements are subtle and more visible in situations with slower movements, or in edge cases that may have been underrepresented in the user study.
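The coupling-layer generalization mentioned in this abstract, replacing the elementwise affine transform y2 = s * x2 + t with a matrix-based affine transform y2 = M x2 + t, can be illustrated with a short sketch. The lower-triangular parameterization of M below is an assumption chosen so that inversion and the log-determinant stay cheap; it is not necessarily how McMoGlow parameterizes the transform, and all names here are illustrative.

```python
# Sketch of the coupling-transform generalization: elementwise affine vs.
# matrix-based affine, with exact inversion and log-determinant for both.
import numpy as np

def elementwise_affine(x2, s, t):
    """Classic coupling transform and its inverse (s > 0 elementwise)."""
    y2 = s * x2 + t
    x2_rec = (y2 - t) / s
    return y2, x2_rec

def matrix_affine(x2, M_raw, t):
    """Matrix-based transform: build an invertible lower-triangular M."""
    M = np.tril(M_raw, k=-1) + np.diag(np.exp(np.diag(M_raw)))  # positive diagonal
    y2 = M @ x2 + t
    x2_rec = np.linalg.solve(M, y2 - t)   # solving the linear system inverts exactly
    log_det = np.sum(np.diag(M_raw))      # log|det M| = sum of log-diagonal entries
    return y2, x2_rec, log_det

rng = np.random.default_rng(0)
d = 6
x2, t = rng.normal(size=d), rng.normal(size=d)

y_el, rec_el = elementwise_affine(x2, s=np.exp(rng.normal(size=d)), t=t)
y_mat, rec_mat, log_det = matrix_affine(x2, M_raw=rng.normal(size=(d, d)), t=t)

print(np.allclose(rec_el, x2), np.allclose(rec_mat, x2), log_det)
```

A full matrix couples the dimensions of the transformed half to one another within a single step, which is the additional expressivity the thesis evaluates on motion data.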