  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
261

A Posteriori Error Analysis of Discontinuous Galerkin Methods for Elliptic Variational Inequalities

Porwal, Kamana January 2014
The main emphasis of this thesis is the a posteriori error analysis of discontinuous Galerkin (DG) methods for elliptic variational inequalities. DG methods have become very popular in the last two decades because they handle complex geometries, allow irregular meshes with hanging nodes, and permit different degrees of polynomial approximation on different elements. Moreover, they are high-order accurate and stable methods. Adaptive algorithms refine the mesh locally in the regions where the solution exhibits irregular behaviour, and a posteriori error estimates are the main ingredients used to steer the adaptive mesh refinement. The solution of a linear elliptic problem exhibits singularities due to changes in boundary conditions, irregularity of coefficients and reentrant corners in the domain. Apart from this, the solution of a variational inequality exhibits additional irregular behaviour due to the occurrence of the free boundary (the part of the domain which is a priori unknown and must be found as a component of the solution). In the absence of full elliptic regularity of the solution, uniform refinement is inefficient and does not yield the optimal convergence rate. Adaptive refinement, which is based on the residuals (or a posteriori error estimator) of the problem, enhances the efficiency by refining the mesh locally and provides the optimal convergence rate. In this thesis, we derive a posteriori error estimates of DG methods for elliptic variational inequalities of the first kind and the second kind. This thesis contains seven chapters, including an introductory chapter and a concluding chapter. In the introductory chapter, we review some fundamental preliminary results which will be used in the subsequent analysis. In Chapter 2, a posteriori error estimates for a class of DG methods are derived for the second-order elliptic obstacle problem, which is a prototype for elliptic variational inequalities of the first kind. 
The analysis of Chapter 2 is carried out for a general obstacle function; therefore the error estimator obtained therein involves the min/max function, and hence its computation becomes somewhat complicated. With a mild assumption on the trace of the obstacle, we derive a significantly simpler and easily computable error estimator in Chapter 3. Numerical experiments illustrate that this error estimator indeed behaves better than the one derived in Chapter 2. In Chapter 4, we carry out an a posteriori analysis of DG methods for the Signorini problem, which arises from the study of frictionless contact problems. A nonlinear smoothing map from the DG finite element space to the conforming finite element space is constructed and used extensively in the analysis of Chapters 2, 3 and 4. Also, a common property shared by all DG methods allows us to carry out the analysis in a unified setting. In Chapter 5, we study the C0 interior penalty method for the plate frictional contact problem, which is a fourth-order variational inequality of the second kind. In this chapter, we also establish the medius analysis along with the a posteriori analysis. Numerical results are presented at the end of every chapter to illustrate the theoretical results derived in the respective chapters. We discuss possible extensions and future directions of the work in Chapter 6. In the last chapter, we document the FEM codes used in the numerical experiments.
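
As a rough illustration of how a posteriori estimates can steer local refinement, the sketch below marks the elements whose local error indicator is large relative to the largest one (a maximum marking strategy). The abstract does not say which marking strategy the thesis uses, so the function name `mark_elements`, the threshold `theta` and the indicator values are all illustrative assumptions.

```python
def mark_elements(indicators, theta=0.5):
    """Return indices of mesh elements to refine (maximum marking strategy).

    indicators: local error indicators, one per element (assumed to be given).
    theta: fraction of the largest indicator used as threshold (assumed value).
    """
    if not indicators:
        return []
    threshold = theta * max(indicators)
    return [k for k, eta in enumerate(indicators) if eta >= threshold]


# Hypothetical indicators: large values where the solution is irregular,
# for example near a free boundary or a reentrant corner.
eta = [0.01, 0.02, 0.40, 0.35, 0.03]
marked = mark_elements(eta, theta=0.5)
print(marked)
```

Only the marked elements would be subdivided before the problem is solved again, which is what makes the refinement local rather than uniform.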
262

Bilevel programming: reformulations, regularity, and stationarity

Zemkoho, Alain B. 12 June 2012
We have considered the bilevel programming problem in the case where the lower-level problem admits more than one optimal solution. It is well known in the literature that in such a situation the problem is ill-posed from the viewpoint of scalar objective optimization. The optimistic and pessimistic approaches have therefore been suggested in the literature to deal with this case. In the thesis, we have developed a unified approach to deriving necessary optimality conditions for both the optimistic and pessimistic bilevel programs, based on advanced tools from variational analysis. We have obtained various constraint qualifications and stationarity conditions depending on some constructive representations of the solution set-valued mapping of the follower’s problem. In the auxiliary developments, we have provided rules for the generalized differentiation and robust Lipschitzian properties of the lower-level solution set-valued map, which are of fundamental interest for other areas of nonlinear and nonsmooth optimization. Some of the results of this theory have then been applied to derive stationarity conditions for some well-known transportation problems having a bilevel structure.
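
To make the optimistic and pessimistic formulations concrete, the following sketch brute-forces both on a tiny finite problem in which the follower has two optimal responses for one leader decision. The objectives, the decision sets and the names `follower_solutions` and `solve_bilevel` are hypothetical; the thesis works with variational analysis, not enumeration.

```python
def follower_solutions(x, ys, f):
    """All lower-level optimal responses S(x) = argmin over y of f(x, y)."""
    best = min(f(x, y) for y in ys)
    return [y for y in ys if f(x, y) == best]


def solve_bilevel(xs, ys, F, f, pessimistic=False):
    """Brute-force the optimistic (or pessimistic) bilevel program on finite sets."""
    pick = max if pessimistic else min
    # Leader's value at x: best case or worst case over the follower's solution set.
    value = {x: pick(F(x, y) for y in follower_solutions(x, ys, f)) for x in xs}
    x_star = min(value, key=value.get)
    return x_star, value[x_star]


# Toy data (hypothetical): for x = 0 the follower is indifferent between y = -1 and y = 1.
xs = [0, 1]
ys = [-1, 0, 1]
f = lambda x, y: (y * y - 1) ** 2 if x == 0 else y * y  # lower-level objective
F = lambda x, y: x + 2 * y                              # upper-level objective

print(solve_bilevel(xs, ys, F, f))                      # optimistic leader
print(solve_bilevel(xs, ys, F, f, pessimistic=True))    # pessimistic leader
```

In this toy instance the two formulations choose different leader decisions, which is the ambiguity that arises when the lower-level solution is not unique.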
263

The applicability and scalability of probabilistic inference in deep-learning-assisted geophysical inversion applications

Izzatullah, Muhammad 04 1900
Probabilistic inference, especially in the Bayesian framework, is a foundation for quantifying uncertainties in geophysical inversion applications. However, due to the presence of high-dimensional datasets and the large-scale nature of geophysical inverse problems, the applicability and scalability of probabilistic inference face significant challenges for such applications. This thesis is dedicated to improving the scalability of probabilistic inference algorithms and demonstrating their applicability for large-scale geophysical inversion applications. In this thesis, I delve into three leading applied approaches to computing the Bayesian posterior distribution in geophysical inversion applications: Laplace's approximation, Markov chain Monte Carlo (MCMC), and variational Bayesian inference. The first approach, Laplace's approximation, is the simplest form of approximation for intractable Bayesian posteriors. However, its accuracy relies on the estimation of the posterior covariance matrix. I study the visualization of the misfit landscape in a low-dimensional subspace and the low-rank approximations of the covariance for full waveform inversion (FWI). I demonstrate that a non-optimal truncation of the Hessian's eigenvalues in the low-rank approximation affects the accuracy of the approximated standard deviation, leading to a biased statistical conclusion. Furthermore, I also demonstrate the propagation of uncertainties within Bayesian physics-informed neural networks for hypocenter localization applications through this approach. For the MCMC approach, I develop approximate Langevin MCMC algorithms that provide fast sampling at efficient computational costs for large-scale Bayesian FWI; however, this inflates the variance due to asymptotic bias. To account for this asymptotic bias and assess the sample quality, I introduce the kernelized Stein discrepancy (KSD) as a diagnostic tool. 
When larger computational resources are available, exact MCMC algorithms (i.e., with a Metropolis-Hastings criterion) should be favored for an accurate statistical analysis of the posterior distribution. For the variational Bayesian inference, I propose a regularized variational inference framework that performs posterior inference by implicitly regularizing the Kullback-Leibler divergence loss with a deep denoiser through a Plug-and-Play method. I also develop Plug-and-Play Stein Variational Gradient Descent (PnP-SVGD), a novel algorithm to sample the regularized posterior distribution. PnP-SVGD demonstrates its ability to produce high-resolution, trustworthy samples representative of the subsurface structures for a post-stack seismic inversion application.
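
The trade-off between approximate and exact Langevin MCMC described above can be seen on a one-dimensional toy target. The sketch below is not the thesis's FWI sampler: it assumes a standard normal target, a fixed step size and the helper names shown, and it compares the unadjusted Langevin algorithm with its Metropolis-Hastings-corrected variant (MALA). For this particular target the unadjusted chain has stationary variance 1 / (1 - step / 2), so its variance is inflated, while the corrected chain targets the exact distribution.

```python
import math
import random


def log_p(x):
    return -0.5 * x * x  # log-density of N(0, 1), up to a constant


def grad_log_p(x):
    return -x


def langevin_chain(n, step, adjust, seed=0):
    """Langevin sampler; adjust=True adds the Metropolis-Hastings test (MALA)."""
    rng = random.Random(seed)

    def log_q(a, b):
        # Log-density of proposing a from b, up to a constant shared by both directions.
        return -((a - b - step * grad_log_p(b)) ** 2) / (4.0 * step)

    x, samples = 0.0, []
    for _ in range(n):
        prop = x + step * grad_log_p(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        if adjust:
            log_alpha = log_p(prop) + log_q(x, prop) - log_p(x) - log_q(prop, x)
            if rng.random() >= math.exp(min(0.0, log_alpha)):
                prop = x  # reject the proposal and stay at the current state
        x = prop
        samples.append(x)
    return samples


def variance(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)


ula = langevin_chain(20000, step=0.5, adjust=False)   # approximate, biased
mala = langevin_chain(20000, step=0.5, adjust=True)   # exact in the limit
print(variance(ula), variance(mala))
```

The corrected chain needs an extra evaluation of the target density and its gradient at every proposal, which is the additional cost the abstract refers to.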
264

Predicting tumour growth-driving interactions from transcriptomic data using machine learning

Stigenberg, Mathilda January 2023
The mortality rate is high for cancer patients, and treatments are only effective in a fraction of patients. To be able to cure more patients, new treatments need to be invented. Immunotherapy activates the immune system to fight against cancer, and one treatment targets immune checkpoints. If more targets are found, more patients can be treated successfully. In this project, interactions between immune and cancer cells that drive tumour growth were investigated in an attempt to find new potential targets. This was achieved by creating a machine learning model that finds genes expressed in cells involved in tumour-driving interactions. Single-cell RNA sequencing and spatial transcriptomic data from breast cancer patients were utilised, as well as single-cell RNA sequencing data from healthy patients. The tumour growth rate was based on the cumulative expression of G2/M genes. The G2/M-related genes were excluded from the analysis since these were assumed to be cell cycle genes. The machine learning model was based on a supervised variational autoencoder architecture. By using this kind of architecture, it was possible to compress the input into a low-dimensional space of genes, called a latent space, which was able to explain the tumour growth rate. The Optuna hyperparameter optimization framework was utilised to find the best combination of hyperparameters for the model. The model had an R2 score of 0.93, which indicated that the latent space explained 93% of the variance in the growth rate. The latent space consisted of 20 variables. To find out which genes were in this latent space, the correlation between each latent variable and each gene was calculated. The genes that were positively or negatively correlated were assumed to be in the latent space and therefore involved in explaining tumour growth. Furthermore, the correlation between each latent variable and the growth rate was calculated. 
The up- and downregulated genes in each latent variable were kept and used to find the pathways for the different latent variables. Five of these latent variables were involved in immune responses and were therefore investigated further. The genes in these five latent variables were mapped to cell types. One of these latent variables had an upregulated immune response for positively correlated growth, indicating that immune cells were involved in promoting cancer progression. Another latent variable had a downregulated immune response for negatively correlated growth. This indicated that if these genes were upregulated instead, the tumour would thrive. The genes found in these latent variables were analysed further. CD80, CSF1, CSF1R, IL26, IL7, IL34 and the protein NF-kappa-B were interesting finds and are known immune modulators. These could possibly be used as markers for pro-tumour immunity. Furthermore, CSF1, CSF1R, IL26, IL34 and the protein NF-kappa-B could potentially be targeted in immunotherapy.
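
The two statistics that the abstract relies on, the R2 score and the correlation between a latent variable and each gene, can be sketched in a few lines. The gene names, the expression values and the cut-off `threshold` below are hypothetical; the abstract does not state which correlation cut-off was used.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return cov / norm


def genes_for_latent(latent, expression, threshold=0.8):
    """Split genes into positively and negatively correlated with one latent variable."""
    up, down = [], []
    for gene, values in expression.items():
        r = pearson(latent, values)
        if r >= threshold:
            up.append(gene)
        elif r <= -threshold:
            down.append(gene)
    return up, down


# Hypothetical values of one latent variable and three genes across four cells.
latent = [0.0, 1.0, 2.0, 3.0]
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [3.0, 2.0, 1.0, 0.0],
    "GENE_C": [1.0, 0.0, 0.0, 1.0],
}
up, down = genes_for_latent(latent, expression)
print(up, down)
```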
265

Automatic Question Paraphrasing in Swedish with Deep Generative Models / Automatisk frågeparafrasering på svenska med djupa generativa modeller

Lindqvist, Niklas January 2021
Paraphrase generation refers to the task of automatically generating a paraphrase given an input sentence or text. Paraphrase generation is a fundamental yet challenging natural language processing (NLP) task and is utilized in a variety of applications such as question answering, information retrieval, conversational systems etc. In this study, we address the problem of paraphrase generation of questions in Swedish by evaluating two different deep generative models that have shown promising results on paraphrase generation of questions in English. The first model is a Conditional Variational Autoencoder (C-VAE) and the other model is an extension of the first one where a discriminator network is introduced into the model to form a Generative Adversarial Network (GAN) architecture. In addition to these models, a method not based on machine learning was implemented to act as a baseline. The models were evaluated using both quantitative and qualitative measures including grammatical correctness and equivalence to the source question. The results show that the deep generative models outperformed the baseline across all quantitative metrics. Furthermore, the qualitative evaluation showed that the deep generative models outperformed the baseline at generating grammatically correct sentences, but there was no noticeable difference in terms of equivalence to the source question between the models. / Paraphrase generation refers to the task of automatically generating, from a given sentence or text, a paraphrase, that is, another text with the same meaning. Paraphrase generation is a fundamental yet challenging task in natural language processing and is used in a range of applications such as information retrieval, conversational systems, answering questions given a text, etc. 
In this study we investigate the problem of paraphrase generation of questions in Swedish by evaluating two different deep generative models that have shown promising results on paraphrase generation of questions in English. The first model is a conditional variational autoencoder (C-VAE). The second model is also a C-VAE but additionally introduces a discriminator, which turns the model into a generative adversarial network (GAN). In addition to the models presented above, a method not based on machine learning was implemented as a baseline. The models were evaluated with both quantitative and qualitative measures, including grammatical correctness and equivalence between paraphrase and original question. The results show that the deep generative models perform better than the baseline model on all quantitative metrics. Furthermore, the qualitative evaluation showed that the deep generative models could generate grammatically correct questions to a greater extent than the baseline model. However, there was no major difference in semantic equivalence between paraphrase and original question across the models.
266

Bayesian Neural Networks for Financial Asset Forecasting / Bayesianska neurala nätverk för prediktion av finansiella tillgångar

Back, Alexander, Keith, William January 2019
Neural networks are powerful tools for modelling complex non-linear mappings, but they often suffer from overfitting and provide no measures of uncertainty in their predictions. Bayesian techniques are proposed as a remedy to these problems, as these both regularize and provide an inherent measure of uncertainty from their posterior predictive distributions. By quantifying predictive uncertainty, we attempt to improve a systematic trading strategy by scaling positions with uncertainty. Exact Bayesian inference is often impossible, and approximate techniques must be used. For this task, this thesis compares dropout, variational inference and Markov chain Monte Carlo. We find that dropout and variational inference provide powerful regularization techniques, but their predictive uncertainties cannot improve a systematic trading strategy. Markov chain Monte Carlo provides powerful regularization as well as promising estimates of predictive uncertainty that are able to improve a systematic trading strategy. However, Markov chain Monte Carlo suffers from an extreme computational cost in the high-dimensional setting of neural networks. / Neural networks are powerful tools for modelling complex non-linear mappings, but they often suffer from overfitting and provide no measure of uncertainty in their predictions. Bayesian techniques have been proposed to remedy these problems, since they both have a regularizing effect and carry an inherent measure of uncertainty through the posterior predictive distribution. By quantifying predictive uncertainty, we attempt to improve a systematic trading strategy by scaling the model's positions with the estimated uncertainty. Exact Bayesian inference is usually impossible, and approximate methods must be used. To this end, this degree project compares dropout, variational inference and Markov chain Monte Carlo. 
The results indicate that both dropout and variational inference are powerful regularization techniques, but that their predictive uncertainties cannot be used to improve a systematic trading strategy. Markov chain Monte Carlo gives a powerful regularizing effect as well as promising estimates of uncertainty that can be used to improve a systematic trading strategy. However, Markov chain Monte Carlo suffers from an enormous computational complexity in a problem as high-dimensional as neural networks.
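
One simple way to read "scaling positions with uncertainty" is to size a position by the predicted mean return divided by the spread of the posterior predictive samples. The rule below is an assumption made for illustration only; the abstract does not give the thesis's actual scaling rule, and `scaled_position`, `max_position` and the sample values are hypothetical.

```python
import statistics


def scaled_position(predictive_samples, max_position=1.0):
    """Position proportional to predicted mean return over predictive standard deviation.

    predictive_samples: draws from the posterior predictive distribution of the
    next-period return (assumed to come from dropout, VI or MCMC).
    """
    mean = statistics.fmean(predictive_samples)
    std = statistics.stdev(predictive_samples)
    if std == 0.0:
        # No spread at all: take a full position in the predicted direction.
        return max_position if mean > 0 else -max_position if mean < 0 else 0.0
    raw = mean / std
    # Clip so that very confident forecasts cannot exceed the position limit.
    return max(-max_position, min(max_position, raw))


# Hypothetical predictive samples with the same mean but different spread.
confident = [0.010, 0.011, 0.009, 0.010]
uncertain = [0.05, -0.03, 0.02, 0.00]
print(scaled_position(confident), scaled_position(uncertain))
```

With the same predicted mean, the forecast with the wider predictive distribution receives the smaller position.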
267

Deep Scenario Generation of Financial Markets / Djup scenario generering av finansiella marknader

Carlsson, Filip, Lindgren, Philip January 2020
The goal of this thesis is to explore a new clustering algorithm, VAE-Clustering, and examine whether it can be applied to find differences in the distribution of stock returns and to augment the distribution of a current portfolio of stocks to see how it performs in different market conditions. The VAE-clustering method is, as mentioned, a newly introduced method that has not been widely tested, especially not on time series. The first step is therefore to see if and how well the clustering works. We first apply the algorithm to a dataset containing monthly time series of the power demand in Italy. The purpose of this part is to focus on how well the method works technically. Once the model works well and generates proper results with the Italian Power Demand data, we move forward and apply the model to stock return data. In the latter application we are unable to find meaningful clusters and are therefore unable to move forward towards the goal of the thesis. The results show that the VAE-clustering method is applicable to time series. The power demand has clear differences from season to season, and the model can successfully identify those differences. When it comes to the financial data, we hoped that the model would be able to find different market regimes based on time periods. The model is, however, not able to distinguish different time periods from each other. We therefore conclude that the VAE-clustering method is applicable to time series data, but that the structure and setting of the financial data in this thesis make it too hard to find meaningful clusters. The major finding is that the VAE-clustering method can be applied to time series. We highly encourage further research to find out whether the method can be used successfully on financial data in settings other than those tested in this thesis. 
/ The purpose of this thesis is to explore a new clustering algorithm, VAE-Clustering, and to investigate whether it can be applied to find differences in the distribution of stock returns and to alter the distribution of a current stock portfolio and see how it performs under different market conditions. The VAE-clustering method is, as mentioned, a newly introduced method that has not been widely tested, especially not on time series. The first step is therefore to see whether and how well the clustering works. We first apply the algorithm to a dataset containing monthly time series of the power demand in Italy. The purpose of this part is to focus on how well the method works technically. Once the model works well and gives satisfactory results, we move on and apply the model to stock return data. In the latter application we cannot find meaningful clusters and therefore cannot proceed towards the goal, which was to simulate different markets and see how a current portfolio performs under different market regimes. The results show that the VAE-clustering method is well applicable to time series. The demand for electricity differs clearly from season to season, and the model can successfully identify these differences. As for the financial data, we hoped that the model would be able to find different market regimes based on time periods. The model cannot, however, distinguish different time periods from each other. We therefore conclude that the VAE-clustering method is applicable to time series data, but that the structure of the financial data examined in this thesis makes it hard to find meaningful clusters. The most important finding is that the VAE-clustering method can be applied to time series. We encourage further research to find out whether the method can be used successfully on financial data in forms other than those tested in this thesis.
268

Modelling approach and avoidance behaviour : A deep learning approach to understand the human olfactory system / Modellering av beteende för närmande och frånstötning : En djupinlärningsapproach för att förstå det mänskliga luktsystemet

Nordén, Frans January 2021
In this thesis we examine the question of whether it is possible to model approach and avoidance behaviour with probabilistic machine learning. The results from this project will primarily aid our collective understanding of human existence. Secondly, they will extend the knowledge of probabilistic machine learning in the neuroscience domain. We do this by building a Variational Recurrent Neural Network (VRNN) that is trained on electroencephalography (EEG) data from participants who are subjected to odours of varying pleasantness. The pleasantness of the odours is used to divide the participants into two classes based on their self-reported experience. This data is used to train the VRNN. The performance of the VRNN is evaluated by how well we are able to reconstruct the original data from a low-dimensional latent representation. In this task the model performs at a level similar to related works. We further investigate how changes in the latent space affect the reconstructed data. Despite being disentangled, the latent variables are hard to interpret. Furthermore, we try to classify and cluster the latent space as either approach or avoidance behaviour with a Support Vector Machine and Uniform Manifold Approximation. The classification results are only slightly better than random, indicating that the learned latent space is not suitable for the task. This is most likely because the patterns that make up approach and avoidance behaviour are seen as noise by the VRNN, so that they are not accurately modelled. This is shown by the evidence that the frontal α-asymmetry that exists in the data is not reconstructed by the model. The conclusion is therefore that a VRNN is less suitable for modelling underlying behaviour from raw EEG data, due to the low signal-to-noise ratio. We instead suggest focusing on specific frequency ranges in specific regions when applying machine learning in this domain. 
/ This thesis addresses the question of whether it is possible to model approach and avoidance behaviour patterns using machine learning. The results of this project primarily aim to further the understanding of human existence. They also aim to broaden the understanding of how probabilistic machine learning can be used to explore such matters. We do this by building a Variational Recurrent Neural Network (VRNN) model that is trained on data from experiments in which people are exposed to different odours while their electroencephalography (EEG) is recorded. The participants are divided into two classes depending on their self-reported experience of the pleasantness of the odour. The machine learning model is evaluated by analysing how well it manages to reconstruct the data. It does this well. Furthermore, we investigate how changes in the model's latent space affect the reconstruction of the data. The results of that experiment are not clear. We then try to classify and cluster the latent space with respect to approach and avoidance behaviour using a Support Vector Machine and Uniform Manifold Approximation. The result of these experiments is that we do not manage to classify or cluster the latent space with respect to approach and avoidance behaviour better than chance. We argue that this is because the underlying patterns that give rise to these behaviours are seen as noise by the VRNN model and are therefore not modelled. This is shown by the fact that the frontal α-asymmetry that exists in the data is not reconstructed by the model. The conclusion is therefore that a VRNN is less suitable for modelling underlying behaviours from unprocessed EEG data, because of the low signal-to-noise ratio in the EEG data. We suggest focusing instead on specific frequency ranges in specific brain regions when machine learning is applied to EEG.
269

Neural Ordinary Differential Equations for Anomaly Detection / Neurala Ordinära Differentialekvationer för Anomalidetektion

Hlöðver Friðriksson, Jón, Ågren, Erik January 2021
Today, a large amount of time series data is being produced by a variety of devices such as smart speakers, cell phones and vehicles. This data can be used to make inferences and predictions. Neural network-based methods are among the most popular ways to model time series data. The field of neural networks is constantly expanding, and new methods and model variants are frequently introduced. In 2018, a new family of neural networks was introduced, namely Neural Ordinary Differential Equations (Neural ODEs). Neural ODEs have shown great potential in modelling the dynamics of temporal data. Here we present an investigation into using Neural Ordinary Differential Equations for anomaly detection. We tested two model variants, LSTM-ODE and latent-ODE. The former utilises a Neural ODE to model the continuous-time hidden state in between observations of an LSTM model; the latter is a variational autoencoder that uses the LSTM-ODE as the encoder and a Neural ODE as the decoder. Both models are suited for modelling sparsely and irregularly sampled time series data. Here, we test their ability to detect anomalies at various levels of sparsity and irregularity of the data. The models are compared to a Gaussian mixture model, a vanilla LSTM model and an LSTM variational autoencoder. Experimental results using the Human Activity Recognition dataset showed that the Neural ODE-based models obtained a better ability to detect anomalies than their LSTM-based counterparts. However, the computational training cost of the Neural ODE models was considerably higher than for the models that only utilise the LSTM architecture. The Neural ODE-based methods also consumed more memory than their LSTM counterparts. / Today, a large amount of time series data is produced by a variety of devices such as smart speakers, mobile phones and vehicles. This data can be used to draw conclusions and make predictions. 
Neural network-based methods are among the most popular ways to model time series data. Much research is under way in the field of neural networks, and new methods and model variants are introduced frequently. In 2018, a new family of neural networks was introduced, namely Neural Ordinary Differential Equations (Neural ODEs). Neural ODEs have shown great potential in modelling the dynamics of temporal data. Here we present an investigation into using neural ordinary differential equations for anomaly detection. We tested two different model variants, one called LSTM-ODE and another called latent-ODE. The former uses Neural ODEs to model the continuous hidden state between observations of an LSTM model; the latter is a variational autoencoder that uses LSTM-ODE as the encoder and a Neural ODE as the decoder. Both of these models are suited to modelling sparsely and irregularly sampled time series. Their ability to detect anomalies is therefore tested at different levels of sparsity and irregularity of the data. The models are compared with a Gaussian mixture model, a plain LSTM model and an LSTM variational autoencoder. Experimental results using the Human Activity Recognition (HAR) dataset showed that the Neural ODE-based models achieved a better ability to detect anomalies than their LSTM-based counterparts. However, training the Neural ODE-based models was considerably slower than training their LSTM-based counterparts. The Neural ODE-based methods also required more memory than their LSTM counterparts.
270

Robust spatio-temporal latent variable models

Christmas, Jacqueline January 2011
Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are widely-used mathematical models for decomposing multivariate data. They capture spatial relationships between variables, but ignore any temporal relationships that might exist between observations. Probabilistic PCA (PPCA) and Probabilistic CCA (ProbCCA) are versions of these two models that explain the statistical properties of the observed variables as linear mixtures of an alternative, hypothetical set of hidden, or latent, variables and explicitly model noise. Both the noise and the latent variables are assumed to be Gaussian distributed. This thesis introduces two new models, named PPCA-AR and ProbCCA-AR, that augment PPCA and ProbCCA respectively with autoregressive processes over the latent variables to additionally capture temporal relationships between the observations. To make PPCA-AR and ProbCCA-AR robust to outliers and able to model leptokurtic data, the Gaussian assumptions are replaced with infinite scale mixtures of Gaussians, using the Student-t distribution. Bayesian inference calculates posterior probability distributions for each of the parameter variables, from which we obtain a measure of confidence in the inference. It avoids the pitfalls associated with the maximum likelihood method: integrating over all possible values of the parameter variables guards against overfitting. For these new models the integrals required for exact Bayesian inference are intractable; instead a method of approximation, the variational Bayesian approach, is used. This enables the use of automatic relevance determination to estimate the model orders. PPCA-AR and ProbCCA-AR can be viewed as linear dynamical systems, so the forward-backward algorithm, also known as the Baum-Welch algorithm, is used as an efficient method for inferring the posterior distributions of the latent variables. 
The exact algorithm is tractable because Gaussian assumptions are made regarding the distribution of the latent variables. This thesis introduces a variational Bayesian forward-backward algorithm based on Student-t assumptions. The new models are demonstrated on synthetic datasets and on real remote sensing and EEG data.
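
The generative structure described above, an autoregressive latent process mixed linearly into the observed variables with heavy-tailed noise, can be sketched as follows. This is a simplified illustration with a single latent AR(1) variable; the function names, the mixing weights and all parameter values are assumptions and do not reproduce the PPCA-AR model or its variational inference. The Student-t draws use the Gaussian scale-mixture representation mentioned in the abstract.

```python
import math
import random


def student_t(rng, nu):
    """Draw from a Student-t distribution as a Gaussian scale mixture.

    u ~ Gamma(shape nu/2, scale 2/nu), then x | u ~ N(0, 1/u).
    """
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return rng.gauss(0.0, 1.0) / math.sqrt(u)


def simulate(n_steps, weights, ar_coeff=0.9, nu=4.0, noise_scale=0.1, seed=0):
    """One latent AR(1) process mixed linearly into len(weights) observed variables."""
    rng = random.Random(seed)
    z, latents, observations = 0.0, [], []
    for _ in range(n_steps):
        # Temporal structure: the latent variable depends on its previous value.
        z = ar_coeff * z + student_t(rng, nu)
        # Spatial structure: every observed variable is a noisy multiple of z.
        x = [w * z + noise_scale * student_t(rng, nu) for w in weights]
        latents.append(z)
        observations.append(x)
    return latents, observations


latents, observations = simulate(50, weights=[1.0, -0.5, 2.0])
print(len(latents), len(observations[0]))
```

Replacing the Gaussian draws with Student-t draws is what lets occasional large values appear in the simulated data without being treated as impossible by the model.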
