31 |
Statistical physics of disordered networks - Spin Glasses on hierarchical lattices and community inference on random graphs / Physique statistique des réseaux désordonnés - Verres de spin sur réseaux hiérarchiques et inférence de modules dans les graphes aléatoires. Decelle, Aurélien, 11 October 2011 (has links)
Cette thèse aborde des aspects fondamentaux et appliqués de la théorie des verres de spin et plus généralement des systèmes complexes. Les premiers modèles théoriques décrivant la transition vitreuse sont apparus dans les années 1970. Ceux-ci décrivaient les verres à l'aide d'interactions aléatoires. Il a fallu alors plusieurs années avant qu'une théorie de champ moyen pour ces systèmes soit comprise. De nos jours, il existe un grand nombre de modèles tombant dans la classe « champ moyen » et qui sont bien compris à la fois analytiquement, mais également numériquement grâce à des outils tels que le Monte-Carlo ou la méthode de la cavité. Par ailleurs, il est bien connu que le groupe de renormalisation a échoué jusqu'ici à prédire le comportement des observables critiques dans les verres hors champ moyen. Nous avons donc choisi d'étudier des systèmes en interaction à longue portée dont on ignore encore si la physique est identique à celle du champ moyen. Nous avons montré, dans une première partie, la facilité avec laquelle on peut décrire une transformation du groupe de renormalisation dans les systèmes ferromagnétiques en interaction à longue portée définis sur le réseau hiérarchique de Dyson. Dans un second temps, nous avons porté notre attention sur des modèles de verre de spin sur ce même réseau. Un début d'analyse de ces transformations dans l'espace réel est présenté, ainsi qu'une comparaison de la mesure de l'exposant critique nu par différentes méthodes. Si la transformation décrite semble prometteuse, il faut cependant noter qu'elle doit encore être améliorée afin d'être considérée comme une méthode valide pour notre système. Nous avons continué dans cette même direction en analysant un modèle d'énergies aléatoires utilisant toujours la topologie du réseau hiérarchique. Nous avons étudié numériquement ce système, dans lequel nous avons pu observer l'existence d'une transition de phase de type « crise entropique » tout à fait similaire à celle du REM de Derrida. Toutefois, notre modèle présente des différences importantes avec ce dernier, telles que le comportement non analytique de l'entropie à la transition, ainsi que l'émergence d'une « criticalité » dont la présence serait à confirmer par d'autres études. Nous montrons également, à l'aide de notre méthode numérique, comment la température critique de ce système peut être estimée de trois façons différentes. Dans une dernière partie, nous avons abordé des problèmes liés aux systèmes complexes. Il a été remarqué récemment que les modèles étudiés dans divers domaines, par exemple la physique, la biologie ou l'informatique, étaient très proches les uns des autres. Ceci est particulièrement vrai dans l'optimisation combinatoire, qui a en partie été étudiée par des méthodes de physique statistique. Ces méthodes, issues de la théorie des verres de spin et des verres structuraux, ont été très utilisées pour étudier les transitions de phase qui ont lieu dans ces systèmes, ainsi que pour inventer de nouveaux algorithmes pour ces modèles. Nous avons étudié le problème de l'inférence de modules dans les réseaux à l'aide de ces mêmes méthodes. Nous présentons une analyse de la détection des modules topologiques dans des réseaux aléatoires et démontrons la présence d'une transition de phase entre une région où ces modules sont indétectables et une région où ils sont détectables.
Par ailleurs, nous avons implémenté pour ces problèmes un algorithme utilisant Belief Propagation afin d'inférer les modules ainsi que d'apprendre leurs propriétés en ayant pour unique information la structure du réseau. Finalement, nous avons appliqué cet algorithme à des réseaux construits à partir de données réelles et discutons les développements à apporter à notre méthode. / This thesis presents fundamental and applied aspects of spin-glass theory and, more generally, of complex systems. The first theoretical models of spin glasses appeared during the 1970s. They modelled glassy systems using random interactions. It took several years before a mean-field theory of spin glasses was solved and understood. Nowadays there exist many different models falling into the class of mean-field models. They are well understood analytically, but also numerically, thanks to methods such as Monte Carlo simulation and the cavity method, which are now essential numerical tools for investigating spin glasses. At the same time, the renormalisation group technique, which has been very useful in the past to analyse second-order transitions, has failed in many disordered systems to predict the behaviour of critical observables in non-mean-field spin glasses. We have chosen to study long-range interacting systems for which we do not know whether the physics is identical to that of mean-field models. In a first part, we studied a ferromagnetic model on the Dyson hierarchical lattice. In this system with long-range interactions, we showed that it is easy to find a real-space renormalisation-group transformation to compute the critical exponents. In a second part we focused on a spin-glass model built on the same lattice. We made a first study in which a real-space transformation is described for this system, and we compare the estimates of the critical exponent nu obtained for this model by different methods. The renormalisation-group transformation gives some encouraging results but needs to be improved to become a more reliable method for this system. We then investigated a model of random energies using the same hierarchical topology. We studied this system numerically and observed the existence of a phase transition of the same type as the one present in the REM of Derrida. However, our model exhibits several features that differ from the REM: we found a non-analytical behaviour of the entropy at the transition, and our results suggest that critical properties, such as a diverging length scale, should occur; this last prediction has to be checked by a more direct measurement. With the numerical method we developed, we estimated the critical temperature using three different observables, all giving the same value. In the last part, I turned to problems related to complex systems. It has been noticed recently that models from different fields, such as physics, biology or computer science, are very close to each other. This is particularly true in combinatorial optimisation, which has been investigated using methods of statistical physics. These techniques, coming from the fields of spin glasses and structural glasses, were used to study phase transitions in such systems and to invent new algorithms. We studied the problem of inference and learning of modular structure in random graphs with these techniques.
We analysed the presence of topological clusters in some particular types of random graphs, and we showed that a phase transition occurs between a region where it is possible to detect clusters and a region where it is impossible. We also implemented a new algorithm using Belief Propagation to learn the properties of these clusters and to infer them in networks. We applied this algorithm to real-world graphs and discussed further developments of the method.
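The detectable/undetectable transition mentioned above can be made concrete with a small sketch of belief propagation for the symmetric stochastic block model. The two-group setting, the fixed affinities c_in/c_out and the helper name bp_sbm are illustrative assumptions; the thesis additionally learns the group sizes and affinities from the graph, which is omitted here.

```python
import numpy as np

def bp_sbm(adj, q=2, c_in=8.0, c_out=2.0, n_iter=50, seed=0):
    """Toy belief propagation for the symmetric stochastic block model.
    adj: neighbour lists of an undirected graph with n nodes.
    Returns the marginal group-membership probabilities of every node."""
    rng = np.random.default_rng(seed)
    n = len(adj)
    c = np.full((q, q), c_out)
    np.fill_diagonal(c, c_in)
    # one message per directed edge i -> j, initialised at random
    msg = {(i, j): rng.dirichlet(np.ones(q)) for i in range(n) for j in adj[i]}
    marg = np.full((n, q), 1.0 / q)
    for _ in range(n_iter):
        # external field approximating the contribution of all non-edges
        h = (c @ marg.sum(axis=0)) / n
        for i in range(n):
            for j in adj[i]:
                log_m = -h.copy()
                for k in adj[i]:
                    if k != j:
                        log_m += np.log(c @ msg[(k, i)] + 1e-300)
                m = np.exp(log_m - log_m.max())
                msg[(i, j)] = m / m.sum()
        for i in range(n):  # node marginals from all incoming messages
            log_b = -h.copy()
            for k in adj[i]:
                log_b += np.log(c @ msg[(k, i)] + 1e-300)
            b = np.exp(log_b - log_b.max())
            marg[i] = b / b.sum()
    return marg

# toy graph: 200 nodes, two planted groups, average intra/inter degrees 8 and 2
rng = np.random.default_rng(1)
n = 200
groups = np.arange(n) % 2
adj = [[] for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < (8.0 if groups[i] == groups[j] else 2.0) / n:
            adj[i].append(j)
            adj[j].append(i)
labels = bp_sbm(adj).argmax(axis=1)
print("fraction recovered (up to label swap):",
      max((labels == groups).mean(), (labels != groups).mean()))
```

In this symmetric setting, detection is expected to be possible only when |c_in - c_out| exceeds q times the square root of the average degree, which is the transition discussed in the abstract.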
|
32 |
Interpretable machine learning for additive manufacturing. Raquel De Souza Borges Ferreira (6386963), 10 June 2019 (has links)
<div>This dissertation addresses two significant issues in the effective application of machine learning algorithms and models for the physical and engineering sciences. The first is the broad challenge of automated modeling of data across different processes in a physical system. The second is the dilemma of obtaining insightful interpretations of the relationships between the inputs and outcome of a system as inferred from complex, black-box machine learning models.</div><div><br></div><div><b>Automated Geometric Shape Deviation Modeling for Additive Manufacturing Systems</b></div><div><b><br></b></div><div>Additive manufacturing systems possess an intrinsic capability for one-of-a-kind manufacturing of a vast variety of shapes across a wide spectrum of processes. One major issue in AM systems is geometric accuracy control for the inevitable shape deviations that arise in AM processes. Current effective approaches for shape deviation control in AM involve the specification of statistical or machine learning deviation models for additively manufactured products. However, this task is challenging due to the constraints on the number of test shapes that can be manufactured in practice, and limitations on the user effort that can be devoted to learning deviation models across different shape classes and processes in an AM system. We develop an automated, Bayesian neural network methodology for comprehensive shape deviation modeling in an AM system. A fundamental innovation in this machine learning method is our new and connectable neural network structures that facilitate the transfer of prior knowledge and models on deviations across different shape classes and AM processes. Several case studies on in-plane and out-of-plane deviations, regular and free-form shapes, and different settings of lurking variables serve to validate the power and broad scope of our methodology, and its potential to advance high-quality manufacturing in an AM system.</div><div><br></div><div><b>Interpretable Machine Learning</b></div><div><b><br></b></div><div>Machine learning algorithms and models constitute the dominant set of predictive methods for a wide range of complex, real-world processes. However, interpreting what such methods effectively infer from data is difficult in general. This is because their typically black-box nature offers limited ability to directly yield insights into the underlying relationships between inputs and the outcome for a process. We develop methodologies based on new predictive comparison estimands that effectively enable one to "mine" machine learning models, in the sense of (a) interpreting their inferred associations of inputs and/or functional forms of inputs with the outcome, (b) identifying the inputs that they effectively consider relevant, and (c) interpreting the inferred conditional and two-way associations of the inputs with the outcome. We establish Fisher consistent estimators, and their corresponding standard errors, for our new estimands under a condition on the inputs' distributions. The significance of our predictive comparison methodology is demonstrated with a wide range of simulation and case studies that involve Bayesian additive regression trees, neural networks, and support vector machines. 
Our extended study of interpretable machine learning for AM systems demonstrates how our method can contribute to smarter advanced manufacturing systems, especially as current machine learning methods for AM are lacking in their ability to yield meaningful engineering knowledge on AM processes. <br></div>
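As a rough illustration of what a predictive-comparison style of model "mining" can look like, the sketch below perturbs one input at a time in a fitted black-box model and averages the resulting change in prediction. The random forest, the helper name and the toy data are stand-ins; the dissertation's actual estimands, Fisher consistent estimators and standard errors are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def average_predictive_comparison(model, X, j, delta=1.0):
    """Average change in the model's prediction when input j is increased by
    `delta`, holding the other inputs at their observed values (illustrative only)."""
    X_hi = X.copy()
    X_hi[:, j] += delta
    return np.mean(model.predict(X_hi) - model.predict(X)) / delta

# toy process: y depends strongly on x0, weakly (nonlinearly) on x1, not at all on x2
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

black_box = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for j in range(3):
    print(f"input {j}: average predictive comparison = "
          f"{average_predictive_comparison(black_box, X, j):+.2f}")
```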
|
33 |
Bayesian Framework for Sparse Vector Recovery and Parameter Bounds with Application to Compressive Sensing. January 2019 (has links)
abstract: Signals compressed using classical compression methods can be acquired by brute force (i.e., by searching for non-zero entries component-wise). However, such sparse solutions require combinatorial searches with high computational cost. In this thesis, instead, two Bayesian approaches are considered to recover a sparse vector from underdetermined noisy measurements. The first is constructed using a Bernoulli-Gaussian (BG) prior distribution and is assumed to be the true generative model. The second is constructed using a Gamma-Normal (GN) prior distribution and is, therefore, a different (i.e., misspecified) model. To estimate the posterior distribution for the correctly specified scenario, an algorithm based on generalized approximate message passing (GAMP) is constructed, while an algorithm based on sparse Bayesian learning (SBL) is used for the misspecified scenario. Recovering a sparse signal within a Bayesian framework is one class of approaches to the sparse problem; all classes of algorithms aim to get around the high computational cost associated with combinatorial searches. Compressive sensing (CS) is the widely used term for this sparse recovery problem and its applications, such as magnetic resonance imaging (MRI), image acquisition in radar imaging, and facial recognition. In the CS literature, the target vector can be recovered either by optimizing an objective function using point estimation, or by recovering a distribution of the sparse vector using Bayesian estimation. Although the Bayesian framework provides an extra degree of freedom to assume a distribution that is directly applicable to the problem of interest, it is hard to find a theoretical guarantee of convergence. This limitation has led some researchers to use a non-Bayesian framework. This thesis tries to close this gap by proposing a Bayesian framework with a suggested theoretical bound for the assumed, not necessarily correct, distribution. In the simulation study, a general lower bound, the Bayesian Cramér-Rao bound (BCRB), is derived along with the misspecified Bayesian Cramér-Rao bound (MBCRB) for the GN model. Both bounds are validated using the mean square error (MSE) performance of the aforementioned algorithms. Also, a quantification of the performance in terms of gains versus losses is introduced as one main finding of this report. / Dissertation/Thesis / Masters Thesis Computer Engineering 2019
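For readers unfamiliar with SBL, the following is a minimal sketch of the classical evidence-maximization (EM) loop for recovering a sparse vector from noisy underdetermined measurements. The fixed noise variance, the direct matrix-inverse posterior update and the toy data are simplifications; this is neither the GAMP algorithm nor the misspecified-model analysis developed in the thesis.

```python
import numpy as np

def sbl(A, y, noise_var=1e-2, n_iter=100):
    """Basic sparse Bayesian learning (EM) for y = A x + noise.
    Each coefficient x_i has prior N(0, gamma_i); the gammas are learned by
    evidence maximization and shrink towards zero for irrelevant entries."""
    m, n = A.shape
    gamma = np.ones(n)
    for _ in range(n_iter):
        # posterior of x given the current hyperparameters
        Sigma = np.linalg.inv(A.T @ A / noise_var + np.diag(1.0 / gamma))
        mu = Sigma @ A.T @ y / noise_var
        # EM update of the per-coefficient prior variances (floored for stability)
        gamma = np.maximum(mu ** 2 + np.diag(Sigma), 1e-12)
    return mu, gamma

# toy compressive-sensing example: 20 measurements of a 50-dim, 3-sparse vector
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 50)) / np.sqrt(20)
x_true = np.zeros(50)
x_true[[4, 17, 33]] = [1.0, -2.0, 1.5]
y = A @ x_true + 0.01 * rng.normal(size=20)
x_hat, _ = sbl(A, y)
print("indices of the largest recovered entries:", np.sort(np.argsort(np.abs(x_hat))[-3:]))
```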
|
34 |
Multi-risk modeling for improved agriculture decision-support: predicting crop yield variability and gaps due to climate variability, extreme events, and disease. Lu, Weixun, 15 September 2020 (has links)
The agriculture sectors in Canada are highly vulnerable to a wide range of inter-related weather risks linked to seasonal climate variability (e.g., the El Niño Southern Oscillation (ENSO)), short-term extreme weather events (e.g., heatwaves), and emergent disease (e.g., grape powdery mildew). All of these weather-related risks can cause severe losses in agricultural crop yield and crop quality, as Canada grows a wide range of farm products and changing weather conditions largely drive farming practices. This dissertation presents three machine learning-based statistical models to assess weather risks in Canadian agricultural regions and to provide reliable risk forecasting to improve the decision-making of Canadian agricultural producers in farming practices. The first study presents a multi-scale, cluster-based Principal Component Analysis (PCA) approach to assess the potential seasonal impacts of ENSO on spring wheat and barley in agricultural census regions across the Canadian prairies. Model prediction skills for annual wheat and barley yield have been examined at multiple scales using spatial clustering approaches. The 'best' spatial models were used to define spatial patterns of ENSO forcing on wheat and barley yields. A comparison of our spatial model to non-spatial models shows that spatial clustering and ENSO forcing improve prediction skill in forecasting future cereal crop production. The second study presents a copula-Bayesian network approach to assess the impact of extreme high-temperature events (heatwave events) on the development of regional crops across Canadian agricultural regions at the eco-district scale. Relevant weather variables and heatwave variables during heatwave periods have been identified and used as input variables for model learning. Both a copula-Bayesian network and a Gaussian-based network modeling approach are evaluated and inter-compared. The copula approach based on 'vine copulas' generated the most accurate predictions of heatwave occurrence as a driver of crop heat stress. The last study presents a stochastic, hybrid-Bayesian machine-learning approach to explore the complex causal relationships between weather, pathogen, and host for grape powdery mildew in an experimental farm in Quebec, Canada. This study explores a high-performance network model for daily disease risk forecasting, using development factors of the pathogen and host estimated from recorded daily weather variables. A fungicide strategy for disease control is presented using the model outputs and forecasted future weather variability. The dissertation findings are beneficial to Canada's agricultural sector. The inter-related weather risks explored by the three separate studies at multiple scales provide a better understanding of the interactions between changing weather conditions, extreme weather, and crop production. The research showcases new insights, methods, and tools for minimizing risk in agricultural decision-making. / Graduate / 2021-08-19
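A schematic of the first study's pipeline, clustering regions by their yield behaviour, summarising each cluster with a leading principal component and relating it to an ENSO index, might look as follows. The synthetic yields, the fixed number of clusters and the plain linear regression are illustrative assumptions, not the dissertation's census data or its model-selection procedure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_years, n_regions = 30, 40

# synthetic stand-ins: an ENSO index and regional yield anomalies that partly
# respond to it (real inputs would be census-region yields and an ONI/SOI series)
enso = rng.normal(size=n_years)
sensitivity = rng.normal(scale=0.5, size=n_regions)   # per-region ENSO response
yields = np.outer(enso, sensitivity) + rng.normal(scale=0.8, size=(n_years, n_regions))

# 1) group regions with similar yield behaviour (a stand-in for multi-scale clustering)
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(yields.T)

# 2) within each cluster, summarise the yields with a leading principal component
#    and regress it on the ENSO index
for c in range(4):
    block = yields[:, clusters == c]
    pc1 = PCA(n_components=1).fit_transform(block)[:, 0]
    r2 = LinearRegression().fit(enso[:, None], pc1).score(enso[:, None], pc1)
    print(f"cluster {c}: {block.shape[1]:2d} regions, ENSO explains R^2 = {r2:.2f} of PC1")
```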
|
35 |
Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond. Maroñas Molano, Juan, 28 February 2022 (has links)
[ES] Esta tesis se enmarca en la intersección entre las técnicas modernas de Machine Learning, como las Redes Neuronales Profundas, y el modelado probabilístico confiable. En muchas aplicaciones, no solo nos importa la predicción hecha por un modelo (por ejemplo esta imagen de pulmón presenta cáncer) sino también la confianza que tiene el modelo para hacer esta predicción (por ejemplo esta imagen de pulmón presenta cáncer con 67% probabilidad). En tales aplicaciones, el modelo ayuda al tomador de decisiones (en este caso un médico) a tomar la decisión final. Como consecuencia, es necesario que las probabilidades proporcionadas por un modelo reflejen las proporciones reales presentes en el conjunto al que se ha asignado dichas probabilidades; de lo contrario, el modelo es inútil en la práctica. Cuando esto sucede, decimos que un modelo está perfectamente calibrado.
En esta tesis se exploran tres vías para proveer modelos más calibrados. Primero se muestra cómo calibrar modelos de manera implícita, que son descalibrados por técnicas de aumentación de datos. Se introduce una función de coste que resuelve esta descalibración tomando como partida las ideas derivadas de la toma de decisiones con la regla de Bayes. Segundo, se muestra cómo calibrar modelos utilizando una etapa de post calibración implementada con una red neuronal Bayesiana. Finalmente, y en base a las limitaciones estudiadas en la red neuronal Bayesiana, que hipotetizamos que se basan en un prior mal especificado, se introduce un nuevo proceso estocástico que sirve como distribución a priori en un problema de inferencia Bayesiana. / [CA] Aquesta tesi s'emmarca en la intersecció entre les tècniques modernes de Machine Learning, com ara les Xarxes Neuronals Profundes, i el modelatge probabilístic fiable. En moltes aplicacions, no només ens importa la predicció feta per un model (per exemple aquesta imatge de pulmó presenta càncer) sinó també la confiança que té el model per fer aquesta predicció (per exemple aquesta imatge de pulmó presenta càncer amb 67% probabilitat). En aquestes aplicacions, el model ajuda el prenedor de decisions (en aquest cas un metge) a prendre la decisió final. Com a conseqüència, cal que les probabilitats proporcionades per un model reflecteixin les proporcions reals presents en el conjunt a què s'han assignat aquestes probabilitats; altrament, el model és inútil a la pràctica. Quan això passa, diem que un model està perfectament calibrat.
En aquesta tesi s'exploren tres vies per proveir models més calibrats. Primer es mostra com calibrar models de manera implícita, que són descalibrats per tècniques d'augmentació de dades. S'introdueix una funció de cost que resol aquesta descalibració prenent com a partida les idees derivades de la presa de decisions amb la regla de Bayes. Segon, es mostra com calibrar models utilitzant una etapa de post calibratge implementada amb una xarxa neuronal Bayesiana. Finalment, i segons les limitacions estudiades a la xarxa neuronal Bayesiana, que es basen en un prior mal especificat, s'introdueix un nou procés estocàstic que serveix com a distribució a priori en un problema d'inferència Bayesiana. / [EN] This thesis is framed at the intersection between modern Machine Learning techniques, such as Deep Neural Networks, and reliable probabilistic modeling. In many machine learning applications, we do not only care about the prediction made by a model (e.g. this lung image presents cancer) but also about how confident the model is in making this prediction (e.g. this lung image presents cancer with 67% probability). In such applications, the model assists the decision-maker (in this case a doctor) towards making the final decision. As a consequence, one needs the probabilities provided by a model to reflect the true underlying set of outcomes; otherwise the model is useless in practice. When this happens, we say that a model is perfectly calibrated.
In this thesis, three ways are explored to provide more calibrated models. First, it is shown how to implicitly calibrate models that are miscalibrated by data augmentation techniques. A cost function is introduced that resolves this miscalibration, taking as a starting point ideas derived from decision making with Bayes' rule. Second, it is shown how to calibrate models using a post-calibration stage implemented with a Bayesian neural network. Finally, based on the limitations studied in the Bayesian neural network, which we hypothesize stem from a misspecified prior, a new stochastic process is introduced that serves as a prior distribution in a Bayesian inference problem. / Maroñas Molano, J. (2022). Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181582
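To make the notion of calibration concrete, a common way to quantify it is the expected calibration error (ECE), sketched below for a binary correctness indicator. This generic metric is given only as an illustration of what "perfectly calibrated" means in practice; it is not the cost function, the Bayesian post-calibration stage or the stochastic-process prior proposed in the thesis.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| over equally spaced confidence bins,
    weighted by the fraction of predictions falling in each bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# toy check: overconfident predictions produce a larger ECE than honest ones
rng = np.random.default_rng(0)
true_prob = rng.uniform(0.5, 1.0, size=5000)        # true P(correct) of each prediction
correct = rng.uniform(size=5000) < true_prob
print("honest confidences   :", expected_calibration_error(true_prob, correct))
print("overconfident (+0.15):", expected_calibration_error(np.minimum(true_prob + 0.15, 1.0), correct))
```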
|
36 |
Sparse Bayesian Learning For Joint Channel Estimation And Data Detection In OFDM Systems. Prasad, Ranjitha, January 2015 (has links) (PDF)
Bayesian approaches for sparse signal recovery have enjoyed a long-standing history in the signal processing and machine learning literature. Among the Bayesian techniques, the expectation-maximization based Sparse Bayesian Learning (SBL) approach is an iterative procedure with guaranteed convergence to a local optimum, which uses a parameterized prior that encourages sparsity under an evidence maximization framework. SBL has been successfully employed in a wide range of applications ranging from image processing to communications. In this thesis, we propose novel, efficient and low-complexity SBL-based algorithms that exploit structured sparsity in the presence of fully/partially known measurement matrices. We apply the proposed algorithms to the problem of channel estimation and data detection in Orthogonal Frequency Division Multiplexing (OFDM) systems. Further, we derive Cramér-Rao type lower bounds (CRB) for the single and multiple measurement vector SBL problem of estimating compressible vectors and their prior distribution parameters. The main contributions of the thesis are as follows:
• We derive Hybrid, Bayesian and Marginalized Cramér-Rao lower bounds for the problem of estimating compressible vectors drawn from a Student-t prior distribution. We derive CRBs that encompass the deterministic or random nature of the unknown parameters of the prior distribution and the regression noise variance. We use the derived bounds to uncover the relationship between the compressibility and the Mean Square Error (MSE) in the estimates. Through simulations, we demonstrate the dependence of the MSE performance of SBL based estimators on the compressibility of the vector.
• OFDM is a well-known multi-carrier modulation technique that provides high spectral efficiency and resilience to multi-path distortion of the wireless channel.
It is well-known that the impulse response of a wideband wireless channel is approximately sparse, in the sense that it has a small number of significant components relative to the channel delay spread. In this thesis, we consider the estimation of the unknown channel coefficients and their support in SISO-OFDM systems using an SBL framework. We propose novel pilot-only and joint channel estimation and data detection algorithms in block-fading and time-varying scenarios. In the latter case, we use a first-order auto-regressive model for the time-variations, and propose recursive, low-complexity Kalman filtering based algorithms for channel estimation (a minimal sketch of this tracking step is given after this list). Monte Carlo simulations illustrate the efficacy of the proposed techniques in terms of the MSE and coded bit error rate performance.
• Multiple Input Multiple Output (MIMO) combined with OFDM harnesses the inherent advantages of OFDM along with the diversity and multiplexing advantages of a MIMO system. The impulse responses of the wireless channels between the Nt transmit and Nr receive antennas of a MIMO-OFDM system are group approximately sparse (ga-sparse), i.e., the Nt Nr channels have a small number of significant paths relative to the channel delay spread, and the time-lags of the significant paths between transmit and receive antenna pairs coincide. Often, wireless channels are also group approximately-cluster sparse (ga-csparse), i.e., every ga-sparse channel consists of clusters, where a few clusters have all strong components while most clusters have all weak components. In this thesis, we cast the problem of estimating the ga-sparse and ga-csparse block-fading and time-varying channels in a multiple measurement SBL framework. We propose a bouquet of novel algorithms for MIMO-OFDM systems that generalize the algorithms proposed in the context of SISO-OFDM systems. The efficacy of the proposed techniques is demonstrated in terms of MSE and coded bit error rate performance.
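For the time-varying SISO-OFDM case referenced in the list above, the sketch below shows a Kalman recursion for a channel impulse response that evolves as a first-order autoregressive process and is observed through pilot subcarriers. The pilot spacing, AR coefficient and noise levels are illustrative assumptions, and the SBL stage that would supply the sparsity information in the thesis is omitted.

```python
import numpy as np

def kalman_channel_tracking(y_pilots, Phi, rho, q_var, noise_var):
    """Track a time-varying channel impulse response h_t that follows
    h_t = rho * h_{t-1} + w_t, w_t ~ CN(0, q_var I), from per-symbol pilot
    observations y_t = Phi h_t + noise, noise ~ CN(0, noise_var I).
    Phi is the (n_pilots x L) partial DFT matrix at the pilot subcarriers."""
    L = Phi.shape[1]
    h = np.zeros(L, dtype=complex)
    P = np.eye(L, dtype=complex)
    estimates = []
    for y in y_pilots:
        h = rho * h                                   # predict
        P = (rho ** 2) * P + q_var * np.eye(L)
        S = Phi @ P @ Phi.conj().T + noise_var * np.eye(Phi.shape[0])
        K = P @ Phi.conj().T @ np.linalg.inv(S)       # Kalman gain
        h = h + K @ (y - Phi @ h)                     # update with this symbol's pilots
        P = P - K @ Phi @ P
        estimates.append(h.copy())
    return np.array(estimates)

# illustrative setup: 64-subcarrier OFDM symbol, pilots on every 8th tone, 8-tap channel
N, L, rho = 64, 8, 0.998
pilots = np.arange(0, N, 8)
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / N)
Phi = F[pilots, :]

rng = np.random.default_rng(0)
h_true = (rng.normal(size=L) + 1j * rng.normal(size=L)) / np.sqrt(2 * L)
ys = []
for _ in range(20):
    w = (rng.normal(size=L) + 1j * rng.normal(size=L)) * np.sqrt((1 - rho ** 2) / (2 * L))
    h_true = rho * h_true + w
    noise = 0.05 * (rng.normal(size=len(pilots)) + 1j * rng.normal(size=len(pilots)))
    ys.append(Phi @ h_true + noise)

est = kalman_channel_tracking(ys, Phi, rho, q_var=(1 - rho ** 2) / L, noise_var=0.005)
print("final-symbol tap MSE:", np.mean(np.abs(est[-1] - h_true) ** 2))
```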
|
37 |
A Bayesian learning approach to inconsistency identification in model-based systems engineering. Herzig, Sebastian J. I., 08 June 2015 (has links)
Designing and developing complex engineering systems is a collaborative effort. In Model-Based Systems Engineering (MBSE), this collaboration is supported through the use of formal, computer-interpretable models, allowing stakeholders to address concerns using well-defined modeling languages. However, because concerns cannot be separated completely, implicit relationships and dependencies among the various models describing a system are unavoidable. Given that models are typically co-evolved and only weakly integrated, inconsistencies in the agglomeration of the information and knowledge encoded in the various models are frequently observed. The challenge is to identify such inconsistencies in an automated fashion. In this research, a probabilistic (Bayesian) approach to abductive reasoning about the existence of specific types of inconsistencies and, in the process, semantic overlaps (relationships and dependencies) in sets of heterogeneous models is presented. A prior belief about the manifestation of a particular type of inconsistency is updated with evidence, which is collected by extracting specific features from the models by means of pattern matching. Inference results are then utilized to improve future predictions by means of automated learning. The effectiveness and efficiency of the approach is evaluated through a theoretical complexity analysis of the underlying algorithms, and through application to a case study. Insights gained from the experiments conducted, as well as the results from a comparison to the state-of-the-art have demonstrated that the proposed method is a significant improvement over the status quo of inconsistency identification in MBSE.
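A toy version of the core probabilistic step, a prior belief about one inconsistency type revised with independent feature evidence gathered by pattern matching, is sketched below. The feature names, probabilities and the naive independence assumption are invented for illustration; the actual evidence model, the automated learning of its parameters and the complexity analysis are in the thesis.

```python
def posterior_inconsistency(prior, likelihoods, observed):
    """Naive-Bayes style update of P(inconsistency | observed features).
    likelihoods[f] = (P(f | inconsistency), P(f | no inconsistency));
    observed[f] is True if pattern matching found feature f in the models."""
    p1, p0 = prior, 1.0 - prior
    for f, (l1, l0) in likelihoods.items():
        if observed[f]:
            p1, p0 = p1 * l1, p0 * l0
        else:
            p1, p0 = p1 * (1.0 - l1), p0 * (1.0 - l0)
    return p1 / (p1 + p0)

# hypothetical features for a "duplicated model element" type of inconsistency
likelihoods = {
    "similar_names":      (0.90, 0.20),
    "same_stereotype":    (0.80, 0.40),
    "overlapping_values": (0.70, 0.05),
}
observed = {"similar_names": True, "same_stereotype": True, "overlapping_values": False}
print(posterior_inconsistency(prior=0.05, likelihoods=likelihoods, observed=observed))
```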
|
38 |
Probabilistic models in noisy environments and their application to a visual prosthesis for the blind. Archambeau, Cédric, 26 September 2005 (has links)
In recent years, probabilistic models have become fundamental techniques in machine learning. They are successfully applied in various engineering problems, such as robotics, biometrics, brain-computer interfaces or artificial vision, and will gain in importance in the near future. This work deals with the difficult, but common, situation where the data is either very noisy or scarce compared to the complexity of the process to be modelled. We focus on latent variable models, which can be formalized as probabilistic graphical models and learned by the expectation-maximization algorithm or its variants (e.g., variational Bayes).<br>
After having carefully studied a non-exhaustive list of multivariate kernel density estimators, we established that in most applications locally adaptive estimators should be preferred. Unfortunately, these methods are usually sensitive to outliers and often have too many parameters to set. Therefore, we focus on finite mixture models, which do not suffer from these drawbacks provided some structural modifications are made.<br>
Two questions are central in this dissertation: (i) how to make mixture models robust to noise, i.e. deal efficiently with outliers, and (ii) how to exploit side-channel information, i.e. additional information intrinsic to the data. In order to tackle the first question, we extend the training algorithms of the popular Gaussian mixture models to Student-t mixture models. The Student-t distribution can be viewed as a heavy-tailed alternative to the Gaussian distribution, the robustness being tuned by an extra parameter, the degrees of freedom. Furthermore, we introduce a new variational Bayesian algorithm for learning Bayesian Student-t mixture models. This algorithm leads to very robust density estimators and clustering. To address the second question, we introduce manifold-constrained mixture models. This new technique exploits the information that the data lives on a manifold of lower dimension than the dimension of the feature space. Taking the implicit geometrical data arrangement into account results in better generalization on unseen data.<br>
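A condensed EM sketch for a Student-t mixture with fixed degrees of freedom illustrates why the heavy tails yield robustness: each point receives a weight u that shrinks for outliers, so they barely influence the estimated means and covariances. The fixed degrees of freedom and plain EM updates are simplifications of the variational Bayesian treatment introduced in this work.

```python
import numpy as np
from scipy.stats import multivariate_t

def student_t_mixture_em(X, k=2, df=3.0, n_iter=50, seed=0):
    """EM for a mixture of multivariate Student-t components with fixed degrees
    of freedom `df`. Outliers get small weights u and are effectively down-weighted."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, k, replace=False)].astype(float)
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities r (per point and component)
        pdf = np.column_stack([multivariate_t(means[j], covs[j], df).pdf(X) for j in range(k)])
        r = weights * pdf
        r /= r.sum(axis=1, keepdims=True)
        for j in range(k):
            # robustness weights u from the current component parameters
            diff = X - means[j]
            maha = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(covs[j]), diff)
            u = (df + d) / (df + maha)
            # M-step for component j
            w = r[:, j] * u
            means[j] = (w[:, None] * X).sum(axis=0) / w.sum()
            diff = X - means[j]
            covs[j] = (w[:, None] * diff).T @ diff / r[:, j].sum() + 1e-6 * np.eye(d)
        weights = r.mean(axis=0)
    return weights, means, covs

# toy data: two clean clusters plus gross outliers
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(6, 1, (100, 2)),
               rng.uniform(-30, 30, (10, 2))])
print(student_t_mixture_em(X)[1])   # means should stay near (0, 0) and (6, 6)
```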
Finally, we show that the latent variable framework used for learning mixture models can be extended to construct probabilistic regularization networks, such as the Relevance Vector Machines. Subsequently, we make use of these methods in the context of an optic nerve visual prosthesis to restore partial vision to blind people whose optic nerve is still functional. Although visual sensations can be induced electrically in the visual field of blind patients, the coding scheme of the visual information along the visual pathways is poorly known. Therefore, we use probabilistic models to link the stimulation parameters to the features of the visual perceptions. Both black-box and grey-box models are considered. The grey-box models take advantage of the known neurophysiological information and are more instructive to medical doctors and psychologists.<br>
|
39 |
New Methods for Learning from Heterogeneous and Strategic Agents. Divya, Padmanabhan, January 2017 (has links) (PDF)
1 Introduction
In this doctoral thesis, we address several representative problems that arise in the context of learning from multiple heterogeneous agents. These problems are relevant to many modern applications such as crowdsourcing and internet advertising. In scenarios such as crowdsourcing, there is a planner who is interested in learning a task and a set of noisy agents provide the training data for this learning task. Any learning algorithm making use of the data provided by these noisy agents must account for their noise levels. The noise levels of the agents are unknown to the planner, leading to a non-trivial difficulty. Further, the agents are heterogeneous as they differ in terms of their noise levels. A key challenge in such settings is to learn the noise levels of the agents while simultaneously learning the underlying model. Another challenge arises when the agents are strategic. For example, when the agents are required to perform a task, they could be strategic on the efforts they put in. As another example, when required to report their costs incurred towards performing the task, the agents could be strategic and may not report the costs truthfully. In general, the performance of the learning algorithms could be severely affected if the information elicited from the agents is incorrect. We address the above challenges that arise in the following representative learning problems.
Multi-label Classification from Heterogeneous Noisy Agents
Multi-label classification is a well-known supervised machine learning problem where each instance is associated with multiple classes. Since several labels can be assigned to a single instance, one of the key challenges in this problem is to learn the correlations between the classes. We first assume labels from a perfect source and propose a novel topic model called Multi-Label Presence-Absence Latent Dirichlet Allocation (ML-PA-LDA). In the current day scenario, a natural source for procuring the training dataset is through mining user-generated content or directly through users in a crowdsourcing platform. In the more practical scenario of crowdsourcing, an additional challenge arises as the labels of the training instances are provided by noisy, heterogeneous crowd-workers with unknown qualities. With this as the motivation, we further adapt our topic model to the scenario where the labels are provided by multiple noisy sources and refer to this model as ML-PA-LDA-MNS (ML-PA-LDA with Multiple Noisy Sources). With experiments on standard datasets, we show that the proposed models achieve superior performance over existing methods.
Active Linear Regression with Heterogeneous, Noisy and Strategic Agents
In this work, we study the problem of training a linear regression model by procuring labels from multiple noisy agents or crowd annotators, under a budget constraint. We propose a Bayesian model for linear regression from multiple noisy sources and use variational inference for parameter estimation. When labels are sought from agents, it is important to minimize the number of labels procured as every call to an agent incurs a cost. Towards this, we adopt an active learning approach. In this specific context, we prove the equivalence of well-studied criteria of active learning such as entropy minimization and expected error reduction. For the purpose of annotator selection in active learning, we observe a useful connection with the multi-armed bandit framework. Due to the nature of the distribution of the rewards on the arms, we resort to the Robust Upper Confidence Bound (UCB) scheme with truncated empirical mean estimator to solve the annotator selection problem. This yields provable guarantees on the regret. We apply our model to the scenario where annotators are strategic and design suitable incentives to induce them to put in their best efforts.
Ranking with Heterogeneous Strategic Agents
We look at the problem where a planner must rank multiple strategic agents, a problem that has many applications including sponsored search auctions (SSA). Stochastic multi-armed bandit (MAB) mechanisms have been used in the literature to solve this problem. Existing stochastic MAB mechanisms with a deterministic payment rule, proposed in the literature, necessarily suffer a regret of Ω(T^{2/3}), where T is the number of time steps. This happens because these mechanisms address the worst case scenario where the means of the agents' stochastic rewards are separated by a very small amount that depends on T. We however take a detour and allow the planner to indicate the resolution, Δ, with which the agents must be distinguished. This immediately leads us to introduce the notion of Δ-regret. We propose a dominant strategy incentive compatible (DSIC) and individually rational (IR), deterministic MAB mechanism, based on ideas from the Upper Confidence Bound (UCB) family of MAB algorithms. The proposed mechanism, Δ-UCB, achieves a Δ-regret of O(log T). We first establish the results for single-slot SSA and then non-trivially extend the results to the case of multi-slot SSA.
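For reference, the sketch below shows the bare UCB arm-selection loop from which the proposed mechanism is built; the Δ-resolution logic, the deterministic payment rule and the strategic behaviour of the agents discussed above are deliberately left out.

```python
import numpy as np

def ucb(arm_means, horizon, seed=0):
    """Standard UCB1: pull each arm once, then always pull the arm maximising
    empirical mean + sqrt(2 ln t / n_pulls). Rewards are Bernoulli (e.g. ad clicks)."""
    rng = np.random.default_rng(seed)
    k = len(arm_means)
    pulls = np.zeros(k)
    sums = np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1                       # initialisation round
        else:
            bonus = np.sqrt(2.0 * np.log(t) / pulls)
            arm = int(np.argmax(sums / pulls + bonus))
        reward = rng.binomial(1, arm_means[arm])
        pulls[arm] += 1
        sums[arm] += reward
    return pulls

print(ucb(arm_means=[0.45, 0.50, 0.55], horizon=20000))  # most pulls go to the best arm
```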
|
40 |
Apprentissage statistique pour la personnalisation de modèles cardiaques à partir de données d'imagerie / Statistical learning for image-based personalization of cardiac models. Le Folgoc, Loïc, 27 November 2015 (has links)
Cette thèse porte sur un problème de calibration d'un modèle électromécanique de cœur, personnalisé à partir de données d'imagerie médicale 3D+t ; et sur celui - en amont - de suivi du mouvement cardiaque. A cette fin, nous adoptons une méthodologie fondée sur l'apprentissage statistique. Pour la calibration du modèle mécanique, nous introduisons une méthode efficace mêlant apprentissage automatique et une description statistique originale du mouvement cardiaque utilisant la représentation des courants 3D+t. Notre approche repose sur la construction d'un modèle statistique réduit reliant l'espace des paramètres mécaniques à celui du mouvement cardiaque. L'extraction du mouvement à partir d'images médicales avec quantification d'incertitude apparaît essentielle pour cette calibration, et constitue l'objet de la seconde partie de cette thèse. Plus généralement, nous développons un modèle bayésien parcimonieux pour le problème de recalage d'images médicales. Notre contribution est triple et porte sur un modèle étendu de similarité entre images, sur l'ajustement automatique des paramètres du recalage et sur la quantification de l'incertitude. Nous proposons une technique rapide d'inférence gloutonne, applicable à des données cliniques 4D. Enfin, nous nous intéressons de plus près à la qualité des estimations d'incertitude fournies par le modèle. Nous comparons les prédictions du schéma d'inférence gloutonne avec celles données par une procédure d'inférence fidèle au modèle, que nous développons sur la base de techniques MCMC. Nous approfondissons les propriétés théoriques et empiriques du modèle bayésien parcimonieux et des deux schémas d'inférence / This thesis focuses on the calibration of an electromechanical model of the heart from patient-specific, image-based data; and on the related task of extracting the cardiac motion from 4D images. Long-term perspectives for personalized computer simulation of the cardiac function include aid to the diagnosis, aid to the planning of therapy and prevention of risks. To this end, we explore tools and possibilities offered by statistical learning. To personalize cardiac mechanics, we introduce an efficient framework coupling machine learning and an original statistical representation of shape & motion based on 3D+t currents. The method relies on a reduced mapping between the space of mechanical parameters and the space of cardiac motion. The second focus of the thesis is on cardiac motion tracking, a key processing step in the calibration pipeline, with an emphasis on quantification of uncertainty. We develop a generic sparse Bayesian model of image registration with three main contributions: an extended image similarity term, the automated tuning of registration parameters and uncertainty quantification. We propose an approximate inference scheme that is tractable on 4D clinical data. Finally, we wish to evaluate the quality of uncertainty estimates returned by the approximate inference scheme. We compare the predictions of the approximate scheme with those of an inference scheme developed on the grounds of reversible jump MCMC. We provide more insight into the theoretical properties of the sparse structured Bayesian model and into the empirical behaviour of both inference schemes
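A schematic of the reduced-mapping idea, compressing high-dimensional motion descriptors and regressing the mechanical parameters on the reduced representation, is given below. The random linear "simulator", the descriptor dimensions and the ridge-regression choice are stand-ins; the thesis's 3D+t currents representation and actual electromechanical simulations are far richer.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_sim, n_motion_feats, n_params = 200, 500, 3

# offline stage: sample mechanical parameters, run the simulator and record a
# high-dimensional motion descriptor per run (a random linear map stands in here)
params = rng.uniform(0.5, 2.0, size=(n_sim, n_params))      # e.g. stiffness, contractility, damping
W = rng.normal(size=(n_params, n_motion_feats))
motions = params @ W + 0.1 * rng.normal(size=(n_sim, n_motion_feats))

# reduced mapping: PCA on the motion descriptors, then regression to the parameters
model = make_pipeline(PCA(n_components=10), Ridge(alpha=1.0)).fit(motions, params)

# online stage: estimate a new case's parameters from its tracked motion descriptor
new_params = rng.uniform(0.5, 2.0, size=(1, n_params))
new_motion = new_params @ W + 0.1 * rng.normal(size=(1, n_motion_feats))
print("true:", new_params.round(2), "estimated:", model.predict(new_motion).round(2))
```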
|