Global ETD Search

121	Apprentissage actif pour l'approximation de variétés / Active learning for variety approximation Gandar, Benoît 27 November 2012 (has links) L’apprentissage statistique cherche à modéliser un lien fonctionnel entre deux variables X et Y à partir d’un échantillon aléatoire de réalisations de (X,Y ). Lorsque la variable Y prend un nombre binaire de valeurs, l’apprentissage s’appelle la classification (ou discrimination en français) et apprendre le lien fonctionnel s’apparente à apprendre la frontière d’une variété dans l’espace de la variable X. Dans cette thèse, nous nous plaçons dans le contexte de l’apprentissage actif, i.e. nous supposons que l’échantillon d’apprentissage n’est plus aléatoire et que nous pouvons, par l’intermédiaire d’un oracle, générer les points sur lesquels l’apprentissage de la variété va s’effectuer. Dans le cas où la variable Y est continue (régression), des travaux précédents montrent que le critère de la faible discrépance pour générer les premiers points d’apprentissage est adéquat. Nous montrons, de manière surprenante, que ces résultats ne peuvent pas être transférés à la classification. Dans ce manuscrit, nous proposons alors le critère de la dispersion pour la classification. Ce critère étant difficile à mettre en pratique, nous proposons un nouvel algorithme pour générer un plan d’expérience à faible dispersion dans le carré unité. Après une première approximation de la variété, des approximations successives peuvent être réalisées afin d’affiner la connaissance de celle-ci. Deux méthodes d’échantillonnage sont alors envisageables : le « selective sampling » qui choisit les points à présenter à un oracle parmi un ensemble fini de candidats et l’« adaptative sampling » qui permet de choisir n’importe quels points de l’espace de la variable X. Le deuxième échantillonnage peut être vu comme un passage à la limite du premier. Néanmoins, en pratique, il n’est pas raisonnable d’utiliser cette méthode. 
Nous proposons alors un nouvel algorithme basé sur le critère de dispersion, menant de front exploitation et exploration, pour approximer une variété. / Statistical learning aims to model a functional link between two variables X and Y from a random sample of realizations of the pair (X, Y). When the variable Y takes only two values, the learning task is called classification, and learning the functional link amounts to learning the boundary of a manifold in the feature space of the variable X. In this PhD thesis, we work in the context of active learning, i.e. we suppose that the learning sample is no longer random and that we can, through an oracle, generate the points on which the manifold is learned. When the variable Y is continuous (regression), previous work shows that the low-discrepancy criterion is adequate for generating the first learning points. We show that, surprisingly, this result cannot be transferred to classification tasks. In this thesis, we therefore propose the dispersion criterion for classification problems. Since this criterion is difficult to apply in practice, we propose a new algorithm to generate low-dispersion samples in the unit square. After a first approximation of the manifold, successive approximations can be carried out to refine our knowledge of it. Two sampling methods are then possible: « selective sampling », which chooses the points to present to the oracle from a finite set of candidate points, and « adaptive sampling », which allows any point in the feature space of the variable X to be chosen. The second can be viewed as the limiting case of the first. Nevertheless, in practice, it is not reasonable to use this method. We therefore propose a new algorithm, based on the dispersion criterion, that balances exploration and exploitation to approximate a manifold. 
Apprentissage statistique Apprentissage actif Échantillonnage sélectif Plans d’expériences Approximation de variétés Discrépance Dispersion Maximin Minimax Statistical learning Active learning Blind active learning Selective active learning Experimental design Manifolds approximation Discrepancy Dispersion Maximin Minimax
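The dispersion criterion named in record 121 can be illustrated with a minimal sketch. It assumes the usual definition of dispersion (the largest distance from any location in the unit square to its nearest sample point) and approximates the supremum on a regular grid. The function names and the grid resolution are illustrative choices, not taken from the thesis, and the thesis's own generation algorithm is not reproduced here.

```python
import math
from itertools import combinations


def dispersion(points, grid=50):
    """Approximate the dispersion of `points` in the unit square.

    Dispersion is the largest distance from any location in the square to
    its nearest sample point; the supremum is approximated on a regular
    (grid + 1) x (grid + 1) lattice.
    """
    worst = 0.0
    for i in range(grid + 1):
        for j in range(grid + 1):
            x, y = i / grid, j / grid
            nearest = min(math.hypot(x - px, y - py) for px, py in points)
            worst = max(worst, nearest)
    return worst


def separation(points):
    """Smallest pairwise distance, the quantity a maximin design maximizes."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


# Two hypothetical four-point designs: one spread out, one clustered.
spread = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
clustered = [(0.1, 0.1), (0.1, 0.2), (0.2, 0.1), (0.2, 0.2)]
```

A design with low dispersion leaves no large unexplored region, which is why the spread design scores better than the clustered one under this criterion.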
122	Estimation non-paramétrique du quantile conditionnel et apprentissage semi-paramétrique : applications en assurance et actuariat / Nonparametric estimation of conditional quantile and semi-parametric learning : applications on insurance and actuarial data Knefati, Muhammad Anas 19 November 2015 (has links) La thèse se compose de deux parties : une partie consacrée à l'estimation des quantiles conditionnels et une autre à l'apprentissage supervisé. La partie "Estimation des quantiles conditionnels" est organisée en 3 chapitres : Le chapitre 1 est consacré à une introduction sur la régression linéaire locale, présentant les méthodes les plus utilisées, pour estimer le paramètre de lissage. Le chapitre 2 traite des méthodes existantes d’estimation nonparamétriques du quantile conditionnel ; Ces méthodes sont comparées, au moyen d’expériences numériques sur des données simulées et des données réelles. Le chapitre 3 est consacré à un nouvel estimateur du quantile conditionnel et que nous proposons ; Cet estimateur repose sur l'utilisation d'un noyau asymétrique en x. Sous certaines hypothèses, notre estimateur s'avère plus performant que les estimateurs usuels. La partie "Apprentissage supervisé" est, elle aussi, composée de 3 chapitres : Le chapitre 4 est une introduction à l’apprentissage statistique et les notions de base utilisées, dans cette partie. Le chapitre 5 est une revue des méthodes conventionnelles de classification supervisée. Le chapitre 6 est consacré au transfert d'un modèle d'apprentissage semi-paramétrique. La performance de cette méthode est montrée par des expériences numériques sur des données morphométriques et des données de credit-scoring. / The thesis consists of two parts: one part is about the estimation of conditional quantiles and the other is about supervised learning. The "conditional quantile estimation" part is organized into 3 chapters. 
Chapter 1 introduces local linear regression and presents the methods most commonly used in the literature to estimate the smoothing parameter. Chapter 2 reviews existing nonparametric methods for estimating the conditional quantile and compares them through numerical experiments on simulated and real data. Chapter 3 is devoted to a new conditional quantile estimator that we propose. This estimator is based on the use of a kernel that is asymmetric in x. We show that, under certain assumptions, this new estimator outperforms the usual estimators. The "supervised learning" part also comprises 3 chapters: Chapter 4 provides an introduction to statistical learning and recalls the basic concepts used in this part. Chapter 5 reviews the conventional methods of supervised classification. Chapter 6 proposes a method for transferring a semiparametric learning model. The performance of this method is shown by numerical experiments on morphometric data and credit-scoring data. Régression non-Paramétrique Quantile Paramètre de lissage Apprentissage statistique Classification supervisée Modèles à score unique Mean regression Quantile Smoothing parameter Statistical learning Supervised classification Semi parametric single index models 519.54
123	Large-scale functional MRI analysis to accumulate knowledge on brain functions / Analyse à grande échelle d'IRM fonctionnelle pour accumuler la connaissance sur les fonctions cérébrales Schwartz, Yannick 21 April 2015 (has links) Comment peut-on accumuler de la connaissance sur les fonctions cérébrales ? Comment peut-on bénéficier d'années de recherche en IRM fonctionnelle (IRMf) pour analyser des processus cognitifs plus fins et construire un modèle exhaustif du cerveau ? Les chercheurs se basent habituellement sur des études individuelles pour identifier des régions cérébrales recrutées par les processus cognitifs. La comparaison avec l'historique du domaine se fait généralement manuellement pas le biais de la littérature, qui permet de définir des régions d'intérêt dans le cerveau. Les méta-analyses permettent de définir des méthodes plus formelles et automatisables pour analyser la littérature. Cette thèse examine trois manières d'accumuler et d'organiser les connaissances sur le fonctionnement du cerveau en utilisant des cartes d'activation cérébrales d'un grand nombre d'études. Premièrement, nous présentons une approche qui utilise conjointement deux expériences d'IRMf similaires pour mieux conditionner une analyse statistique. Nous montrons que cette méthode est une alternative intéressante par rapport aux analyses qui utilisent des régions d'intérêts, mais demande cependant un travail manuel dans la sélection des études qui l'empêche de monter à l'échelle. A cause de la difficulté à sélectionner automatiquement les études, notre deuxième contribution se focalise sur l'analyse d'une unique étude présentant un grand nombre de conditions expérimentales. Cette méthode estime des réseaux fonctionnels (ensemble de régions cérébrales) et les associe à des profils fonctionnels (ensemble pondéré de descripteurs cognitifs). 
Les limitations de cette approche viennent du fait que nous n'utilisons qu'une seule étude, et qu'elle se base sur un modèle non supervisé qui est par conséquent plus difficile à valider. Ce travail nous a cependant apporté la notion de labels cognitifs, qui est centrale pour notre dernière contribution. Cette dernière contribution présente une méthode qui a pour objectif d'apprendre des atlas fonctionnels en combinant plusieurs jeux de données. [Henson2006] montre qu'une inférence directe, c.a.d. la probabilité d'une activation étant donné un processus cognitif, n'est souvent pas suffisante pour conclure sur l'engagement de régions cérébrales pour le processus cognitif en question. Réciproquement, [Poldrack 2006] présente l'inférence inverse qui est la probabilité qu'un processus cognitif soit impliqué étant donné qu'une région cérébrale est activée, et décrit le risque de raisonnements fallacieux qui peuvent en découler. Pour éviter ces problèmes, il ne faut utiliser l'inférence inverse que dans un contexte où l'on suffisamment bien échantillonné l'espace cognitif pour pouvoir faire une inférence pertinente. Nous présentons une méthode qui utilise un « meta-design » pour décrire des tâches cognitives avec un vocabulaire commun, et qui combine les inférences directe et inverse pour mettre en évidence des réseaux fonctionnels qui sont cohérents à travers les études. Nous utilisons un modèle prédictif pour l'inférence inverse, et effectuons les prédictions sur de nouvelles études pour s'assurer que la méthode n'apprend pas certaines idiosyncrasies des données d'entrées. Cette dernière contribution nous a permis d'apprendre des réseaux fonctionnels, et de les associer avec des concepts cognitifs. Nous avons exploré différentes approches pour analyser conjointement des études d'IRMf. L'une des difficultés principales était de trouver un cadre commun qui permette d'analyser ensemble ces études malgré leur diversité. 
Ce cadre s'est instancié sous la forme d'un vocabulaire commun pour décrire les tâches d'IRMf et a permis d'établir un modèle statistique du cerveau à grande échelle et d'accumuler des connaissances à travers des études d'IRM fonctionnelle. / How can we accumulate knowledge on brain functions? How can we leverage years of research in functional MRI to analyse finer-grained psychological constructs, and build a comprehensive model of the brain? Researchers usually rely on single studies to delineate brain regions recruited by mental processes. They relate their findings to previous works in an informal way by defining regions of interest from the literature. Meta-analysis approaches provide a more principled way to build upon the literature. This thesis investigates three ways to assemble knowledge using activation maps from a large number of studies. First, we present an approach that jointly uses two similar fMRI experiments to better condition an analysis from a statistical standpoint. We show that it is a valuable data-driven alternative to traditional region-of-interest analyses, but that it fails to provide a systematic way to relate studies and thus does not allow knowledge to be integrated on a large scale. Because of the difficulty of associating multiple studies, we resort to using a single dataset sampling a large number of stimuli for our second contribution. This method estimates functional networks associated with functional profiles, where the functional networks are interacting brain regions and the functional profiles are a weighted set of cognitive descriptors. This work successfully yields known brain networks and automatically associates meaningful descriptions. Its limitations lie in the unsupervised nature of this method, which is more difficult to validate, and the use of a single dataset. It does, however, introduce the notion of cognitive labels, which is central to our last contribution. 
Our last contribution presents a method that learns functional atlases by combining several datasets. [Henson 2006] shows that forward inference, i.e. the probability of an activation given a cognitive process, is often not sufficient to conclude on the engagement of brain regions for a cognitive process. Conversely, [Poldrack 2006] describes reverse inference as the probability of a cognitive process given an activation, but warns of a logical fallacy in concluding on such inference from evoked activity. Avoiding this issue requires performing reverse inference with a large coverage of the cognitive space. We present a framework that uses a "meta-design" to describe many different tasks with a common vocabulary, and uses forward and reverse inference in conjunction to outline functional networks that are consistently represented across the studies. We use a predictive model for reverse inference, and perform prediction on unseen studies to ensure that we do not learn studies' idiosyncrasies. This final contribution makes it possible to learn functional atlases, i.e. functional networks associated with a cognitive concept. We explored different possibilities to jointly analyse multiple fMRI experiments. We have found that one of the main challenges is to be able to relate the experiments with one another. As a solution, we propose a common vocabulary to describe the tasks. [Henson 2006] advocates the use of forward and reverse inference in conjunction to associate cognitive functions to brain regions, which is only possible in the context of a large scale analysis to overcome the limitations of reverse inference. This framing of the problem therefore makes it possible to establish a large statistical model of the brain, and accumulate knowledge across functional neuroimaging studies. Neuroimagerie Inférence directe Inférence inverse Apprentissage statistique Gestion de données Neuroimaging Forward inference Reverse inference Statistical learning Data management
124	Bank Customer Churn Prediction : A comparison between classification and evaluation methods Tandan, Isabelle, Goteman, Erika January 2020 (has links) This study aims to assess which supervised statistical learning method (random forest, logistic regression or K-nearest neighbor) is best at predicting bank customer churn. Additionally, the study evaluates which cross-validation approach (k-Fold cross-validation or leave-one-out cross-validation) yields the most reliable results. Predicting customer churn has increased in popularity since new technology, regulation and changed demand have led to an increase in competition for banks. Banks therefore have all the more reason to acknowledge the importance of maintaining their customer base. The findings of this study are that an unrestricted random forest model estimated using k-Fold cross-validation is preferable in terms of performance measurements, computational efficiency and from a theoretical point of view. Although k-Fold cross-validation and leave-one-out cross-validation yield similar results, k-Fold cross-validation is preferable due to its computational advantages. For future research, methods that generate models with both good interpretability and high predictability would be beneficial, in order to combine knowledge of which customers end their engagement with an understanding of why. Moreover, it would be interesting to analyze at which dataset size leave-one-out cross-validation and k-Fold cross-validation yield the same results. machine learning cross-validation k-fold leave-one-out random forest decision trees k-nearest neighbor logistic regression supervised learning supervised statistical learning binary classification customer churn bank customer churn. Probability Theory and Statistics Sannolikhetsteori och statistik
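The comparison in record 124 between k-fold and leave-one-out cross-validation can be sketched as follows. Leave-one-out is simply k-fold with k equal to the sample size, which is why it needs as many model fits as there are observations. The 1-nearest-neighbor classifier and the toy one-dimensional data are stand-ins chosen for brevity; they are assumptions of this sketch, not the models or data of the study.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n).

    Fold sizes differ by at most one. Setting k == n gives leave-one-out.
    """
    indices = list(range(n))
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def nearest_neighbor_predict(train_x, train_y, x):
    """Predict the label of the closest training point (1-NN, 1-D features)."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]


def cv_accuracy(xs, ys, k):
    """Cross-validated accuracy of the 1-NN classifier with k folds."""
    correct = 0
    for train, test in k_fold_splits(len(xs), k):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            correct += nearest_neighbor_predict(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)


# Hypothetical toy data, interleaved so every contiguous fold holds both classes.
xs = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6]
ys = [0, 1, 0, 1, 0, 1, 0, 1]
four_fold = cv_accuracy(xs, ys, k=4)               # 4 model fits
leave_one_out = cv_accuracy(xs, ys, k=len(xs))     # 8 model fits
```

On real data the samples would normally be shuffled before splitting; the contiguous folds here keep the sketch deterministic.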
125	Modélisation statistique de l’état de charge des batteries électriques / Statistical modeling of the state of charge of electric batteries Kalawoun, Jana 30 November 2015 (has links) Les batteries électriques sont omniprésentes dans notre vie quotidienne : ordinateur, téléphone, etc. Elles jouent un rôle important dans le défi de la transition énergétique : anticiper la raréfaction des énergies fossiles et réduire la pollution, en développant le stockage des énergies renouvelables et les transports électriques. Cependant, l'estimation de l'état de charge (State of Charge – SoC) d'une batterie est difficile et les modèles de prédiction actuels sont peu robustes. En effet, une batterie est un système électrochimique complexe, dont la dynamique est influencée non seulement par ses caractéristiques internes, mais aussi par les conditions d'usages souvent non contrôlables : température, profil d’utilisation, etc. Or, une estimation précise du SoC permet de garantir une utilisation sûre de la batterie en évitant une surcharge ou surdécharge ; mais aussi d’estimer son autonomie. Dans cette étude, nous utilisons un modèle à espaces d'états gouverné par une chaîne de Markov cachée. Ce modèle est fondé sur des équations physiques et la chaîne de Markov cachée permet d’appréhender les différents «régimes de fonctionnement» de la batterie. Pour garantir l’unicité des paramètres du modèle, nous démontrons son identifiabilité à partir de contraintes simples et naturelles sur ses paramètres «physiques ». L’estimation du SoC dans un véhicule électrique doit être faîte en ligne et avec une puissance de calcul limitée. Nous estimons donc le SoC en utilisant une technique d’échantillonnage préférentiel séquentiel. D’autre part l’estimation des paramètres est faîte à partir d’une base d’apprentissage pour laquelle les états de la chaîne de Markov et le SoC ne sont pas observés. 
Nous développons et testons trois algorithmes adaptés à notre modèle à structure latente : un échantillonneur particulaire de Gibbs, un algorithme de Monte-Carlo EM pénalisé par des contraintes d’identifiabilité et un algorithme de Monte-Carlo EM pénalisé par une loi a priori. Par ailleurs les états cachés de la chaîne de Markov visent à modéliser les différents régimes du fonctionnement de la batterie. Nous identifions leur nombre par divers critères de sélection de modèles. Enfin, à partir de données issues de trois types de batteries (cellule, module et pack d’un véhicule électrique), notre modèle a permis d’appréhender les différentes sollicitations de la batterie et donne des estimations robustes et précises du SoC. / Electric batteries are omnipresent in our daily lives: computers, smartphones, etc. Batteries are important for anticipating the scarcity of fossil fuels and tackling their environmental impact. However, estimating the State of Charge (SoC) of a battery remains a challenging issue, as existing physical and statistical models are not yet robust. Indeed, a battery is a complex electrochemical system. Its dynamics depend not only on its internal characteristics but also on uncontrolled usage conditions: temperature, usage profile, etc. Yet an accurate SoC estimate helps to prevent overcharge and deep discharge, and to estimate the battery's autonomy. In this study, the battery dynamics are described by a set of physical linear equations, switching randomly according to a Markov chain. This model is referred to as a switching Markov state space model. To ensure the uniqueness of the model parameters, we prove its identifiability by applying straightforward and natural constraints on its “physical” parameters. Embedded applications, like electric vehicles, require online estimation under hardware and time constraints. Therefore we estimate the SoC using a sequential importance sampling technique. 
Furthermore, the model includes two latent variables: the SoC and the Markov chain state. Thus, to estimate the parameters, we develop and test three algorithms adapted to latent structure models: a particle Gibbs sampler, Monte Carlo EM penalized with identifiability constraints, and Monte Carlo EM penalized with a prior distribution. The hidden Markov states aim to model the different “regimes” of the battery dynamics. We identify their number using different model selection criteria. Finally, when applied to various data from three battery types (cell, module and pack of an electric vehicle) our model allows us to analyze the battery dynamics and to obtain a robust and accurate SoC estimation under uncontrolled usage conditions. Apprentissage statistique Filtrage particulaire Algorithme EM Sélection de modèle Statistical learning State of charge of an electric battery Switching Markov State Space Model Particle filter EM algorithm Model selection
126	Essays on Sparse-Grids and Statistical-Learning Methods in Economics Valero, Rafael 07 July 2017 (has links) The thesis consists of three chapters. The first studies the implementation of sparse-grid methods for economic models with many dimensions, through novel applications of the Smolyak method aimed at improving tractability and obtaining accurate results. The results show efficiency gains in the implementation of models with multiple agents. The second chapter introduces a new methodology for evaluating economic policies, called Synthetic Control with Statistical Learning, applied to two particular policies: a) the reduction in the number of working hours in Portugal in 1996 and b) the reduction in dismissal costs in Spain in 2010. The methodology works and stands as an alternative to previous methods. Empirically, the results show that after the policy was implemented there was an effective reduction in unemployment and, in the case of Spain, an increase. The third chapter applies the methodology used in the second chapter, among other methodologies, to evaluate the implementation of the Third European Road Safety Action Program. The results show that European-level coordination of road safety has provided complementary help. For 2010, the estimated reduction in fatalities is between 13,900 and 19,400 people across Europe. Smolyak method Sparse grid Adaptive domain Projection Anisotropic grid Collocation High-dimensional problem Policy evaluation Synthetic control methods Labor Statistical learning Road safety policy European Union Synthetic control methods Policy evaluation Linear factor models Fundamentos del Análisis Económico
127	Селекција и рангирање кључних индикатора иновационог потенцијала у контексту одрживог индустријског развоја / Selekcija i rangiranje ključnih indikatora inovacionog potencijala u kontekstu održivog industrijskog razvoja / Selection and ranking the key innovation potential indicators in the context of sustainable industrial development Marković Dušan 10 October 2020 (has links) Одрживи индустријски развој директно је повезан са стварањем повољних услова за спровођење иновативних активности. Главни изазов на пољу управљања иновацијама је избор и рангирање индикатора који омогућавају стварање иновација, како на нивоу државе / регије (макро нивоу), тако и на нивоу предузећа (микро нивоу). Ово истраживање је спроведено за оба нивоа одвојено. Рангирање индикатора на макро нивоу извршено је за појединачне државе чланице ЕУ и за ЕУ као јединствену регију. За ту сврху примењена је метода статистичког учења. На микро нивоу, урађена је студија случаја за рангирање индикатора иновацијског потенцијала за сектор ММСП у Србији, коришћењем методе структурираног упитника и методе вишекритеријумске анализе. Резултати истраживања пружају прилику да се укаже на значај појединих индикатора у процесу стварања иновације, како на макро тако и на микро нивоу. / Održivi industrijski razvoj direktno je povezan sa stvaranjem povoljnih uslova za sprovođenje inovativnih aktivnosti. Glavni izazov na polju upravljanja inovacijama je izbor i rangiranje indikatora koji omogućavaju stvaranje inovacija, kako na nivou države / regije (makro nivou), tako i na nivou preduzeća (mikro nivou). Ovo istraživanje je sprovedeno za oba nivoa odvojeno. Rangiranje indikatora na makro nivou izvršeno je za pojedinačne države članice EU i za EU kao jedinstvenu regiju. Za tu svrhu primenjena je metoda statističkog učenja. 
Na mikro nivou, urađena je studija slučaja za rangiranje indikatora inovacijskog potencijala za sektor MMSP u Srbiji, korišćenjem metode strukturiranog upitnika i metode višekriterijumske analize. Rezultati istraživanja pružaju priliku da se ukaže na značaj pojedinih indikatora u procesu stvaranja inovacije, kako na makro tako i na mikro nivou. / Sustainable industrial development is directly related to the creation of favorable conditions for the implementation of innovative activities. The main challenge in the field of innovation management is the selection and ranking of indicators that enable the creation of innovation, both at the state/region level (macro level) and at the enterprise level (micro level). This research was conducted for both levels separately. The ranking of indicators at the macro level was done for individual member states of the EU, and for the EU as a single region. For this purpose, the method of statistical learning was applied. At the micro level, a case study for ranking the indicators of innovation potential was done for the MSME sector in Serbia, using the method of a structured questionnaire and the method of multi-criteria analysis. The results of the research provide an opportunity to see the importance of individual indicators in the process of creation of innovation, both at the macro and micro levels.
128	Model Averaging in Large Scale Learning / Estimateur par agrégat en apprentissage statistique en grande dimension Grappin, Edwin 06 March 2018 (has links) Les travaux de cette thèse explorent les propriétés de procédures d'estimation par agrégation appliquées aux problèmes de régressions en grande dimension. Les estimateurs par agrégation à poids exponentiels bénéficient de résultats théoriques optimaux sous une approche PAC-Bayésienne. Cependant, le comportement théorique de l'agrégat avec prior de Laplace n'est guère connu. Ce dernier est l'analogue du Lasso dans le cadre pseudo-bayésien. Le Chapitre 2 explicite une borne du risque de prédiction de cet estimateur. Le Chapitre 3 prouve qu'une méthode de simulation s'appuyant sur un processus de Langevin Monte Carlo permet de choisir explicitement le nombre d'itérations nécessaire pour garantir une qualité d'approximation souhaitée. Le Chapitre 4 introduit des variantes du Lasso pour améliorer les performances de prédiction dans des contextes partiellement labélisés. / This thesis explores properties of estimation procedures related to aggregation in the problem of high-dimensional regression in a sparse setting. The exponentially weighted aggregate (EWA) is well studied in the literature. It benefits from strong results in fixed and random designs with a PAC-Bayesian approach. However, little is known about the properties of the EWA with Laplace prior. Chapter 2 analyses the statistical behaviour of the prediction loss of the EWA with Laplace prior in the fixed design setting. Sharp oracle inequalities which generalize the properties of the Lasso to a larger family of estimators are established. These results also bridge the gap from the Lasso to the Bayesian Lasso. Chapter 3 introduces an adjusted Langevin Monte Carlo sampling method that approximates the EWA with Laplace prior in an explicit finite number of iterations for any targeted accuracy. 
Chapter 4 explores the statistical behaviour of adjusted versions of the Lasso for the transductive and semi-supervised learning task in the random design setting. Apprentissage statistique Régression Apprentissage automatique Estimation par agrégation PAC-Bayésien Statistical learning Regression Machine learning Estimation by aggregation PAC-Bayesian 519
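The idea of exponential weighting mentioned in record 128 can be sketched in its simplest form, for a finite set of candidate predictors: each candidate receives a weight proportional to its prior mass times the exponential of minus its empirical loss divided by a temperature. This finite-dictionary version is an assumption made for illustration. The thesis studies the EWA with a continuous Laplace prior, which this sketch does not implement, and the function names and default uniform prior are hypothetical.

```python
import math


def ewa_weights(losses, temperature, prior=None):
    """Exponential weights over a finite set of candidate predictors.

    weight_j is proportional to prior_j * exp(-loss_j / temperature).
    A uniform prior is assumed when none is given.
    """
    m = len(losses)
    if prior is None:
        prior = [1.0 / m] * m
    shift = min(losses)  # subtracting the minimum avoids underflow of every term
    raw = [p * math.exp(-(loss - shift) / temperature)
           for p, loss in zip(prior, losses)]
    total = sum(raw)
    return [r / total for r in raw]


def aggregate(predictions, weights):
    """Weighted average of the candidates' predictions at one input."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical empirical losses of three candidate predictors.
losses = [1.0, 2.0, 4.0]
weights = ewa_weights(losses, temperature=1.0)
combined = aggregate([0.0, 1.0, 5.0], weights)
```

A small temperature concentrates the weight on the candidate with the lowest loss, while a large temperature pulls the weights back toward the prior.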
129	Data-driven modeling and simulation of spatiotemporal processes with a view toward applications in biology Maddu Kondaiah, Suryanarayana 11 January 2022 (has links) Mathematical modeling and simulation has emerged as a fundamental means to understand the physical processes around us, with countless real-world applications in applied science and engineering problems. However, heavy reliance on first principles, symmetry relations, and conservation laws has limited its applicability to a few scientific domains and even fewer real-world scenarios. Especially in disciplines like biology, the underlying living constituents exhibit a myriad of complexities like non-linearities, non-equilibrium physics, self-organization and plasticity that routinely escape mathematical treatment based on governing laws. Meanwhile, recent decades have witnessed rapid advancement in computing hardware, sensing technologies, and algorithmic innovations in machine learning. This progress has helped propel data-driven paradigms to achieve unprecedented practical success in the fields of image processing and computer vision, natural language processing, autonomous transport, etc. In the current thesis, we explore, apply, and advance statistical and machine learning strategies that help bridge the gap between data and mathematical models, with a view toward modeling and simulation of spatiotemporal processes in biology. First, we address the problem of learning interpretable mathematical models of biological processes from limited and noisy data. For this, we propose a statistical learning framework called PDE-STRIDE based on the theory of stability selection and ℓ0-based sparse regularization for parsimonious model selection. The PDE-STRIDE framework enables model learning with relaxed dependencies on tuning parameters, sample-size and noise-levels. 
We demonstrate the practical applicability of our method on real-world data by considering a purely data-driven re-evaluation of the advective triggering hypothesis explaining the embryonic patterning event in the C. elegans zygote. As a next natural step, we extend our PDE-STRIDE framework to leverage prior knowledge from physical principles to learn biologically plausible and physically consistent models rather than models that simply fit the data best. For this, we modify the PDE-STRIDE framework to handle structured sparsity constraints for grouping features, which enables us to: 1) enforce conservation laws, 2) extract spatially varying non-observables, 3) encode symmetry relations associated with the underlying biological process. We show several applications from systems biology demonstrating the claim that enforcing priors dramatically enhances the robustness and consistency of the data-driven approaches. In the following part, we apply our statistical learning framework to learn mean-field deterministic equations of active matter systems directly from stochastic self-propelled active particle simulations. We investigate two examples of particle models that differ in the microscopic interaction rules used. First, we consider a self-propelled particle model endowed with density-dependent motility character. For the chosen hydrodynamic variables, our data-driven framework learns continuum partial differential equations that are in excellent agreement with analytically derived coarse-grained equations from the Boltzmann approach. In addition, our structured sparsity framework is able to decode the hidden dependency between particle speed and the local density intrinsic to the self-propelled particle model. As a second example, the learning framework is applied to coarse-grain a popular stochastic particle model employed for studying the collective cell motion in epithelial sheets. 
The PDE-STRIDE framework is able to infer a novel PDE model that quantitatively captures the flow statistics of the particle model in the regime of low density fluctuations. Modern microscopy techniques produce gigabytes (GB) and terabytes (TB) of data while imaging the spatiotemporal developmental dynamics of living organisms. However, classical statistical learning based on penalized linear regression models struggles with issues such as accurate computation of derivatives in the candidate library and limited computational scalability on “big” and noisy data sets. For this reason, we exploit the rich parameterization of neural networks, which can learn efficiently from large data sets. Specifically, we explore the framework of Physics-Informed Neural Networks (PINNs), which allow for seamless integration of physics priors with measurement data. We propose novel strategies for multi-objective optimization that allow the PINN architecture to be adapted to multi-scale modeling problems arising in biology. We showcase application examples for both forward and inverse modeling of the mesoscale active turbulence phenomenon observed in dense bacterial suspensions. Employing our strategies, we demonstrate orders-of-magnitude gains in accuracy and convergence in comparison with the conventional formulation for solving multi-objective optimization in PINNs. In the concluding chapter of the thesis, we set aside model interpretability and focus on learning computable models directly from noisy data for the purpose of pure dynamics forecasting. We propose STENCIL-NET, an artificial neural network architecture that learns a solution-adaptive spatial discretization of an unknown PDE model that can be stably integrated in time with negligible loss in accuracy.
To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids, and also showcase a de-noising application that helps separate spatiotemporal dynamics from noise in an equation-free manner. info:eu-repo/classification/ddc/004 ddc:004
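The abstract above names stability selection combined with ℓ0-based sparse regression as the basis of PDE-STRIDE. The sketch below illustrates that general idea only and is not the thesis's implementation: it assumes a generic library matrix `theta` and target `y` (synthetic here, not PDE data), uses plain iterative hard thresholding with a fixed sparsity budget `k` in place of a sweep over a regularization path, and all function names and parameter values are hypothetical.

```python
import numpy as np

def iht(theta, y, k, n_iter=200):
    """Iterative hard thresholding: gradient step, then keep the k largest entries."""
    step = 1.0 / np.linalg.norm(theta, 2) ** 2   # 1 / (largest singular value)^2
    coef = np.zeros(theta.shape[1])
    for _ in range(n_iter):
        coef = coef + step * theta.T @ (y - theta @ coef)
        smallest = np.argsort(np.abs(coef))[:-k]  # indices of all but the k largest
        coef[smallest] = 0.0
    return coef

def stability_selection(theta, y, k, n_subsamples=50, threshold=0.8, seed=0):
    """Selection frequency of each feature over random half-subsamples of the rows."""
    rng = np.random.default_rng(seed)
    n, p = theta.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        rows = rng.choice(n, size=n // 2, replace=False)
        counts += iht(theta[rows], y[rows], k) != 0
    freq = counts / n_subsamples
    return freq, freq >= threshold

# Synthetic demo (assumed data): only features 1 and 4 carry signal.
rng = np.random.default_rng(1)
theta = rng.standard_normal((400, 10))
true_coef = np.zeros(10)
true_coef[1], true_coef[4] = 1.5, -2.0
y = theta @ true_coef + 0.1 * rng.standard_normal(400)

# k is deliberately larger than the true sparsity; unstable picks get filtered out.
freq, stable = stability_selection(theta, y, k=4)
print("selection frequencies:", np.round(freq, 2))
print("stable features:", np.flatnonzero(stable))
```

Over-selecting with `k=4` and then keeping only features chosen in most subsamples is what makes the result less sensitive to the exact sparsity setting, which is the kind of relaxed dependence on tuning parameters the abstract describes.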
130	Predictive Modeling and Statistical Inference for CTA returns : A Hidden Markov Approach with Sparse Logistic Regression Fransson, Oskar January 2023 This thesis focuses on predicting trends in Commodity Trading Advisors (CTAs), also known as trend-following hedge funds. The thesis applies a Hidden Markov Model (HMM) to classify trends. In addition, a regularized logistic regression model that incorporates further features is used to enhance predictive capability. The model succeeds in identifying positive trends in CTA funds, with particular emphasis on precision and risk-adjusted return metrics. In the context of regularized regression models, statistical inference techniques such as bootstrap resampling and Markov Chain Monte Carlo are applied to estimate the distribution of the parameters. The findings suggest that the model is effective in predicting favorable CTA performance and mitigating equity market drawdowns. For future research, it is recommended to explore alternative classification models and to extend the methodology to different markets and datasets. Probability theory Statistical inference finance CTA managed futures machine learning statistical learning stochastic process sparse logistic regression Markov Chain Monte Carlo Hidden Markov model Mathematics Matematik Probability Theory and Statistics Sannolikhetsteori och statistik
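As a rough illustration of how a Hidden Markov Model can label trend regimes, the sketch below decodes the most likely state sequence of a two-state Gaussian HMM with the Viterbi algorithm. It is not taken from the thesis: the function name, the toy return series, and every parameter value (state means, standard deviations, transition matrix) are assumptions chosen for illustration, and the parameters are fixed by hand rather than fitted to data.

```python
import numpy as np

def viterbi_gaussian(returns, means, stds, trans, start):
    """Most likely hidden-state path for a Gaussian-emission HMM (log-space Viterbi)."""
    returns = np.asarray(returns, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Log emission densities, shape (T, n_states).
    log_emit = (-0.5 * ((returns[:, None] - means) / stds) ** 2
                - np.log(stds) - 0.5 * np.log(2 * np.pi))
    log_trans = np.log(trans)
    T, n = log_emit.shape
    score = np.log(start) + log_emit[0]
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans        # cand[i, j]: best score ending in i, then i -> j
        back[t] = np.argmax(cand, axis=0)        # best predecessor for each state j
        score = cand[back[t], np.arange(n)] + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(score)
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy example (assumed values): state 0 = positive trend, state 1 = negative trend.
toy_returns = [0.02, 0.03, 0.01, -0.02, -0.03, -0.01]
states = viterbi_gaussian(
    toy_returns,
    means=[0.02, -0.02],
    stds=[0.01, 0.01],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    start=[0.5, 0.5],
)
print(states)
```

In a pipeline like the one the abstract describes, decoded states of this kind could serve as trend labels for a downstream classifier, although the abstract does not specify how the two models are connected.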
