201 |
Message Passing Algorithms for Facility Location Problems
Lazic, Nevena 09 June 2011
Discrete location analysis is one of the most widely studied branches of operations research, whose applications arise in a wide variety of settings. This thesis describes a powerful new approach to facility location problems - that of message passing inference in probabilistic graphical models. Using this framework, we develop new heuristic algorithms, as well as a new approximation algorithm for a particular problem type.
In machine learning applications, facility location can be seen as a discrete formulation of clustering and mixture modeling problems. We apply the developed algorithms to such problems in computer vision. We tackle the problem of motion segmentation in video sequences by formulating it as a facility location instance and demonstrate the advantages of message passing algorithms over current segmentation methods.
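As background, the standard uncapacitated facility location problem can be written as the integer program below. This is the textbook form, not necessarily the exact variant studied in the thesis; the symbols $f_j$ (cost of opening facility $j$), $c_{ij}$ (cost of assigning customer $i$ to facility $j$), $y_j$ and $x_{ij}$ are notation assumed for this sketch.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{j \in F} f_j\, y_j + \sum_{i \in C} \sum_{j \in F} c_{ij}\, x_{ij} \\
\text{s.t.}\quad & \sum_{j \in F} x_{ij} = 1 \quad \forall i \in C, \\
& x_{ij} \le y_j \quad \forall i \in C,\ j \in F, \\
& x_{ij},\, y_j \in \{0, 1\}.
\end{aligned}
```

In the clustering reading, customers play the role of data points and open facilities play the role of cluster exemplars.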
|
202 |
Assessing the effects of societal injury control interventions
Bonander, Carl January 2016
Injuries have emerged as one of the biggest public health issues of the 21st century. Yet, the causal effects of injury control strategies are often questioned due to a lack of randomized experiments. In this thesis, a set of quasi-experimental methods are applied and discussed in the light of causal inference theory and the type of data commonly available in injury surveillance systems. I begin by defining the interrupted time series design as a special case of the regression-discontinuity design, and the method is applied to two empirical cases. The first is a ban on the sale and production of non-reduced ignition propensity (RIP) cigarettes, and the second is a tightening of the licensing rules for mopeds. A two-way fixed effects model is then applied to a case with time-varying starting dates, attempting to identify the causal effects of municipality-provided home help services for the elderly. Lastly, the effect of the Swedish bicycle helmet law is evaluated using the comparative interrupted time series and synthetic control methods. The results from the empirical studies suggest that the stricter licensing rules and the bicycle helmet law were effective in reducing injury rates, while the home help services and RIP cigarette interventions have had limited or no impact on safety as measured by fatalities and hospital admissions. I conclude that identification of the impact of injury control interventions is possible using low-cost means. However, the ability to infer causality varies greatly by empirical case and method, which highlights the important role of causal inference theory in applied intervention research. While existing methods can be used with data from injury surveillance systems, additional improvements and development of new estimators specifically tailored for injury data will likely further enhance the ability to draw causal conclusions in natural settings. Implications for future research and recommendations for practice are also discussed.
/ Injuries have emerged as one of the biggest public health issues of the 21st century. Yet, the causal effects of injury control strategies are rarely known due to a lack of randomized experiments. In this thesis, a set of quasi-experimental methods are discussed in the light of causal inference theory and the type of data commonly available in injury surveillance systems. I begin by defining the identifying assumptions of the interrupted time series design as a special case of the regression-discontinuity design, and the method is applied to two empirical cases. The first is a ban on the sale and production of non-fire-safe cigarettes, and the second is a tightening of the licensing rules for mopeds. A fixed effects panel regression analysis is then applied to a case with time-varying starting dates, attempting to identify the causal effects of municipality-provided home help services for the elderly. Lastly, the causal effect of the Swedish bicycle helmet law is evaluated using a comparative interrupted time series design and a synthetic control design. I conclude that credible identification of the impact of injury control interventions is possible using simple and cost-effective means. Implications for future research and recommendations for practice are discussed.
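For orientation, one common way to write an interrupted time series model is the segmented regression below. It is a generic parameterization offered as an illustration and may differ from the specifications estimated in the thesis; the outcome $y_t$, the intervention time $T_0$ and the coefficients $\beta_0,\dots,\beta_3$ are notation assumed here.

```latex
y_t = \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - T_0) D_t + \varepsilon_t,
\qquad D_t = \mathbf{1}[t \ge T_0]
```

Here $\beta_2$ captures an immediate level change at $T_0$ and $\beta_3$ a change in slope afterwards. Reading the design as a regression discontinuity amounts to treating time as the running variable with a cutoff at $T_0$.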
|
203 |
INFORMATION THEORETIC APPROACHES TOWARDS REGULATORY NETWORK INFERENCE
Chaitankar, Vijender 12 December 2012
In spite of many efforts in the past, inference or reverse engineering of regulatory networks from microarray data remains an unsolved problem in the area of systems biology. Such regulatory networks play a critical role in cellular function and organization and are of interest in the study of a variety of disease areas and ecotoxicology, to name a few. This dissertation proposes information theoretic methods/algorithms for inferring regulatory networks from microarray data. Most of the algorithms proposed in this dissertation can be implemented both on time series and multifactorial microarray data sets. The work proposed here infers regulatory networks considering the following six factors: (i) computational efficiency to infer genome-scale networks, (ii) incorporation of prior biological knowledge, (iii) choosing the optimal network that minimizes the joint network entropy, (iv) impact of higher order structures (specifically 3-node structures) on network inference, (v) effects of the time sensitivity of regulatory interactions and (vi) exploiting the benefits of existing/proposed metrics and algorithms for reverse engineering using the concept of consensus of consensus networks. Specifically, this dissertation presents an approach towards incorporating knock-out data sets. The proposed method for incorporating knock-out data sets is flexible so that it can be easily adapted in existing/new approaches. While most of the information theoretic approaches infer networks based on pair-wise interactions, this dissertation discusses inference methods that consider scoring edges from complex structures. A new inference method for building consensus networks based on networks inferred by multiple popular information theoretic approaches is also proposed here. For time-series datasets, new information theoretic metrics were proposed considering the time-lags of regulatory interactions estimated from microarray datasets.
Finally, based on the scores predicted for each possible edge in the network, a probabilistic minimum description length based approach was proposed to identify the optimal network (minimizing the joint network entropy). Comparative analyses on in silico and/or real time data sets have shown that the proposed algorithms achieve better inference accuracy and/or higher computational efficiency as compared with other state-of-the-art schemes such as ARACNE, CLR and Relevance Networks. Most of the methods proposed in this dissertation are generalized and can be easily incorporated into new methods/algorithms for network inference.
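To make the pair-wise baseline concrete, here is a minimal sketch of mutual-information edge scoring in the spirit of Relevance Networks. It is not the dissertation's algorithm: it assumes expression values have already been discretized, and the function names `mutual_information` and `relevance_network`, as well as the fixed threshold, are hypothetical choices for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


def relevance_network(profiles, threshold):
    """Return undirected edges (gene, gene, score) with MI >= threshold.

    `profiles` maps a gene name to its discretized expression profile
    (an assumed input format for this sketch).
    """
    genes = sorted(profiles)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            score = mutual_information(profiles[g], profiles[h])
            if score >= threshold:
                edges.append((g, h, score))
    return edges
```

A pair-wise score like this cannot distinguish direct from indirect interactions, which is one motivation for the higher-order structures and consensus approaches described in the abstract.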
|
204 |
Inferring Gene Regulatory Networks from Expression Data using Ensemble Methods
Slawek, Janusz 01 May 2014
High-throughput technologies for measuring gene expression have made the inference of genome-wide Gene Regulatory Networks an active field of research. Reverse-engineering systems of transcriptional regulation has become an important challenge in molecular and computational biology. Because such systems model dependencies between genes, they are important for understanding cell behavior, and can potentially turn observed expression data into new biological knowledge and practical applications. In this dissertation we introduce a set of algorithms that infer networks of transcriptional regulation from a variety of expression profiles with superior accuracy compared to state-of-the-art techniques. The proposed methods make use of ensembles of trees, which have become popular in many scientific fields, including genetics and bioinformatics, although they were originally motivated from the perspective of classification, regression, and feature selection theory. In this study we exploit their relative variable importance measure as an indication of the presence or absence of a regulatory interaction between genes. We further analyze their predictions on a set of universally recognized benchmark expression data sets, and achieve favorable results in comparison with state-of-the-art algorithms.
|
205 |
Reconstructions phylogénétiques du genre Quercus à partir de séquences du génome nucléaire et chloroplastique / Phylogeographic reconstructions of the genus Quercus based on nuclear and chloroplastic DNA sequences
Hubert, François 21 June 2013
The genus Quercus comprises more than 500 species and is widely distributed across the Northern Hemisphere. Phylogenetic reconstructions of the genus, based so far on a very limited number of nuclear markers, remained unresolved at the deeper nodes where the major extant taxonomic groups diverged. This thesis aims to refine the phylogeny of the genus by exploring nuclear and chloroplast genomic resources more exhaustively. The phylogenetic investigations are based on sequences of six nuclear genes and the entire chloroplast genome. The results confirm that the phylogenetic signal is diffuse and that resolution improves when sequences from additional genes are added. They also confirm that the genus can be subdivided into six infrageneric groups (Cyclobalanopsis, Ilex, Cerris, Lobatae, Quercus s.s. and Protobalanus). Phylogenetic relationships among these groups are refined, although not fully resolved. There is a very clear phylogeographic imprint in the chloroplast genome at the level of the whole genus across its worldwide distribution. The chloroplast phylogeographic signal, combined with the nuclear phylogeny, makes it possible to build a biogeographic scenario for the diversification of the genus. Contributions from other disciplines (paleontology, historical geology) are, however, needed to corroborate this scenario.
|
206 |
Input-output transformations in the awake mouse brain using whole-cell recordings and probabilistic analysis
Puggioni, Paolo January 2015
The activity of cortical neurons in awake brains changes dynamically as a function of the behavioural and attentional state. The primary motor cortex (M1) plays a central role in regulating complex motor behaviors. Despite growing knowledge of its connectivity and spiking patterns, little is known about the intracellular mechanisms and rhythms underlying motor-command generation. In the last decade, whole-cell recordings in awake animals have become a powerful tool for characterising both sub- and supra-threshold activity during behaviour. Seminal in vivo studies have shown that changes in input structure and sub-threshold regime determine spike output during behaviour (input-output transformations). In this thesis I make use of computational and experimental techniques to better understand (i) how the brain regulates the sub-threshold activity of the neurons during movement and (ii) how this is reflected in their input-output transformation properties. In the first part of this work I present a novel probabilistic technique to infer input statistics from in vivo voltage-clamp traces. This approach, based on Bayesian belief networks, outperforms current methods and allows an estimation of synaptic input (i) kinetic properties, (ii) frequency, and (iii) weight distribution. I first validate the model on simulated data, then I apply it to voltage-clamp recordings of cerebellar interneurons in awake mice. I found that synaptic weight distributions have long tails, which on average do not change during movement. Interestingly, the increase in synaptic current observed during movement is a consequence of the increase in input frequency only. In the second part, I study how the brain regulates the activity of pyramidal neurons in the M1 of awake mice during movement. I performed whole-cell recordings of pyramidal neurons in layer 5B (L5B), which represent one of the main descending output channels from motor cortex.
I found that slow large-amplitude membrane potential fluctuations, typical of quiet periods, were suppressed in all L5B pyramidal neurons during movement, which by itself reduced membrane potential (Vm) variability, input sensitivity and output firing rates. However, a sub-population of L5B neurons (approximately 50%) concurrently experienced an increase in excitatory drive that depolarized mean Vm, enhanced input sensitivity and elevated firing rates. Thus, movement-related bidirectional modulation in L5B neurons is mediated by two opposing mechanisms: 1) a global reduction in network-driven Vm variability and 2) a coincident, targeted increase in excitatory drive to a subpopulation of L5B neurons.
|
207 |
Méthodes spectrales pour l'inférence grammaticale probabiliste de langages stochastiques rationnels
Bailly, Raphael 12 December 2011
Our framework is probabilistic grammatical inference: given an unknown distribution p on a set of strings S∗, infer a probabilistic model for p from a finite sample S of observations assumed to be i.i.d. according to p. Grammatical inference focuses primarily on the structure of the probabilistic model and on the convergence of the parameter estimates. The probabilistic models considered here are weighted automata (WA). The functions they model are called rational series. We first study the possibility of finding an absolute convergence criterion for such series. We then introduce a type of algorithm, based on spectral methods, for the inference of rational distributions (i.e. distributions modeled by a WA). We show how to adapt this algorithm to the closely related domain of distributions on trees. Finally, we attempt to use this spectral inference algorithm in a more statistical setting, for density estimation.
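As a reminder of the objects involved, the value that a weighted automaton assigns to a string, and the Hankel matrix on which spectral methods usually operate, can be written as below. This is standard background notation, not taken from the thesis; $\Sigma^{*}$ (assumed here to be the set of strings the abstract writes as S∗), $\alpha$, $\beta$ and the transition matrices $M_a$ are assumptions of this sketch.

```latex
r(w_1 \cdots w_n) = \alpha^{\top} M_{w_1} \cdots M_{w_n}\, \beta,
\qquad
H_r(u, v) = r(uv) \quad (u, v \in \Sigma^{*})
```

A series $r$ is rational exactly when $H_r$ has finite rank, and that rank equals the number of states of a minimal weighted automaton for $r$. Spectral algorithms typically estimate a finite sub-block of $H_r$ from the sample and recover automaton parameters from a truncated singular value decomposition of that block.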
|
208 |
Generalized Maximally Selected Statistics
Hothorn, Torsten, Zeileis, Achim January 2007
Maximally selected statistics for the estimation of simple cutpoint models are embedded into a generalized conceptual framework based on conditional inference procedures. This powerful framework contains most of the published procedures in this area as special cases, such as maximally selected chi-squared and rank statistics, but also allows for direct construction of new test procedures for less standard test problems. As an application, a novel maximally selected rank statistic is derived from this framework for a censored response partitioned with respect to two ordered categorical covariates and potential interactions. This new test is employed to search for a high-risk group of rectal cancer patients treated with a neo-adjuvant chemoradiotherapy. Moreover, a new efficient algorithm for the evaluation of the asymptotic distribution for a large class of maximally selected statistics is given enabling the fast evaluation of a large number of cutpoints. / Series: Research Report Series / Department of Statistics and Mathematics
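The idea of a maximally selected statistic for a simple cutpoint model can be illustrated with a short sketch. It is not the authors' algorithm: it handles only a numeric response and a single ordered covariate, and it approximates the null distribution by Monte Carlo permutation instead of the asymptotic evaluation described in the abstract. The names `max_selected_stat` and `permutation_pvalue`, and the `min_frac` parameter, are hypothetical.

```python
import random
from math import sqrt


def max_selected_stat(x, y, min_frac=0.1):
    """Maximum over cutpoints of the standardized two-sample statistic.

    For each cutpoint c the linear statistic is the sum of y over x <= c,
    standardized by its mean and variance under permutation of y.
    Returns (max |z|, selected cutpoint), or (0.0, None) if no cut is valid.
    """
    n = len(x)
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)
    if ss == 0:
        return 0.0, None
    order = sorted(range(n), key=lambda i: x[i])
    smallest = max(1, int(min_frac * n))  # minimum allowed group size
    best, best_cut = 0.0, None
    running = 0.0
    for m, i in enumerate(order[:-1], start=1):
        running += y[i]
        # Cut only between distinct x values, with both groups large enough.
        if x[i] == x[order[m]] or m < smallest or n - m < smallest:
            continue
        # Permutation variance of a sum of m values drawn without replacement.
        var = m * (n - m) / (n * (n - 1)) * ss
        z = abs(running - m * ybar) / sqrt(var)
        if z > best:
            best, best_cut = z, x[i]
    return best, best_cut


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Monte Carlo p-value for the maximally selected statistic."""
    rng = random.Random(seed)
    observed = max_selected_stat(x, y)[0]
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if max_selected_stat(x, y_perm)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Taking the maximum over cutpoints is what makes a naive two-sample p-value at the selected cutpoint too optimistic; referring the maximum to its own null distribution, as the permutation step does here, accounts for the search.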
|
209 |
Semantic Assistance for Data Utilization and Curation
Becker, Brian J 06 August 2013
We propose that most data stores for large organizations are ill-designed for the future, due to limited searchability of the databases. The Semantic Web has been an emerging area of study since it was first proposed by Berners-Lee. New vocabularies have emerged, such as the FOAF, Dublin Core, and PROV-O ontologies. These vocabularies, combined, can relate people, places, things, and events. Technologies developed for the Semantic Web, namely the standardized vocabularies for expressing metadata, will make data easier to utilize. We gathered use cases for various data sources, from human resources to big enterprise. Most of our use cases reflect real-world data. We developed a software package for transforming data into these semantic vocabularies, and developed a method of querying via graphical constructs. Development and testing showed the software to be useful. We conclude that data can be preserved or revived through the use of the metadata techniques for the Semantic Web.
|
210 |
On conjugate families and Jeffreys priors for von Mises-Fisher distributions
Hornik, Kurt, Grün, Bettina January 2013
This paper discusses characteristics of standard conjugate priors and their induced posteriors in Bayesian inference for von Mises-Fisher distributions, using either the canonical natural exponential family or the more commonly employed polar coordinate parameterizations. We analyze when standard conjugate priors as well as posteriors are proper, and investigate the Jeffreys prior for the von Mises-Fisher family. Finally, we characterize the proper distributions in the standard conjugate family of the (matrix-valued) von Mises-Fisher distributions on Stiefel manifolds.
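For reference, the von Mises-Fisher density on the unit sphere in $\mathbb{R}^d$ is usually written as below. The notation ($\mu$ for the mean direction, $\kappa$ for the concentration, $\theta$ for the natural parameter, $I_\nu$ for the modified Bessel function of the first kind) is assumed here and may differ from the paper's.

```latex
f(x \mid \mu, \kappa) = C_d(\kappa)\, \exp\!\left(\kappa\, \mu^{\top} x\right),
\qquad
C_d(\kappa) = \frac{\kappa^{d/2 - 1}}{(2\pi)^{d/2}\, I_{d/2 - 1}(\kappa)},
\qquad \lVert x \rVert = \lVert \mu \rVert = 1,\ \kappa > 0
```

Setting $\theta = \kappa \mu$ gives the natural exponential family form $f(x \mid \theta) \propto \exp(\theta^{\top} x)$, while $(\mu, \kappa)$ corresponds to the polar coordinate parameterization mentioned in the abstract.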
|