451 |
Machine Learning Methods for Microarray Data Analysis. Gabbur, Prasad. January 2010.
Microarrays emerged in the 1990s as a consequence of the efforts to speed up the process of drug discovery. They revolutionized molecular biological research by enabling monitoring of thousands of genes together. Typical microarray experiments measure the expression levels of a large number of genes on very few tissue samples. The resulting sparsity of data presents major challenges to statistical methods used to perform any kind of analysis on this data. This research posits that phenotypic classification and prediction serve as good objective functions for both optimization and evaluation of microarray data analysis methods. This is because classification measures what is needed for diagnostics and provides quantitative performance measures such as leave-one-out (LOO) or held-out prediction accuracy and confidence. Under the classification framework, various microarray data normalization procedures are evaluated using a class label hypothesis testing framework and also employing Support Vector Machines (SVM) and linear discriminant based classifiers. A novel normalization technique based on minimizing the squared correlation coefficients between expression levels of gene pairs is proposed and evaluated along with the other methods. Our results suggest that most normalization methods helped classification on the datasets considered, except the rank method, most likely due to its quantization effects.

Another contribution of this research is in developing machine learning methods for incorporating an independent source of information, in the form of gene annotations, to analyze microarray data. Recently, genes of many organisms have been annotated with terms from a limited vocabulary called the Gene Ontology (GO), describing the genes' roles in various biological processes, molecular functions and their locations within the cell. Novel probabilistic generative models are proposed for clustering genes using both their expression levels and GO tags. These models are similar in essence to the ones used for multimodal data, such as images and words, with learning and inference done in a Bayesian framework. The multimodal generative models are used for phenotypic class prediction. More specifically, the problems of phenotype prediction for static gene expression data and state prediction for time-course data are emphasized. Using GO tags for organisms whose genes have been studied more comprehensively leads to an improvement in prediction. Our methods also have the potential to provide a way to assess the quality of available GO tags for the genes of various model organisms.
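The classification framework above can be made concrete with a small sketch. The snippet below is illustrative only: it uses synthetic expression data (the gene and sample counts and the SVM settings are assumptions, not values from the thesis) and reports the leave-one-out accuracy of a linear SVM, the kind of performance measure the abstract refers to.

```python
# Minimal sketch: leave-one-out (LOO) accuracy of a linear SVM on synthetic
# "microarray-like" data (many genes, few samples). All sizes and parameters
# below are illustrative assumptions, not values from the thesis.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
n_samples, n_genes = 30, 2000           # few samples, many genes
X = rng.normal(size=(n_samples, n_genes))
y = rng.integers(0, 2, size=n_samples)  # binary phenotype labels

# Shift a small subset of genes in one class to mimic differential expression.
X[y == 1, :20] += 1.0

clf = SVC(kernel="linear", C=1.0)
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print(f"LOO accuracy: {scores.mean():.2f}")
```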
|
452 |
Evaluating the Use of Ridge Regression and Principal Components in Propensity Score Estimators under Multicollinearity. Gripencrantz, Sarah. January 2014.
Multicollinearity can be present in the propensity score model when estimating average treatment effects (ATEs). In this thesis, logistic ridge regression (LRR) and principal components logistic regression (PCLR) are evaluated as alternatives to ML estimation of the propensity score model. ATE estimators based on weighting (IPW), matching and stratification are assessed in a Monte Carlo simulation study to evaluate LRR and PCLR. Further, an empirical example of using LRR and PCLR on real data under multicollinearity is provided. Results from the simulation study reveal that under multicollinearity and in small samples, the use of LRR reduces bias in the matching estimator compared to ML. In large samples PCLR yields the lowest bias and typically the lowest MSE across all estimators. PCLR matched ML in bias under IPW estimation and in some cases had lower bias. The stratification estimator was heavily biased compared to matching and IPW, but both bias and MSE improved when PCLR was applied, and in some cases under LRR. In the empirical example, the specification with PCLR was usually the most sensitive when a strongly correlated covariate was included in the propensity score model.
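As a hedged illustration of one of the estimators discussed, the sketch below computes an IPW estimate of the ATE with an L2-penalised (ridge) logistic regression as the propensity score model; the data-generating process and parameter choices are assumptions made for the example, not the thesis's simulation design.

```python
# Minimal sketch: IPW estimate of the ATE using a ridge-penalised logistic
# propensity score model. Data-generating values are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + rng.normal(scale=0.3, size=n)   # strongly collinear covariate
X = np.column_stack([x1, x2])

p_true = 1 / (1 + np.exp(-(0.5 * x1 + 0.5 * x2)))
T = rng.binomial(1, p_true)                      # treatment assignment
Y = 2.0 * T + x1 + rng.normal(size=n)            # true ATE = 2.0

# Ridge (L2) logistic regression as the propensity score model; C = 1/penalty.
ps_model = LogisticRegression(penalty="l2", C=1.0).fit(X, T)
e = ps_model.predict_proba(X)[:, 1]

# Horvitz-Thompson style IPW estimator of the ATE.
ate_ipw = np.mean(T * Y / e) - np.mean((1 - T) * Y / (1 - e))
print(f"IPW ATE estimate: {ate_ipw:.2f}")
```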
|
453 |
Distributed and Higher-Order Graphical Models: towards Segmentation, Tracking, Matching and 3D Model Inference. Wang, Chaohui. 29 September 2011.
This thesis is devoted to the development of graph-based methods that address several of the most fundamental computer vision problems, such as segmentation, tracking, shape matching and 3D model inference. The first contribution of this thesis is a unified, single-shot optimization framework for simultaneous segmentation, depth ordering and multi-object tracking from monocular video sequences using a pairwise Markov Random Field (MRF). This is achieved through a novel 2.5D layered model where object-level and pixel-level representations are seamlessly combined through local constraints. Towards introducing high-level knowledge, such as shape priors, we then studied the problem of non-rigid 3D surface matching. The second contribution of this thesis consists of a higher-order graph matching formulation that encodes various measurements of geometric/appearance similarities and intrinsic deformation errors. As the third contribution of this thesis, higher-order interactions were further considered to build pose-invariant statistical shape priors, which were exploited to develop a novel approach to knowledge-based 3D segmentation in medical imaging that is invariant to the global pose and the initialization of the shape model. The last contribution of this thesis aimed to partially address the influence of camera pose in visual perception. To this end, we introduced a unified paradigm for 3D landmark model inference from monocular 2D images that simultaneously determines both the optimal 3D model and the corresponding 2D projections without explicit estimation of the camera viewpoint, and that is also able to deal with misdetections and occlusions.
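As a reminder of the underlying formalism (a generic statement rather than the thesis's specific potentials), a pairwise MRF assigns labels by minimising an energy with unary and pairwise terms, while the higher-order models referred to above add potentials defined over larger cliques:

```latex
% Generic pairwise MRF energy over a labelling x = (x_1, ..., x_n)
E(\mathbf{x}) = \sum_{i \in \mathcal{V}} \theta_i(x_i)
              + \sum_{(i,j) \in \mathcal{E}} \theta_{ij}(x_i, x_j)
% Higher-order extension: potentials over cliques c with more than two nodes
E(\mathbf{x}) = \sum_{c \in \mathcal{C}} \theta_c(\mathbf{x}_c)
```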
|
454 |
Necessity, possibility and the search for counterexamples in human reasoning. Serpell, Sylvia Mary Parnell. January 2011.
This thesis presents a series of experiments where endorsement rates, latencies and measures of cognitive ability were collected to investigate the extent to which people search for counterexamples under necessity instructions, and alternative models under possibility instructions. The research was motivated by a syllogistic reasoning study carried out by Evans, Handley, Harper, and Johnson-Laird (1999), and predictions were derived from mental model theory (Johnson-Laird, 1983; Johnson-Laird & Byrne, 1991). With regard to the endorsement rate data: Experiment 1 failed to find evidence that a search for counterexamples or alternative models took place. In contrast, experiment 2 (transitive inference) found some evidence to support the search for alternative models under possibility instructions, and following an improved training session, experiment 3 produced strong evidence to suggest that people searched for other models, which was mediated by cognitive ability. There was also strong evidence from experiments 4, 5 and 6 (abstract and everyday conditionals) to support the search for counterexamples and alternative models. Furthermore, it was found that people were more likely to find alternative causes when there were many that could be retrieved from their everyday knowledge, and that people carried out a search for counterexamples with many alternative causes under necessity instructions, and across few and many causal groups under possibility instructions. The evidence from the latency data was limited and inconsistent, although people with higher cognitive ability were generally quicker in completing the tasks.
|
455 |
On probabilistic inference approaches to stochastic optimal control. Rawlik, Konrad Cyrus. January 2013.
While stochastic optimal control, together with associated formulations like Reinforcement Learning, provides a formal approach to, amongst others, motor control, it remains computationally challenging for most practical problems. This thesis is concerned with the study of relations between stochastic optimal control and probabilistic inference. Such dualities, exemplified by the classical Kalman duality between the Linear-Quadratic-Gaussian control problem and the filtering problem in linear-Gaussian dynamical systems, make it possible to exploit advances made within the separate fields. In this context, the emphasis in this work lies with the utilisation of approximate inference methods for the control problem. Rather than concentrating on special cases which yield analytical inference problems, we propose a novel interpretation of stochastic optimal control in the general case in terms of minimisation of certain Kullback-Leibler divergences. Although these minimisations remain analytically intractable, we show that natural relaxations of the exact dual lead to new practical approaches. We introduce two general iterative methods: ψ-Learning, which has global convergence guarantees and provides a unifying perspective on several previously proposed algorithms, and Posterior Policy Iteration, which allows direct application of inference methods. From these, practical algorithms are derived for Reinforcement Learning, based on a Monte Carlo approximation to ψ-Learning, and for model-based stochastic optimal control, using a variational approximation of posterior policy iteration. In order to overcome the inherent limitations of parametric variational approximations, we furthermore introduce a new approach for non-parametric approximate stochastic optimal control based on a reproducing kernel Hilbert space embedding of the control problem. Finally, we address the general problem of temporal optimisation, i.e., joint optimisation of controls and temporal aspects of the task, e.g., its duration. Specifically, we introduce a formulation of temporal optimisation based on a generalised form of the finite horizon problem. Importantly, we show that the generalised problem has a dual finite horizon problem of the standard form, thus bringing temporal optimisation within the reach of most commonly used algorithms. Throughout, problems from the area of motor control of robotic systems are used to evaluate the proposed methods and demonstrate their practical utility.
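To make the flavour of the duality concrete, a schematic Kullback-Leibler objective of the kind used in this line of work (a generic form from the KL-control literature, not necessarily the thesis's exact formulation) is:

```latex
% Trajectory distribution q_\pi induced by policy \pi, passive dynamics p_0,
% and trajectory cost C(\tau)
J(\pi) = \mathbb{E}_{q_\pi}\left[ C(\tau) \right]
       + \mathrm{KL}\left( q_\pi(\tau) \,\|\, p_0(\tau) \right),
\qquad
q^{*}(\tau) \propto p_0(\tau)\, e^{-C(\tau)}
```

Minimising an objective of this form is equivalent to computing the Gibbs-type distribution on the right, which is why approximate inference machinery (message passing, variational methods, Monte Carlo) can be brought to bear on the control problem.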
|
456 |
Bayesian multisensory perception. Hospedales, Timothy. January 2008.
A key goal for humans and artificial intelligence systems is to develop an accurate and unified picture of the outside world based on the data from any sense(s) that may be available. The availability of multiple senses presents the perceptual system with new opportunities to fulfil this goal, but exploiting these opportunities first requires the solution of two related tasks. The first is how to make the best use of any redundant information from the sensors to produce the most accurate percept of the state of the world. The second is how to interpret the relationship between observations in each modality; for example, the correspondence problem of whether or not they originate from the same source. This thesis investigates these questions using ideal Bayesian observers as the underlying theoretical approach. In particular, the latter correspondence task is treated as a problem of Bayesian model selection or structure inference in Bayesian networks. This approach provides a unified and principled way of representing and understanding the perceptual problems faced by humans and machines and their commonality. In the domain of machine intelligence, we exploit the developed theory for practical benefit, developing a model to represent audio-visual correlations. Unsupervised learning in this model provides automatic calibration and user appearance learning, without human intervention. Inference in the model involves explicit reasoning about the association between latent sources and observations. This provides audio-visual tracking through occlusion with improved accuracy compared to standard techniques. It also provides detection, verification and speech segmentation, ultimately allowing the machine to understand "who said what, where?" in multi-party conversations. In the domain of human neuroscience, we show how a variety of recent results in multimodal perception can be understood as the consequence of probabilistic reasoning about the causal structure of multimodal observations. We show this for a localisation task in audio-visual psychophysics, which is very similar to the task solved by our machine learning system. We also use the same theory to understand results from experiments in the completely different paradigm of oddity detection using visual and haptic modalities. These results begin to suggest that the human perceptual system performs, or at least approximates, sophisticated probabilistic reasoning about the causal structure of observations under the hood.
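A minimal sketch of the structure-inference idea, under assumed Gaussian generative models (the noise and prior variances below are illustrative choices, not the thesis's fitted values): compare a common-cause model, in which audio and visual cues share one latent source, against an independent-causes model, and report the posterior probability of a common cause.

```python
# Minimal sketch: Bayesian structure inference for audio-visual correspondence.
# Compare a common-cause model (one latent source generates both cues) with an
# independent-causes model. Variances and prior are illustrative assumptions.
import numpy as np
from scipy.stats import multivariate_normal, norm

sigma_a, sigma_v, sigma_p = 1.0, 0.5, 5.0  # audio noise, visual noise, source prior sd
p_common = 0.5                             # prior probability of a common cause

def posterior_common(x_a, x_v):
    # Common cause: x_a = s + n_a, x_v = s + n_v, with s ~ N(0, sigma_p^2),
    # so the two cues are jointly Gaussian with covariance sigma_p^2.
    cov_common = np.array([[sigma_p**2 + sigma_a**2, sigma_p**2],
                           [sigma_p**2, sigma_p**2 + sigma_v**2]])
    like_common = multivariate_normal(mean=[0.0, 0.0], cov=cov_common).pdf([x_a, x_v])
    # Independent causes: each cue has its own latent source.
    like_indep = (norm(0.0, np.sqrt(sigma_p**2 + sigma_a**2)).pdf(x_a) *
                  norm(0.0, np.sqrt(sigma_p**2 + sigma_v**2)).pdf(x_v))
    num = like_common * p_common
    return num / (num + like_indep * (1 - p_common))

print(f"P(common | x_a=1.0, x_v=1.2) = {posterior_common(1.0, 1.2):.2f}")
print(f"P(common | x_a=1.0, x_v=6.0) = {posterior_common(1.0, 6.0):.2f}")
```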
|
457 |
Inference dynamics in transcriptional regulation. Asif, Hafiz Muhammad Shahzad. January 2012.
Computational systems biology is an emerging area of research that focuses on understanding the holistic view of complex biological systems with the help of statistical, mathematical and computational techniques. The regulation of gene expression in gene regulatory networks is a fundamental task performed by all known forms of life. In this subsystem, modelling the behaviour of the components and their interactions can provide useful biological insights. Statistical approaches are proving useful for understanding biological phenomena such as gene regulation, processes that are otherwise not comprehensible due to the sheer volume of information and experimental difficulties. A combination of experimental and computational biology can potentially lead to a system-level understanding of biological systems. This thesis focuses on the problem of inferring the dynamics of gene regulation from the observed output of gene expression. Understanding the dynamics of regulatory proteins in regulating gene expression is a fundamental task in elucidating the hidden regulatory mechanisms. For this task, an initial fixed structure of the network is obtained using experimental biology techniques. Given this network structure, the proposed inference algorithms make use of the expression data to predict the latent dynamics of transcription factor proteins. The thesis starts with an introductory chapter that familiarises the reader with the physical entities in biological systems; we then present the basic framework for inference in transcriptional regulation and highlight the main features of our approach. Chapter 2 introduces the methods and techniques that we use for inference in biological networks; it sets the foundation for the remaining chapters of the thesis. Chapter 3 describes four well-known methods for inference in transcriptional regulation, with the pros and cons of each. The main contributions of the thesis are presented in the following three chapters. Chapter 4 describes a model for inference in transcriptional regulation using state space models. We extend this method to cope with expression data obtained from multiple independent experiments where time dynamics are not present. We believe that the time has arrived to package methods like these into customised software tailored for biologists analysing expression data, so we developed an open-source, platform-independent implementation of this method (TFInfer) that can process expression measurements with biological replicates to predict the activities of proteins and their influence on gene expression in a gene regulatory network. The proteins in the regulatory network are known to interact with one another in regulating the expression of their downstream target genes. To take this into account, we propose a novel method to infer the combinatorial effect of the proteins on gene expression using a variant of the factorial hidden Markov model. We describe the inference mechanism in this combinatorial factorial hidden Markov model (cFHMM) using an efficient variational Bayesian expectation maximisation algorithm. We study the performance of the proposed model using simulated data, identify its limitations under different noise conditions, and then use three real expression datasets to find the extent of combinatorial transcriptional regulation present in them. This constitutes chapter 5 of the thesis.
In chapter 6, we focus on the problem of inferring the groups of proteins that are under the influence of the same external signals and thus have similar effects on their downstream targets. The main objectives of this work are twofold: firstly, identifying clusters of proteins with similar dynamics indicates their role in specific biological mechanisms and is therefore potentially useful for novel biological insights; secondly, clustering naturally leads to better estimation of the transition rates of the regulatory proteins' activity profiles. The method we propose uses Dirichlet process mixtures to cluster the latent activity profiles of regulatory proteins, which are modelled as the latent Markov chains of a factorial hidden Markov model; we refer to this method as DPM-FHMM. We extensively test our methods using simulated and real datasets and show that our model gives better results for inference in transcriptional regulation than a standard factorial hidden Markov model. In the last chapter, we present conclusions about the work presented in this thesis and propose future directions for extending it.
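As a much-simplified, hedged illustration of the latent-chain inference involved (a single binary transcription factor rather than the factorial models used in the thesis, with made-up emission and transition parameters), a standard HMM forward filter over a TF activity profile looks like this:

```python
# Minimal sketch: forward filtering of a single binary TF activity chain with
# Gaussian emissions. Parameters and data are illustrative assumptions; the
# thesis's models are factorial (many chains) and learned from expression data.
import numpy as np

trans = np.array([[0.9, 0.1],          # P(z_t | z_{t-1}): rows = previous state
                  [0.2, 0.8]])
prior = np.array([0.5, 0.5])           # P(z_1)
means, sd = np.array([0.0, 2.0]), 1.0  # emission mean per state (inactive/active)

def gauss(y, mu, sd):
    return np.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def forward_filter(y):
    """Return filtered probabilities P(z_t = k | y_1..t) at each time point."""
    alpha = prior * gauss(y[0], means, sd)
    alpha /= alpha.sum()
    filtered = [alpha]
    for obs in y[1:]:
        alpha = (alpha @ trans) * gauss(obs, means, sd)  # predict, then update
        alpha /= alpha.sum()
        filtered.append(alpha)
    return np.array(filtered)

y = np.array([0.1, -0.3, 1.8, 2.2, 1.9, 0.2])  # toy expression-derived signal
print(forward_filter(y)[:, 1])                 # P(TF active) over time
```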
|
458 |
Estimation et Classification de Signaux Altimétriques / Estimation and Classification of Altimetric Signals. Severini, Jérôme. 07 October 2010.
Measuring the height of the oceans, surface winds (strongly linked to ocean temperatures), and wave height provides a set of parameters needed to study the oceans and to track their evolution: spatial altimetry is one of the disciplines that makes this possible. An altimetric waveform is the result of emitting a high-frequency radar pulse onto a given surface (typically oceanic) and measuring the reflection of that pulse. There currently exists a non-optimal estimation method for altimetric waveforms, as well as classification tools for identifying the different types of surfaces observed. In this study we propose to apply Bayesian estimation to altimetric waveforms, together with new classification approaches. Finally, we propose a dedicated algorithm for studying topography in coastal areas, a topic that is currently little developed in altimetry. / After having monitored ocean levels for thirteen years, the French/American satellite Topex-Poséidon ceased operation in 2005. Topex-Poséidon was replaced by Jason-1 in December 2001, and a new satellite, Jason-2, was expected in 2008. Several estimation methods have been developed for signals resulting from these satellites. In particular, estimators of sea height and wave height have shown very good performance when applied to waveforms backscattered from ocean surfaces. However, it is a more challenging problem to extract relevant information from signals backscattered from non-oceanic surfaces such as inland waters, deserts or ice. This PhD thesis is divided into two parts. The first consists of developing classification methods for altimetric signals in order to recognise the type of surface from which the radar waveform was backscattered. In particular, specific attention is devoted to support vector machines (SVMs) and functional data analysis for this problem. The second part of this thesis consists of developing estimation algorithms appropriate to altimetric signals obtained after reflection on non-oceanic surfaces. Bayesian algorithms are currently under investigation for this estimation problem. This PhD is co-supervised by the French company CLS (Collect Localisation Satellite) (see http://www.cls.fr/ for more details), which in particular provides the real altimetric data necessary for this study.
|
459 |
Novel Sensing and Inference Techniques in Air and Water Environments. Zhou, Xiaochi. January 2015.
Environmental sensing is experiencing tremendous development, due largely to advances in sensor technology and in the wireless/internet technologies that connect sensors and enable data exchange. Environmental monitoring sensor systems range from satellites that continuously monitor the earth's surface to miniature wearable devices that track the local environment and people's activities. However, transforming these data into knowledge of the underlying physical and/or chemical processes remains a big challenge, given the spatial and temporal scales and the heterogeneity of the relevant natural phenomena. This research focuses on the development and application of novel sensing and inference techniques in air and water environments. The overall goal is to infer the state and dynamics of some key environmental variables by building various models: either a sensor system or numerical simulations that capture the physical processes.

This dissertation is divided into five chapters. Chapter 1 introduces the background and motivation of this research. Chapter 2 focuses on the evaluation of different models (physically-based versus empirical) and remote sensing data (multispectral versus hyperspectral) for suspended sediment concentration (SSC) retrieval in shallow water environments. The study site is the Venice lagoon (Italy), where we compare the estimated SSC from various models and datasets against in situ probe measurements. The results showed that the physically-based model provides a more robust estimate of SSC than the empirical models when evaluated using leave-one-out cross-validation. Despite its finer spectral resolution and the choice of optimal combinations of bands, the hyperspectral data is less reliable for SSC retrieval than multispectral data, due to its limited historical record, information redundancy, and cross-band correlation.

Chapter 3 introduces a multipollutant sensor/sampler system developed for use in mobile applications, including aerostats and unmanned aerial vehicles (UAVs). The system is particularly applicable to open-area sources such as forest fires, due to its light weight (3.5 kg), compact size (6.75 L), and internal power supply. The sensor system, termed “Kolibri”, consists of low-cost sensors measuring CO2 and CO, and samplers for particulate matter and volatile organic compounds (VOCs). The Kolibri is controlled by a microcontroller, which can record and transfer data in real time using a radio module. Selection of the sensors was based on laboratory testing for accuracy, response delay and recovery, cross-sensitivity, and precision. The Kolibri was compared against rack-mounted continuous emission monitors (CEMs) and another mobile sampling instrument (the “Flyer”) that had been used in over ten open-area pollutant sampling events. Our results showed that the time series of CO, CO2, and PM2.5 concentrations measured by the Kolibri agreed well with those from the CEMs and the Flyer. The VOC emission factors obtained using the Kolibri are comparable to existing literature values. The Kolibri system can be applied to various challenging open-area sampling situations such as fires, lagoons, flares, and landfills.

Chapter 4 evaluates the trade-off between sensor quality and quantity for fenceline monitoring of fugitive emissions. This research is motivated by the new air quality standard that requires continuous monitoring of hazardous air pollutants (HAPs) along the fenceline of oil and gas refineries.
The recent emergence of low-cost sensors enables the implementation of spatially dense sensor networks that can potentially compensate for the low quality of individual sensors. To account for sensor inaccuracy and the uncertainty in gas concentrations governed by turbulent air flow, a Bayesian approach is applied to probabilistically infer the leak source and strength. Our results show that a dense sensor network can partly compensate for the low sensitivity or high noise of individual sensors. However, the fenceline monitoring approach fails to detect leaks accurately when sensor or wind bias exists, even with a dense sensor network.

Chapter 5 explores the feasibility of applying a mobile sensing approach to estimate fugitive methane emissions in suburban and rural environments. We first compare the mobile approach against a stationary method (OTM33A) proposed by the US EPA using a series of controlled release tests. The analysis shows that the mobile sensing approach can reduce estimation bias and uncertainty compared with the OTM33A method. We then apply this mobile sensing approach to quantify fugitive emissions from several ammonia fertilizer plants in rural areas. Significant methane emission was identified from one plant, while the other two showed relatively low emissions. Sensitivity analysis of several model parameters shows that the error term in the Bayesian inference is vital for determining model uncertainty, while the others are less influential. Overall, this mobile sensing approach shows promising results for future applications quantifying fugitive methane emissions in suburban and rural environments.
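As a hedged sketch of the Bayesian ingredient common to chapters 4 and 5 (a toy one-parameter version with known dispersion factors, a single unknown source strength, and Gaussian sensor noise, all of which are assumptions for illustration rather than the dissertation's models), a grid-based posterior over leak strength can be computed as follows:

```python
# Minimal sketch: grid-based Bayesian inference of a single leak strength Q
# from noisy sensor readings y_i ≈ Q * a_i, where a_i are known dispersion
# factors (e.g. from a plume model). All numbers are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
a = np.array([0.8, 0.5, 0.3, 0.1])       # dispersion factor at each sensor
q_true, noise_sd = 4.0, 0.3
y = q_true * a + rng.normal(scale=noise_sd, size=a.size)  # simulated readings

q_grid = np.linspace(0.0, 10.0, 1001)    # candidate source strengths
log_prior = np.zeros_like(q_grid)        # flat prior over the grid
# Gaussian log-likelihood of all sensor readings for each candidate Q.
log_like = -0.5 * ((y[None, :] - q_grid[:, None] * a[None, :]) / noise_sd) ** 2
log_post = log_prior + log_like.sum(axis=1)
post = np.exp(log_post - log_post.max())
post /= post.sum()

q_mean = np.sum(q_grid * post)
q_sd = np.sqrt(np.sum((q_grid - q_mean) ** 2 * post))
print(f"Posterior mean leak strength: {q_mean:.2f} ± {q_sd:.2f}")
```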
|
460 |
Inferential Set Adoption by Nursing Students. Garza, Christine Seftchick. 08 1900.
This study examines nursing students' adoption of inferential sets in a clinical situation. The investigation determines (1) the particular inferential set(s) nursing students adopt toward a patient in a clinical situation; (2) the particular inferential set(s) adopted by sophomore and senior nursing students in a clinical situation; and (3) whether or not inferential sets adopted by the sophomore and senior nursing students differ. Sophomore and senior nursing students at a woman's university in Texas were asked to complete a research tool designed to determine inferential set adoption.
|