Global ETD Search

41	Bayesian Methods in Nutrition Epidemiology and Regression-based Predictive Models in Healthcare Zhang, Saijuan December 2010 This dissertation has two main parts. In the first part, we propose a bivariate nonlinear measurement error model to understand the distribution of dietary intake and extend it to a multivariate model to capture dietary patterns in nutrition epidemiology. In the second part, we propose regression-based predictive models to accurately predict surgery duration in healthcare. Understanding the distribution of episodically consumed dietary components is an important problem in public health. Short-term measurements of episodically consumed dietary components have zero-inflated, skewed distributions. So-called two-part models have been developed for such data. However, there is much greater public health interest in the usual intake adjusted for caloric intake. Recently a nonlinear mixed effects model has been developed and fit by maximum likelihood using nonlinear mixed effects programs. However, the fitting is slow and unstable. We develop a Monte-Carlo-based fitting method in Chapter II. We demonstrate numerically that our methods lead to increased speed of computation, converge to reasonable solutions, and have the flexibility to be used in either a frequentist or a Bayesian manner. Diet consists of numerous foods, nutrients and other components, each of which has distinctive attributes. Increasingly nutritionists are interested in exploring them collectively to capture overall dietary patterns. We thus extend the bivariate model described in Chapter III to the multivariate level. We use survey-weighted MCMC computations to fit the model, with uncertainty estimation coming from balanced repeated replication.
The methodology is illustrated through an application of estimating the population distribution of the Healthy Eating Index-2005 (HEI-2005), a multi-component dietary quality index, among children aged 2-8 in the United States. The second part of this dissertation is to accurately predict surgery duration. Prior research has identified the current procedural terminology (CPT) codes as the most important factor when predicting surgical case durations, but there has been little reporting of a general predictive methodology that uses them effectively. In Chapter IV, we propose two regression-based predictive models. However, the naively constructed design matrix is singular. We thus devise a systematic procedure to construct a full-rank design matrix. Using surgical data from a central Texas hospital, we compare the proposed models with a few benchmark methods and demonstrate that our models lead to a remarkable reduction in prediction errors. Bayesian methods Measurement error Mixed models Nutrition epidemiology Predictive Model Health Care Regression
42	Segmentation of Human Facial Muscles on CT and MRI Data Using Level Set and Bayesian Methods Kale, Hikmet Emre 01 July 2011 Medical image segmentation is a challenging and widely studied problem. In this thesis, the main goal is to develop automatic segmentation techniques for human mimic muscles and to compare them with ground truth data in order to determine the method that provides the best segmentation results. The segmentation methods are based on Bayesian methods with Markov Random Field (MRF) models and on Level Set (Active Contour) models. The proposed segmentation methods are multi-step processes consisting of a preprocessing step, a main muscle segmentation step and a postprocessing step, and are applied to three types of data: Magnetic Resonance Imaging (MRI) data, Computerized Tomography (CT) data and unified data, in which information coming from both modalities is utilized. The methods are applied to both three-dimensional (3D) and two-dimensional (2D) data. One simulated data set and two patient data sets are used for tests. The patient data results are compared statistically with ground truth data labeled by an expert radiologist. QA Numerical Analysis 297-299.4
43	Bayesian assessment of newborn brain maturity from sleep electroencephalograms Jakaite, Livija January 2012 In this thesis, we develop and test a technology for computer-assisted assessments of newborn brain maturity from sleep electroencephalogram (EEG). Brain maturation of newborns is reflected in rapid development of EEG patterns over a number of weeks after conception. Observing the maturational patterns, experts can assess a newborn’s EEG maturity with an accuracy of ±2 weeks of the newborn’s stated age. A mismatch between the EEG patterns and the newborn’s physiological age alerts clinicians to possible neurological problems. Analysis of newborn EEG requires specialised skills to recognise the maturity-related waveforms and patterns and interpret them in the context of the newborn’s age and behavioural state. It is highly desirable to make the results of maturity assessment as accurate and reliable as possible. However, expert analysis is limited in its capability to estimate the uncertainty in assessments. To enable experts to quantitatively evaluate risks of brain dysmaturity for each case, we employ the Bayesian model averaging methodology. This methodology, in theory, provides the most accurate assessments along with the estimates of uncertainty, enabling experts to take into account the full information about the risk of decision making. Such information is particularly important when assessing EEG signals, which are highly variable and corrupted by artefacts. The use of decision tree models within the Bayesian averaging enables interpreting the results as a set of rules and finding the EEG features which make the most important contribution to assessments. The developed technology was tested on approximately 1,000 EEG recordings of newborns aged 36 to 45 weeks post conception, and the accuracy of assessments was comparable to that achieved by EEG experts.
In addition, it was shown that the Bayesian assessment can be used to quantitatively evaluate the risk of brain dysmaturity for each EEG recording. 618.92
44	Function-on-Function Regression with Public Health Applications Meyer, Mark John 06 June 2014 Medical research currently involves the collection of large and complex data. One such type of data is functional data, where the unit of measurement is a curve measured over a grid. Functional data come in a variety of forms depending on the nature of the research. Novel methodologies are required to accommodate this growing volume of functional data, alongside new testing procedures to provide valid inferences. In this dissertation, I propose three novel methods to accommodate a variety of questions involving functional data of multiple forms: (1) a function-on-function regression for Gaussian data; (2) a historical functional linear model for repeated measures; and (3) a generalized functional outcome regression for ordinal data. For each method, I discuss the existing shortcomings of the literature and demonstrate how my method fills those gaps. The abilities of each method are demonstrated via simulation and data application. Biostatistics Bayesian Inference Bayesian Methods Functional Data Analysis Function-on-Function Regression Wavelet Regression
45	Generalized Probabilistic Topic and Syntax Models for Natural Language Processing Darling, William Michael 14 September 2012 This thesis proposes a generalized probabilistic approach to modelling document collections along the combined axes of both semantics and syntax. Probabilistic topic (or semantic) models view documents as random mixtures of unobserved latent topics which are themselves represented as probabilistic distributions over words. They have grown immensely in popularity since the introduction of the original topic model, Latent Dirichlet Allocation (LDA), in 2004, and have seen successes in computational linguistics, bioinformatics, political science, and many other fields. Furthermore, the modular nature of topic models allows them to be extended and adapted to specific tasks with relative ease. Despite the recorded successes, however, there remains a gap in combining axes of information from different sources and in developing models that are as useful as possible for specific applications, particularly in Natural Language Processing (NLP). The main contributions of this thesis are two-fold. First, we present generalized probabilistic models (both parametric and nonparametric) that are semantically and syntactically coherent and contain many simpler probabilistic models as special cases. Our models are consistent along both axes of word information in that an LDA-like component sorts words that are semantically related into distinct topics and a Hidden Markov Model (HMM)-like component determines the syntactic parts-of-speech of words so that we can group words that are both semantically and syntactically affiliated in an unsupervised manner, leading to such groups as verbs about health care and nouns about sports. Second, we apply our generalized probabilistic models to two NLP tasks.
Specifically, we present new approaches to automatic text summarization and unsupervised part-of-speech (POS) tagging using our models and report results commensurate with the state-of-the-art in these two sub-fields. Our successes demonstrate the general applicability of our modelling techniques to important areas in computational linguistics and NLP. NLP topic models machine learning bayesian methods text summarization part-of-speech tagging syntax modelling
46	Some Recent Advances in Non- and Semiparametric Bayesian Modeling with Copulas, Mixtures, and Latent Variables Murray, Jared January 2013 This thesis develops flexible non- and semiparametric Bayesian models for mixed continuous, ordered and unordered categorical data. These methods have a range of possible applications; the applications considered in this thesis are drawn primarily from the social sciences, where multivariate, heterogeneous datasets with complex dependence and missing observations are the norm. The first contribution is an extension of the Gaussian factor model to Gaussian copula factor models, which accommodate continuous and ordinal data with unspecified marginal distributions. I describe how this model is the most natural extension of the Gaussian factor model, preserving its essential dependence structure and the interpretability of factor loadings and the latent variables. I adopt an approximate likelihood for posterior inference and prove that, if the Gaussian copula model is true, the approximate posterior distribution of the copula correlation matrix asymptotically converges to the correct parameter under nearly any marginal distributions. I demonstrate with simulations that this method is both robust and efficient, and illustrate its use in an application from political science. The second contribution is a novel nonparametric hierarchical mixture model for continuous, ordered and unordered categorical data. The model includes a hierarchical prior used to couple component indices of two separate models, which are also linked by local multivariate regressions. This structure effectively overcomes the limitations of existing mixture models for mixed data, namely the overly strong local independence assumptions. In the proposed model local independence is replaced by local conditional independence, so that the induced model is able to more readily adapt to structure in the data.
I demonstrate the utility of this model as a default engine for multiple imputation of mixed data in a large repeated-sampling study using data from the Survey of Income and Program Participation. I show that it improves substantially on its most popular competitor, multiple imputation by chained equations (MICE), while enjoying certain theoretical properties that MICE lacks. The third contribution is a latent variable model for density regression. Most existing density regression models are quite flexible but somewhat cumbersome to specify and fit, particularly when the regressors are a combination of continuous and categorical variables. The majority of these methods rely on extensions of infinite discrete mixture models to incorporate covariate dependence in mixture weights, atoms or both. I take a fundamentally different approach, introducing a continuous latent variable which depends on covariates through a parametric regression. In turn, the observed response depends on the latent variable through an unknown function. I demonstrate that a spline prior for the unknown function is quite effective relative to Dirichlet process mixture models in density estimation settings (i.e., without covariates) even though these Dirichlet process mixtures have better theoretical properties asymptotically. The spline formulation enjoys a number of computational advantages over more flexible priors on functions. Finally, I demonstrate the utility of this model in regression applications using a dataset on U.S. wages from the Census Bureau, where I estimate the return to schooling as a smooth function of the quantile index. Dissertation Statistics Bayesian methods Bayesian Nonparametrics Copula Modeling Density Regression Mixture Modeling Multiple Imputation
47	Population Differentiation, Historical Demography and Evolutionary Relationships Among Widespread Common Chaffinch Populations (Fringilla coelebs ssp.) Samarasin-Dissanayake, Pasan 28 July 2010 Widespread species that occupy continents and oceanic islands provide an excellent opportunity to study the evolutionary forces responsible for population divergence. Here, I use multilocus coalescent-based population genetic and phylogenetic methods to infer the evolutionary history of the common chaffinch (Fringilla coelebs), a widespread Palearctic passerine species. My results showed strong population structure between Atlantic islands. However, the two European subspecies can be considered one panmictic population based on gene flow estimates. My investigation of the effects of sampling on concatenated and Bayesian estimation of species tree (BEST) methods demonstrated that concatenation is more sensitive to sampling than BEST. Furthermore, concatenation can provide incorrect evolutionary relationships with high confidence when sample size is small. In conclusion, my results suggest European ancestry for the common chaffinch, and the Atlantic islands appear to have been colonized sequentially from north to south via the Azores. coalescent theory population structure colonization demographic history Bayesian methods incomplete lineage sorting species trees 0472
48	Tumor Gene Expression Purification Using Infinite Mixture Topic Models Deshwar, Amit Gulab 11 July 2013 There is significant interest in using gene expression measurements to aid in the personalization of medical treatment. The presence of significant normal tissue contamination in tumor samples makes it difficult to use tumor expression measurements to predict clinical variables and treatment response. I present a probabilistic method, TMMpure, to infer the expression profile of the cancerous tissue using a modified topic model that contains a hierarchical Dirichlet process prior on the cancer profiles. I demonstrate that TMMpure is able to infer the expression profile of cancerous tissue and improves the power of predictive models for clinical variables using expression profiles. Bayesian methods Gene expression purification Bayesian Nonparametric Topic models 0984 0800 0544
