1. Bayesian Analysis of Crime Survey Data with Nonresponse. Liu, Shiao, 26 April 2018.
Bayesian hierarchical models are effective tools for small area estimation because they pool small datasets together. The pooling lets individual areas “borrow strength” from each other, improving the estimation. This work extends Nandram and Choi (2002), hereafter NC, to inference on finite population proportions when the missing pattern for nonresponse in binary survey data is non-identifiable. We review the small-area selection model (SSM) in NC, which is able to incorporate this non-identifiability. The proposed SSM, together with the individual-area selection model (ISM) and the small-area pattern-mixture model (SPM), is evaluated on the real crime data in Stasny (1991), and the methodology is further compared with the ISM and SPM using simulated small-area datasets. Computational issues related to MCMC are also discussed.
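As a hedged illustration of the pooling idea only (not the selection models themselves, and with invented area-level counts), a minimal hierarchical binomial model in PyMC might look like this:

```python
import numpy as np
import pymc as pm

# Hypothetical area-level data: y_i respondents reporting victimization out of n_i.
y = np.array([12, 4, 31, 8, 17])
n = np.array([50, 20, 120, 35, 70])

with pm.Model() as pooled:
    # A shared beta prior lets small areas borrow strength from the others.
    a = pm.Gamma("a", alpha=1.0, beta=1.0)
    b = pm.Gamma("b", alpha=1.0, beta=1.0)
    p = pm.Beta("p", alpha=a, beta=b, shape=len(y))   # area-level proportions
    pm.Binomial("y_obs", n=n, p=p, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=1)
```

The selection models in NC add a response/nonresponse layer on top of this backbone; the sketch shows only how pooling shrinks each small area's estimate toward the population.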
2. Predicting Maximal Oxygen Consumption (VO2max) Levels in Adolescents. Shepherd, Brent A., 09 March 2012.
Maximal oxygen consumption (VO2max) is considered by many to be the best overall measure of an individual's cardiovascular health. Collecting the measurement, however, requires subjecting an individual to prolonged periods of intense exercise until their maximal level is reached: the point at which the body uses no additional oxygen from the air despite increased exercise intensity. Accurate VO2max measurement also requires expensive equipment and causes considerable subject discomfort, so despite its usefulness the test is often avoided. In this research, we propose a set of Bayesian hierarchical models to predict VO2max levels in adolescents, ages 12 through 17, using less extreme measurements. Two models are developed separately: one using submaximal exercise data and one using physical fitness questionnaire data. The best submaximal model includes age, gender, BMI, heart rate, rate of perceived exertion, treadmill miles per hour, and an interaction between age and heart rate. The second model, designed for those with physical limitations, uses age, gender, BMI, two questionnaire scores measuring physical activity and functional ability, and an interaction between the physical activity score and gender. Both models use separate model variances for males and females.
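A minimal PyMC sketch of the submaximal model's structure, on simulated stand-in data; the covariate set is abbreviated and all priors are assumptions, reproducing only the gender-specific variances and the age-by-heart-rate interaction named in the abstract:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
N = 100
age = rng.uniform(12, 17, N)
hr = rng.normal(170, 10, N)                   # submaximal heart rate
bmi = rng.normal(21, 3, N)
male = rng.integers(0, 2, N)
vo2 = 50 - 0.5 * bmi + 0.05 * hr + rng.normal(0, 4, N)   # synthetic response

X = np.column_stack([age, bmi, hr, age * hr])            # age x heart-rate interaction
X = (X - X.mean(axis=0)) / X.std(axis=0)                 # standardize to help sampling

with pm.Model() as vo2_model:
    alpha = pm.Normal("alpha", 45, 20)
    beta = pm.Normal("beta", 0, 10, shape=X.shape[1])
    g = pm.Normal("g", 0, 10)                            # gender effect
    sigma = pm.HalfNormal("sigma", 10, shape=2)          # separate variances by gender
    mu = alpha + g * male + pm.math.dot(X, beta)
    pm.Normal("vo2max", mu, sigma[male], observed=vo2)   # variance indexed by gender
    idata = pm.sample(1000, tune=1000, chains=2)
```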
3. Bayesian modeling of neuropsychological test scores. Du, Mengtian, 06 October 2021.
In this dissertation we propose novel Bayesian methods for analyzing patterns of neuropsychological testing. We first focus on situations in which the goal is to discover risk factors for cognitive decline using longitudinal assessments of test scores. Variable selection in the Bayesian setting remains challenging, particularly for the analysis of longitudinal data. We propose a novel approach to selecting the fixed effects in mixed effects models that combines a backward selection algorithm with a metric based on the posterior credible intervals of the model parameters. The heuristic searches for the parameters most likely to differ from zero based on their posterior credible intervals, without requiring ad hoc approximations of model parameters or informative prior distributions. We show via a simulation study that this approach produces more parsimonious models than popular criteria such as the Bayesian deviance information criterion. We then apply the approach to test the hypothesis that genotypes of the APOE gene have different effects on the rate of cognitive decline of participants in the Long Life Family Study. In the second part of the dissertation we shift focus to the analysis of neuropsychological tests administered using emerging digital technologies. The challenge in analyzing these data is that each participant's test is a data stream recording the time and spatial coordinates of the digitally executed test, and the goal is to extract useful, informative univariate summaries for analysis. Toward this goal, we propose a novel application of Bayesian Hidden Markov Models to digitally recorded Trail Making Tests. The Hidden Markov Model enables automatic segmentation of the digital data stream and lets us extract meaningful metrics that relate Trail Making Test performance to other cognitive and physical function test scores. We show that the extracted metrics provide information beyond the traditionally used scores.
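As a sketch of one step of that credible-interval heuristic (NumPy only; the surrounding loop that refits the mixed model after each drop is omitted, and the ranking rule is a paraphrase of the idea rather than the dissertation's exact metric):

```python
import numpy as np

def weakest_fixed_effect(draws, names, level=0.95):
    """One backward-selection step: among coefficients whose posterior
    credible interval contains zero, return the name of the one where
    zero sits most centrally, i.e. the least credibly nonzero effect.

    draws: (n_samples, n_coefficients) array of posterior draws.
    Returns None when every interval excludes zero (stop the search).
    """
    a = (1 - level) / 2
    lo, hi = np.quantile(draws, [a, 1 - a], axis=0)
    covers_zero = (lo < 0) & (hi > 0)
    if not covers_zero.any():
        return None
    # 0 when zero is exactly mid-interval, 1 when it touches an endpoint.
    centrality = np.abs(lo + hi) / (hi - lo)
    idx = np.where(covers_zero)[0]
    return names[idx[np.argmin(centrality[idx])]]
```

The full algorithm would refit the mixed model without the returned effect and repeat until no remaining interval contains zero.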
4. Biological network models for inferring mechanism of action, characterizing cellular phenotypes, and predicting drug response. Griffin, Paula Jean, 13 February 2016.
A primary challenge in the analysis of high-throughput biological data is the abundance of correlated variables. A small change to a gene's expression or a protein's binding availability can cause significant downstream effects. The existence of such chain reactions presents challenges in numerous areas of analysis. By leveraging knowledge of the network interactions that underlie this type of data, we can often enable better understanding of biological phenomena. This dissertation will examine network-based statistical approaches to the problems of mechanism-of-action inference, characterization of gene expression changes, and prediction of drug response.
First, we develop a method for multi-target perturbation detection in multi-omics biological data. We estimate a joint Gaussian graphical model across multiple data types using penalized regression, and filter for network effects. Next, we apply a set of likelihood ratio tests to identify the most likely site of the original perturbation. We also present a conditional testing procedure to allow for detection of secondary perturbations.
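As a rough, single-data-type stand-in for the penalized graphical-model step (the dissertation's joint multi-omics estimator and the downstream likelihood ratio tests are not reproduced here), scikit-learn's graphical lasso recovers a sparse precision matrix whose nonzero off-diagonal entries define the network:

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # stand-in data matrix (samples x features)

model = GraphicalLassoCV().fit(X)         # cross-validated penalty selection
precision = model.precision_              # sparse inverse covariance estimate
edges = np.argwhere(np.triu(np.abs(precision) > 1e-6, k=1))  # inferred network edges
```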
Second, we address the problem of characterization of cellular phenotypes via Bayesian regression in the Gene Ontology (GO). In our model, we use the structure of the GO to assign changes in gene expression to functional groups, and to model the covariance between these groups. In addition to describing changes in expression, we use these functional activity estimates to predict the expression of unobserved genes. We further determine when such predictions are likely to be inaccurate by identifying GO terms with poor agreement to gene-level estimates. In a case study, we identify GO terms relevant to changes in the growth rate of S. cerevisiae.
Lastly, we consider the prediction of drug sensitivity in cancer cell lines based on pathway-level activity estimates from ASSIGN, a Bayesian factor analysis model. We use penalized regression to predict response to various cancer treatments based on cancer subtype, pathway activity, and 2-way interactions thereof. We also present network representations of these interaction models and examine common patterns in their structure across treatments.
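A hedged sketch of the drug-response step: penalized (elastic-net) regression on stand-in subtype and pathway-activity features plus their 2-way interactions. Feature content and dimensions are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))                     # stand-in: subtype indicators + pathway scores
y = X[:, 0] * X[:, 1] + rng.normal(0, 0.5, 100)   # synthetic drug sensitivity

pipe = make_pipeline(
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    StandardScaler(),
    ElasticNetCV(cv=5),
)
pipe.fit(X, y)
coefs = pipe.named_steps["elasticnetcv"].coef_    # surviving 2-way terms form the network
```

The nonzero interaction coefficients are what a network representation of the fitted model would draw as edges between subtype and pathway nodes.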
5. Separate and Joint Analysis of Longitudinal and Survival Data. Rajeev, Deepthi, 21 March 2007.
Chemotherapy is a method used to treat cancer, but it has a number of side-effects. Research conducted by the Department of Chemical Engineering at BYU involves a new method of administering chemotherapy using ultrasound waves and water-soluble capsules, with the goal of reducing side-effects by localizing delivery of the medication. As part of this research, a two-factor experiment was conducted on rats to test whether the water-soluble capsules and ultrasound waves by themselves have an effect on tumor growth or survival. Our project emphasizes the use of Bayesian hierarchical models and WinBUGS to jointly model the survival data and the longitudinal data (mass). The results of the joint analysis indicate that the ultrasound and water-soluble microcapsules have no negative effect on survival. In fact, there appears to be a positive effect, since the rats in the ultrasound-capsule group had higher survival rates than the rats in the other treatment groups. From these results, the new technology involving ultrasound waves and microcapsules appears to be a promising way to reduce the side-effects of chemotherapy. We strongly advocate formulating a joint model for any longitudinal and survival data. For future work with the ultrasound-microcapsule data, we recommend jointly modeling the mass, tumor volume, and survival data to obtain additional information.
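A minimal sketch of the shared-random-effect idea behind such a joint model, written in PyMC rather than WinBUGS, on simulated data and without censoring, so it is a simplification of the thesis model rather than a reproduction of it:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
n_rats, n_times = 20, 5
t = np.tile(np.arange(n_times), n_rats)
subj = np.repeat(np.arange(n_rats), n_times)
mass = 100.0 + 2.0 * t + rng.normal(0, 3, n_rats * n_times)   # longitudinal outcome
surv = rng.exponential(30.0, n_rats)                          # survival times (uncensored)

with pm.Model() as joint:
    b = pm.Normal("b", 0, 5, shape=n_rats)       # subject effects shared by both submodels
    a0 = pm.Normal("a0", 100, 20)
    a1 = pm.Normal("a1", 0, 5)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Normal("mass_obs", a0 + a1 * t + b[subj], sigma, observed=mass)

    g0 = pm.Normal("g0", -3, 2)
    gamma = pm.Normal("gamma", 0, 1)             # links the mass trajectory to the hazard
    pm.Exponential("surv_obs", pm.math.exp(g0 + gamma * b), observed=surv)

    idata = pm.sample(1000, tune=1000, chains=2)
```

The coefficient gamma is the payoff of joint modelling: it quantifies how a rat's underlying mass trajectory shifts its hazard, which separate analyses cannot estimate.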
6. Meta-Analysis Using Bayesian Hierarchical Models in Organizational Behavior. Ulrich, Michael David, 02 July 2009.
Meta-analysis is a tool used to combine the results from multiple studies into one comprehensive analysis. First developed in the 1970s, meta-analysis is a major statistical method in academic, medical, business, and industrial research. A meta-analysis is traditionally conducted in one of three ways: with fixed effects, with random effects, or with an empirical Bayesian approach. Derivations for conducting meta-analysis on correlations in the industrial psychology and organizational behavior (OB) discipline were reviewed by Hunter and Schmidt (2004), who propose an empirical Bayesian analysis in which the results from previous studies are used as a prior. This approach is still widely used in OB despite recent advances in Bayesian methodology. This paper presents the results of a hierarchical Bayesian model for meta-analysis of correlations and compares them to a traditional Hunter-Schmidt analysis conducted by Judge et al. (2001). In our approach we treat the correlations from previous studies as the likelihood and specify a prior distribution for the correlations.
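A hedged sketch of the hierarchical approach on the Fisher-z scale (study correlations and sample sizes are hypothetical, and the paper's exact likelihood for the correlations may differ):

```python
import numpy as np
import pymc as pm

r = np.array([0.25, 0.31, 0.18, 0.40, 0.12])   # observed study correlations (hypothetical)
n = np.array([120, 85, 200, 60, 150])          # study sample sizes (hypothetical)
z = np.arctanh(r)                              # Fisher z transform
se = 1.0 / np.sqrt(n - 3)                      # approximate standard error of z

with pm.Model() as meta:
    mu = pm.Normal("mu", 0, 1)                 # population mean effect on the z scale
    tau = pm.HalfNormal("tau", 0.5)            # between-study heterogeneity
    theta = pm.Normal("theta", mu, tau, shape=len(r))   # study-level true effects
    pm.Normal("z_obs", theta, se, observed=z)
    pm.Deterministic("rho", pm.math.tanh(mu))  # mean effect back on the correlation scale
    idata = pm.sample(1000, tune=1000, chains=2)
```

Unlike the Hunter-Schmidt empirical Bayes procedure, tau here gets a full posterior, so heterogeneity uncertainty propagates into the pooled correlation.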
7. Bayesian Hierarchical Models for Partially Observed Data. Jaberansari, Negar, January 2016.
No description available.
8. Bayesian Hierarchical Methods and the Use of Ecological Thresholds and Changepoints for Habitat Selection Models. Pooler, Penelope S., 03 January 2006.
Modeling the complex relationships between habitat characteristics and a species' habitat preferences poses many difficult problems for ecological researchers. These problems are complicated further when information is collected over a range of time or space. Additionally, the variety of factors affecting these choices is difficult to understand, and information about them is even more difficult to collect accurately. In light of these concerns, we evaluate the performance of current standard habitat preference models based on Bayesian methods and then present some extensions and supplements to those methods that prove very useful. More specifically, we demonstrate the value of extending the standard Bayesian hierarchical model using finite mixture model methods. We also demonstrate that extending the Bayesian hierarchical changepoint model to estimate multiple changepoints simultaneously can be very informative when applied to data from multiple habitat locations or species. These models allow the researcher to compare sites or species with respect to a very specific ecological question and consequently provide definitive answers that are often unavailable from more commonly used models containing many explanatory factors. Throughout our work we use a complex dataset on horseshoe crab spawning habitat preferences in the Delaware Bay over a five-year period. These data epitomize some of the difficulties inherent in studying habitat preferences: they are collected over time at many sites, have missing observations, and include explanatory variables that, at best, only provide surrogate information for what researchers believe is important in explaining spawning preferences throughout the bay. We also examine a smaller dataset on freshwater mussel habitat selection in relation to bridge construction on the Kennerdell River in western Pennsylvania. Together, these two datasets provided insight for developing and refining the methods we present, and they illustrate the strengths and weaknesses of those methods by assessing their performance in real situations where data are inevitably complex and relationships are difficult to discern.
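For concreteness, here is a single-site, single-changepoint Poisson model in PyMC on simulated counts; the dissertation's extension estimates multiple changepoints simultaneously across sites or species:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
counts = np.concatenate([rng.poisson(4, 25), rng.poisson(10, 25)])  # simulated spawning counts
idx = np.arange(len(counts))

with pm.Model() as cp_model:
    tau = pm.DiscreteUniform("tau", 0, len(counts) - 1)  # changepoint location
    lam1 = pm.Exponential("lam1", 0.2)                   # rate before the changepoint
    lam2 = pm.Exponential("lam2", 0.2)                   # rate after the changepoint
    lam = pm.math.switch(idx < tau, lam1, lam2)
    pm.Poisson("obs", lam, observed=counts)
    idata = pm.sample(2000, tune=1000, chains=2)
```

Comparing the posteriors of tau across sites is the kind of site-to-site contrast the thesis uses to answer specific ecological questions.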
9. Bayesian model estimation and comparison for longitudinal categorical data. Tran, Thu Trung, January 2008.
In this thesis, we address issues of model estimation for longitudinal categorical data and of model selection for these data with missing covariates. Longitudinal survey data capture the responses of each subject repeatedly through time, allowing variation in the measured variable of interest across time within one subject to be separated from variation in that variable among all subjects. Questions concerning persistence, patterns of structure, interaction of events, and stability of multivariate relationships can be answered through longitudinal data analysis. Longitudinal data require special statistical methods because they must account for the correlation between observations recorded on one subject. A further complication in analysing longitudinal data is accounting for the non-response or drop-out process; potentially, the missing values are correlated with variables under study and hence cannot simply be excluded.

Firstly, we investigate a Bayesian hierarchical model for the analysis of categorical longitudinal data from the Longitudinal Survey of Immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One feature of the dataset is that observations for some variables are missing in at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response, with subsequent terms introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and explanatory variables to be imputed at each iteration of the algorithm, given appropriate prior distributions. After accounting for significant covariate effects, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Secondly, we examine the Bayesian model selection techniques of the Bayes factor and the Deviance Information Criterion for our regression models with missing covariates. Computing Bayes factors involves computing the often complex marginal likelihood p(y|model), and various authors have presented methods to estimate this quantity. Here, we take the approach of path sampling via power posteriors (Friel and Pettitt, 2006). The appeal of this method is that for hierarchical regression models with missing covariates, a common occurrence in longitudinal data analysis, it is straightforward to calculate and interpret, since integration over all parameters, including the imputed missing covariates and the random effects, is carried out automatically with minimal added complexities of modelling or computation. We apply this technique to compare models for the employment status of immigrants to Australia.

Finally, we develop a model choice criterion based on the Deviance Information Criterion (DIC), similar to Celeux et al. (2006), but suitable for use with generalized linear models (GLMs) when covariates are missing at random. We define three different DICs: the marginal, where the missing data are averaged out of the likelihood; the complete, where the joint likelihood for response and covariates is considered; and the naive, where the likelihood is found assuming the missing values are parameters. These three versions have different computational complexities.
We investigate through simulation the performance of these three DICs for GLMs with normally, binomially, and multinomially distributed responses and missing covariates having a normal distribution. We find that the marginal DIC and the estimate of the effective number of parameters, pD, have desirable properties, appropriately indicating the true model for the response under differing amounts of missingness in the covariates. We find that the complete DIC is generally inappropriate in this context, as it is extremely sensitive to the degree of missingness of the covariate model. Our new methodology is illustrated by analysing the results of a community survey.
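A sketch of how any one of these DICs is computed from MCMC output, following the standard definition DIC = D_bar + pD; the inputs (log-likelihood values under the chosen likelihood, e.g. the marginal likelihood with missing covariates averaged out) are assumed to come from the fitted model:

```python
import numpy as np

def dic(loglik_draws, loglik_at_posterior_mean):
    """Deviance Information Criterion from MCMC output.

    loglik_draws: log-likelihood of the data at each posterior draw
        (for the marginal DIC, with missing covariates integrated out).
    loglik_at_posterior_mean: log-likelihood evaluated at the posterior
        mean of the parameters.
    Returns (DIC, pD).
    """
    d_bar = -2.0 * np.mean(loglik_draws)       # posterior mean deviance
    d_hat = -2.0 * loglik_at_posterior_mean    # deviance at the posterior mean
    p_d = d_bar - d_hat                        # effective number of parameters
    return d_bar + p_d, p_d
```

The three DIC variants differ only in which likelihood is plugged into these two inputs, which is what drives their different computational costs.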
10. The relationship between traffic congestion and road accidents: an econometric approach using GIS. Wang, Chao, January 2010.
Both traffic congestion and road accidents impose a burden on society, and it is therefore important for transport policy makers to reduce their impact. An ideal scenario would be for traffic congestion and accidents to be reduced simultaneously; however, this may not be possible, since it has been speculated that increased traffic congestion may be beneficial for road safety. This is based on the premise that, at the low average speeds present under congestion, there would be fewer fatal accidents and the accidents that did occur would tend to be less severe. If confirmed, this poses a potential dilemma for transport policy makers: the benefit of reducing congestion might be offset by more severe accidents. It is therefore important to fully understand the relationship between traffic congestion and road accidents while controlling for other factors affecting road traffic accidents.

This relationship appears to be an under-researched area; previous studies often lack a suitable congestion measurement and an appropriate econometric model using real-world data. This thesis aims to explore the relationship between traffic congestion and road accidents using an econometric and GIS approach. The analysis is based on data from the M25 motorway and its surrounding major roads for the period 2003-2007. A series of econometric models has been employed to investigate the effect of traffic congestion on both accident frequency (classical Negative Binomial and Bayesian spatial models) and accident severity (ordered logit and mixed logit models). The Bayesian spatial model and the mixed logit model are the best-performing models for the frequency and severity analyses respectively.

The estimation results suggest that traffic congestion is positively associated with the frequency of fatal and serious injury accidents, negatively (i.e. inversely) associated with the severity of the accidents that occur, and has little impact on the frequency of slight injury accidents. Other contributing factors have also been controlled for and produce results consistent with previous studies. It is concluded that traffic congestion overall has a negative impact on road safety, possibly due in part to higher speed variance among vehicles within and between lanes and erratic driving behaviour in the presence of congestion. The results indicate that mobility and safety can be improved simultaneously, and therefore there is significant additional benefit, in terms of road safety, to reducing traffic congestion. Several policy implications are identified for optimising traffic flow and improving driving behaviour, benefiting both congestion and accident reduction: reinforcing electronic warning signs and Active Traffic Management, enforcing average speed over a stretch of roadway, and introducing minimum speed limits in the UK.

This thesis contributes to knowledge of the relationship between traffic congestion and road accidents, showing that mobility and safety can be improved simultaneously. A new hypothesis is proposed: that traffic congestion on major roads may increase the occurrence of serious injury accidents.
This thesis also proposes a new map-matching technique to assign accidents to the correct road segments, and shows how a two-stage modelling process combining accident frequency and severity models can be used in site ranking to identify hazardous accident hotspots for further safety examination and treatment.
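A hedged sketch of the baseline accident-frequency specification: a classical Negative Binomial GLM fitted with statsmodels on simulated segment-level data (covariate names are invented, and the thesis's preferred Bayesian spatial model additionally includes spatially correlated random effects, which are not shown):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
congestion = rng.uniform(0, 1, n)     # congestion index per road segment (hypothetical)
log_aadt = rng.normal(10, 0.5, n)     # log annual average daily traffic (exposure)
counts = rng.poisson(np.exp(-8 + 0.5 * congestion + 0.7 * log_aadt))  # synthetic counts

X = sm.add_constant(np.column_stack([congestion, log_aadt]))
nb = sm.GLM(counts, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()
print(nb.summary())                   # the sign of the congestion coefficient is the question
```

In a two-stage site-ranking exercise, fitted frequencies from a model like this would be combined with predicted severities to score segments for treatment.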