  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Summarizing FLARE assay images in colon carcinogenesis

Leyk Williams, Malgorzata 12 April 2006
Intestinal tract cancer is one of the more common cancers in the United States. While in some individuals a genetic component causes the cancer, the rate of cancer in the remainder of the population is believed to be affected by diet. Since cancer usually develops slowly, the amount of oxidative damage to DNA can be used as a cancer biomarker. This dissertation examines effective ways of analyzing FLARE assay data, which quantify oxidative damage. The statistical methods will be implemented on data from a FLARE assay experiment, which examines cells from the duodenum and the colon to see if there is a difference in the risk of cancer due to corn or fish oil diets. Treatments of the oxidizing agent dextran sodium sulfate (DSS), DSS with a recovery period, as well as a control will also be used. Previous methods presented in the literature examined the FLARE data by summarizing the DNA damage of each cell with a single number, such as the relative tail moment (RTM). Variable skewness is proposed as an alternative measure, and shown to be as effective as the RTM in detecting diet and treatment differences in the standard analysis. The RTM and skewness data are then analyzed using a hierarchical model, with both the skewness and RTM showing diet/treatment differences. Simulated data for this model are also considered, and show that a Bayes Factor (BF) for higher dimensional models does not follow guidelines presented by Kass and Raftery (1995). It is hypothesized that more information is obtained by describing the DNA damage functions, instead of summarizing them with a single number. From each function, seven points are picked. First, they are modeled independently, and only diet effects are found. However, when the correlation between points at the cell and rat level is modeled, much stronger diet and treatment differences are shown both in the colon and the duodenum than for any of the previous methods. 
These results are also easier to interpret and represent graphically, showing that the latter is an effective method of analyzing the FLARE data.
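The idea of summarizing each cell's damage profile with a single skewness value can be sketched as follows. This is only an illustration: the abstract does not give the dissertation's exact definition of skewness, so the function name `profile_skewness`, the choice of pixel position as the variable, and the toy profiles are all assumptions.

```python
def profile_skewness(intensity):
    """Skewness of pixel position, weighted by intensity, along a 1-D profile.

    Hypothetical helper: treats the intensity values as weights on the
    positions 0, 1, 2, ... and returns the standardized third moment.
    """
    total = sum(intensity)
    if total <= 0:
        raise ValueError("profile must have positive total intensity")
    positions = range(len(intensity))
    mean = sum(x * w for x, w in zip(positions, intensity)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(positions, intensity)) / total
    if var == 0:
        return 0.0  # all mass at one position: no asymmetry to measure
    third = sum(w * (x - mean) ** 3 for x, w in zip(positions, intensity)) / total
    return third / var ** 1.5


# Toy profiles (made up): a symmetric one and one with a long right tail.
symmetric = [1, 4, 9, 4, 1]
tailed = [1, 9, 4, 2, 1, 1]
print(profile_skewness(symmetric), profile_skewness(tailed))
```

Under these assumptions, a profile with a long tail to the right gives a positive value, while a symmetric profile gives a value of zero.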
12

Ideology and interests : a hierarchical Bayesian approach to spatial party preferences

Mohanty, Peter Cushner 04 December 2013
This paper presents a spatial utility model of support for multiple political parties. The model includes a "valence" term, which I reparameterize to include both party competence and the voters' key sociodemographic concerns. The paper shows how this spatial utility model can be interpreted as a hierarchical model using data from the 2009 European Elections Study. I estimate this model via Bayesian Markov Chain Monte Carlo (MCMC) using a block Gibbs sampler and show that the model can capture broad European-wide trends while allowing for significant amounts of heterogeneity. This approach, however, which assumes a normal dependent variable, is only able to partially reproduce the data generating process. I show that the data generating process can be reproduced more accurately with an ordered probit model. Finally, I discuss trade-offs between parsimony and descriptive richness and other practical challenges that may be encountered when building models of party support and make recommendations for capturing the best of both approaches.
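The contrast between a normal dependent variable and an ordered probit can be made concrete with a small sketch of how an ordered probit turns a latent utility into category probabilities. The function names, the cutpoints, and the example utility are assumptions for illustration, not values from the paper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(eta, cutpoints):
    """P(y = k) for k = 0..K given latent mean `eta` and K increasing cutpoints.

    P(y = k) = Phi(c_k - eta) - Phi(c_{k-1} - eta), with c_{-1} = -inf and
    c_K = +inf.
    """
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cutpoints must be in increasing order")
    cdf = [0.0] + [normal_cdf(c - eta) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical three-category support scale with cutpoints at -1 and 1.
print(ordered_probit_probs(0.0, [-1.0, 1.0]))
```

The probabilities always sum to one and respect the ordering of the response categories, which a normal likelihood on the raw scores does not guarantee.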
13

Bayesian learning methods for neural coding

Park, Mi Jung 27 January 2014
A primary goal in systems neuroscience is to understand how neural spike responses encode information about the external world. A popular approach to this problem is to build an explicit probabilistic model that characterizes the encoding relationship in terms of a cascade of stages: (1) linear dimensionality reduction of a high-dimensional stimulus space using a bank of filters or receptive fields (RFs); (2) a nonlinear function from filter outputs to spike rate; and (3) a stochastic spiking process with recurrent feedback. These models have described single- and multi-neuron spike responses in a wide variety of brain areas. This dissertation addresses Bayesian methods to efficiently estimate the linear and non-linear stages of the cascade encoding model. In the first part, the dissertation describes a novel Bayesian receptive field estimator based on a hierarchical prior that flexibly incorporates knowledge about the shapes of neural receptive fields. This estimator achieves error rates several times lower than existing methods, and can be applied to a variety of other neural inference problems such as extracting structure in fMRI data. The dissertation also presents active learning frameworks developed for receptive field estimation incorporating a hierarchical prior in real-time neurophysiology experiments. In addition, the dissertation describes a novel low-rank model for the high dimensional receptive field, combined with a hierarchical prior for more efficient receptive field estimation. In the second part, the dissertation describes new models for neural nonlinearities using Gaussian processes (GPs) and Bayesian active learning algorithms in closed-loop neurophysiology experiments to rapidly estimate neural nonlinearities. The dissertation also presents several stimulus selection criteria and compares their performance in neural nonlinearity estimation. 
Furthermore, the dissertation presents a variation of the new models by including an additional latent Gaussian noise source, to infer the degree of over-dispersion in neural spike responses. The proposed model successfully captures various mean-variance relationships in neural spike responses and achieves higher prediction accuracy than previous models.
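The cascade described in this abstract (linear filter, nonlinearity, stochastic spiking) can be sketched in a few lines. This is a simplified illustration that omits the recurrent feedback term; the exponential nonlinearity, the filter values, and the function names are assumptions, not the dissertation's models.

```python
import math
import random


def lnp_rate(stimulus, rf, nonlinearity=math.exp):
    """Stages 1 and 2: project the stimulus onto the receptive field `rf`,
    then map the filter output to a non-negative spike rate."""
    drive = sum(s * k for s, k in zip(stimulus, rf))
    return nonlinearity(drive)


def sample_poisson(rate, rng):
    """Stage 3 (without feedback): draw a Poisson spike count.

    Uses Knuth's multiplication method, which is adequate for small rates.
    """
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)                 # seeded for repeatability
rf = [0.5, 1.0, 0.5]                   # hypothetical receptive field
stimulus = [0.2, 0.4, -0.1]            # hypothetical stimulus frame
rate = lnp_rate(stimulus, rf)
spikes = [sample_poisson(rate, rng) for _ in range(5)]
print(rate, spikes)
```

Estimating `rf` and the nonlinearity from recorded stimulus and spike pairs is the inverse problem that the Bayesian estimators in the abstract address.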
14

New Results in \(\ell_1\) Penalized Regression

Roualdes, Edward A. 01 January 2015
Here we consider penalized regression methods and extend the results surrounding the \(\ell_1\) norm penalty. We address a more recent development that generalizes previous methods by penalizing a linear transformation of the coefficients of interest instead of penalizing just the coefficients themselves. We introduce an approximate algorithm to fit this generalization and a fully Bayesian hierarchical model that is a direct analogue of the frequentist version. A number of benefits are derived from the Bayesian perspective, most notably choice of the tuning parameter and a natural means to estimate the variation of estimates, a notoriously difficult task for the frequentist formulation. We then introduce Bayesian trend filtering, which exemplifies the benefits of our Bayesian version. Bayesian trend filtering is shown to be an empirically strong technique for fitting univariate, nonparametric regression. Through a simulation study, we show that Bayesian trend filtering reduces prediction error and attains more accurate coverage probabilities than the frequentist method. We then apply Bayesian trend filtering to real data sets, where our method is quite competitive against a number of other popular nonparametric methods.
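Penalizing a linear transformation of the coefficients can be illustrated with the frequentist trend filtering objective, where the transformation is a discrete difference operator. The sketch below only evaluates that objective; it does not fit the model, and the function names, the default difference order, and the toy data are assumptions.

```python
def difference(values, order=1):
    """Apply the first-difference operator `order` times."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def trend_filter_objective(y, beta, lam, order=2):
    """0.5 * ||y - beta||_2^2 + lam * ||D beta||_1, with D the
    `order`-th difference operator (order=2 penalizes changes in slope)."""
    fit = 0.5 * sum((yi - bi) ** 2 for yi, bi in zip(y, beta))
    penalty = sum(abs(d) for d in difference(beta, order))
    return fit + lam * penalty


y = [0.0, 1.0, 2.0]
print(trend_filter_objective(y, [0.0, 1.0, 2.0], lam=5.0))  # straight line: no penalty
print(trend_filter_objective(y, [0.0, 1.0, 0.0], lam=1.0))  # kinked fit is penalized
```

With second differences, any straight-line fit incurs no penalty, so the penalty only charges for changes in slope. In a Bayesian treatment of the kind the abstract describes, `lam` would receive a prior rather than being tuned by hand.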
15

Partition Models for Variable Selection and Interaction Detection

Jiang, Bo 27 September 2013
Variable selection methods play important roles in modeling high-dimensional data and are key to data-driven scientific discoveries. In this thesis, we consider the problem of variable selection with interaction detection. Instead of building a predictive model of the response given combinations of predictors, we start by modeling the conditional distribution of predictors given partitions based on responses. We use this inverse modeling perspective as motivation to propose a stepwise procedure for effectively detecting interaction with few assumptions on parametric form. The proposed procedure is able to detect pairwise interactions among \(p\) predictors with a computational time of \(O(p)\) instead of \(O(p^2)\) under moderate conditions. We establish consistency of the proposed procedure in variable selection under a diverging number of predictors and sample size. We demonstrate its excellent empirical performance in comparison with some existing methods through simulation studies as well as real data examples. Next, we combine the forward and inverse modeling perspectives under the Bayesian framework to detect pleiotropic and epistatic effects in expression quantitative trait loci (eQTL) studies. We augment the Bayesian partition model proposed by Zhang et al. (2010) to capture complex dependence structure among gene expression and genetic markers. In particular, we propose a sequential partition prior to model the asymmetric roles played by the response and the predictors, and we develop an efficient dynamic programming algorithm for sampling latent individual partitions. The augmented partition model significantly improves the power in detecting eQTLs compared to previous methods in both simulations and real data examples pertaining to yeast. Finally, we study the application of Bayesian partition models in the unsupervised learning of transcription factor (TF) families based on protein binding microarray (PBM). 
The problem of TF subclass identification can be viewed as the clustering of TFs with variable selection on their binding DNA sequences. Our model provides simultaneous identification of TF families and their shared sequence preferences, as well as DNA sequences bound preferentially by individual members of TF families. Our analysis may aid in deciphering cis regulatory codes and determinants of protein-DNA binding specificity. / Statistics
16

Capture-recapture Estimation for Conflict Data and Hierarchical Models for Program Impact Evaluation

Mitchell, Shira Arkin 07 June 2014
A relatively recent increase in the popularity of evidence-based activism has created a higher demand for statisticians to work on human rights and economic development projects. The statistical challenges of revealing patterns of violence in armed conflict require efficient use of the data, and careful consideration of the implications of modeling decisions on estimates. Impact evaluation of a complex economic development project requires a careful consideration of causality and transparency to donors and beneficiaries. In this dissertation, I compare marginal and conditional models for capture recapture, and develop new hierarchical models that accommodate challenges in data from the armed conflict in Colombia, and more generally, in many other capture recapture settings. Additionally, I propose a study design for a non-randomized impact evaluation of the Millennium Villages Project (MVP), to be carried out during my postdoctoral fellowship. The design includes small area estimation of baseline variables, propensity score matching, and hierarchical models for causal inference.
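For readers unfamiliar with capture-recapture, the simplest two-list case shows the underlying logic: the overlap between two independent lists of documented cases indicates how many cases neither list recorded. The sketch below uses the classical Lincoln-Petersen and Chapman estimators, not the marginal, conditional, or hierarchical models developed in the dissertation, and the counts are made up.

```python
def lincoln_petersen(n1, n2, overlap):
    """Two-list population estimate: n1 * n2 / overlap."""
    if overlap <= 0:
        raise ValueError("estimator is undefined when the lists do not overlap")
    if overlap > min(n1, n2):
        raise ValueError("overlap cannot exceed either list size")
    return n1 * n2 / overlap


def chapman(n1, n2, overlap):
    """Chapman's bias-corrected variant, defined even when overlap is zero."""
    if overlap < 0 or overlap > min(n1, n2):
        raise ValueError("overlap must lie between 0 and the smaller list size")
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1


# Hypothetical counts: list A records 50 cases, list B records 40, 10 appear on both.
print(lincoln_petersen(50, 40, 10))
print(chapman(50, 40, 10))
```

Both estimators assume the lists are independent and that every case is equally likely to be recorded. Those assumptions are often implausible for conflict data, which is one reason more flexible models are needed.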
17

Bayesian modeling of neuropsychological test scores

Du, Mengtian 06 October 2021
In this dissertation we propose novel Bayesian methods of analysis of patterns of neuropsychological testing. We first focus on situations in which the goal of the analysis is to discover risk factors of cognitive decline using longitudinal assessment of test scores. Variable selection in the Bayesian setting is still challenging, particularly for analysis of longitudinal data. We propose a novel approach to selection of the fixed effects in mixed effect models that combines a backward selection algorithm and a metric based on the posterior credible intervals of the model parameters. The heuristic of this approach is based on searching for those parameters that are most likely to be different from zero based on their posterior credible intervals, without requiring ad hoc approximations of model parameters or informative prior distributions. We show via a simulation study that this approach produces more parsimonious models than other popular criteria such as the Bayesian deviance information criterion. We then apply this approach to test the hypothesis that genotypes of the APOE gene have different effects on the rate of cognitive decline of participants in the Long Life Family Study. In the second part of the dissertation we shift focus to the analysis of neuropsychological tests administered using emerging digital technologies. The challenge of analyzing these data is that for each study participant the test is a data stream that records time and spatial coordinates of the digitally executed test, and the goal is to extract some useful and informative summary univariate variables that can be used for analysis. Toward this goal, we propose a novel application of Bayesian Hidden Markov Models to analyze digitally recorded Trail Making Tests. 
Applying the Hidden Markov Model enables us to perform automatic segmentation of the digital data stream and allows us to extract meaningful metrics that correlate the Trail Making Tests performance to other cognitive and physical function test scores. We show that the extracted metrics provide information in addition to the traditionally used scores. / 2023-10-06T00:00:00Z
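Segmenting a data stream with a Hidden Markov Model amounts to assigning each time point to a hidden state. A minimal sketch using the Viterbi algorithm with fixed, made-up parameters is given below. The dissertation estimates its models in a Bayesian framework, so the two states, the discretized speed observations, and all probabilities here are purely illustrative assumptions.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most probable hidden-state sequence for a discrete HMM (log space)."""
    if not observations:
        return []
    # best[s]: log-probability of the best path that ends in state s
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    back = []  # one dict of back-pointers per transition
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            source = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[source] + log_trans[source][s] + log_emit[s][obs]
            pointers[s] = source
        back.append(pointers)
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


log = math.log
states = ("move", "pause")  # hypothetical pen states
log_start = {"move": log(0.5), "pause": log(0.5)}
log_trans = {"move": {"move": log(0.9), "pause": log(0.1)},
             "pause": {"move": log(0.1), "pause": log(0.9)}}
log_emit = {"move": {"fast": log(0.8), "slow": log(0.2)},
            "pause": {"fast": log(0.2), "slow": log(0.8)}}

stream = ["fast", "fast", "slow", "slow", "slow"]
print(viterbi(stream, states, log_start, log_trans, log_emit))
```

Summary metrics such as the number of pauses or the time spent in each state can then be read off the decoded path.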
18

Biological network models for inferring mechanism of action, characterizing cellular phenotypes, and predicting drug response

Griffin, Paula Jean 13 February 2016
A primary challenge in the analysis of high-throughput biological data is the abundance of correlated variables. A small change to a gene's expression or a protein's binding availability can cause significant downstream effects. The existence of such chain reactions presents challenges in numerous areas of analysis. By leveraging knowledge of the network interactions that underlie this type of data, we can often enable better understanding of biological phenomena. This dissertation will examine network-based statistical approaches to the problems of mechanism-of-action inference, characterization of gene expression changes, and prediction of drug response. First, we develop a method for multi-target perturbation detection in multi-omics biological data. We estimate a joint Gaussian graphical model across multiple data types using penalized regression, and filter for network effects. Next, we apply a set of likelihood ratio tests to identify the most likely site of the original perturbation. We also present a conditional testing procedure to allow for detection of secondary perturbations. Second, we address the problem of characterization of cellular phenotypes via Bayesian regression in the Gene Ontology (GO). In our model, we use the structure of the GO to assign changes in gene expression to functional groups, and to model the covariance between these groups. In addition to describing changes in expression, we use these functional activity estimates to predict the expression of unobserved genes. We further determine when such predictions are likely to be inaccurate by identifying GO terms with poor agreement to gene-level estimates. In a case study, we identify GO terms relevant to changes in the growth rate of S. cerevisiae. Lastly, we consider the prediction of drug sensitivity in cancer cell lines based on pathway-level activity estimates from ASSIGN, a Bayesian factor analysis model. 
We use penalized regression to predict response to various cancer treatments based on cancer subtype, pathway activity, and 2-way interactions thereof. We also present network representations of these interaction models and examine common patterns in their structure across treatments.
19

Extensions to Bayesian generalized linear mixed effects models for household tuberculosis transmission

McIntosh, Avery Isaac 12 May 2017
Understanding tuberculosis transmission is vital for efforts at interrupting the spread of disease. Household contact studies that follow persons sharing a household with a TB case (so-called household contacts) and test for latent TB infection by tuberculin skin test conversion give investigators vital information about risk factors for TB transmission. In these studies, investigators often assume secondary cases are infected by the primary TB case, despite substantial evidence that infection from a source outside the home is often equally likely, especially in high-prevalence settings. Investigators may discard information on contacts who test positive at study initiation due to uncertainty of the infection source, or assume infected contacts were infected from the index case prior to study initiation. With either assumption, information on transmission dynamics is lost or incomplete, and estimates of household risk factors for transmission will be biased. This dissertation describes an approach to modeling TB transmission that accounts for community-acquired transmission in the estimation of transmission risk factors from household contact study data. The proposed model generates population-specific estimates of the probability a contact of an infectious case will be infected from a source outside the home (a vital statistic for planning effective interventions to halt disease spread) in addition to estimates of household transmission predictors. We first describe the model analytically, and then apply it to synthetic datasets under different risk scenarios. We then fit the model to data taken from three household contact studies in different locations: Brazil, India, and Uganda. 
Infection predictors such as contact sleeping proximity to the index case and index case disease severity are underestimated in standard models compared to the proposed method, and non-household TB infection risk increases with age stratum, reflecting longer at-risk duration for community-based exposure for older contacts. This analysis will aid public health planners in understanding how best to interrupt TB spread in disparate populations by characterizing where transmission risk is greatest and which risk factors influence household-acquired transmission. Finally, we present an open-source software package in the R environment titled upmfit for modular implementation of the Bayesian Markov Chain Monte Carlo methods used to estimate the model. / 2018-05-10T00:00:00Z
20

Effects of Intercropping Switchgrass in Loblolly Pine Plantations on Bird Communities

Loman, Zachary G 13 December 2014
Intercropping switchgrass (Panicum virgatum) between tree rows within young pine (Pinus spp.) plantations is a novel method to generate lignocellulosic biofuel feedstocks within intensively managed forests. Intensively managed pine supports diverse avian assemblages potentially affected by establishment and maintenance of a biomass feedstock. I sought to understand how establishing switchgrass on an operational scale affects bird communities within intercropped plantations as compared to typical intensively managed loblolly pine (Pinus taeda) plantations. I conducted breeding bird point counts, nest searching and monitoring, and coarse woody debris (CWD) surveys following establishment of intercropped switchgrass stands (6 replicates), traditionally-managed pine plantations, and switchgrass-only plots (0.1 km2 minimum) in Kemper Co., MS from 2011 to 2013. I found establishment of intercropping did not affect downed CWD, but reduced standing snags and green trees. I detected 59 breeding bird species from 11,195 detections and modeled nest survivorship for 17 species. Neotropical migrants and forest-edge associated species were less abundant in intercropped plots than controls for two years after establishment, and more abundant in year three. Short distance migrants and residents were scarce in intercropped and control plots initially, and did not differ between these treatments in any year. Species associated with pine-grass habitat structure were less abundant initially in intercropped plots, but converged with pine controls in subsequent years. Switchgrass monocultures provided minimal resources for birds. There was no evidence supporting an effect of intercropping on songbird nest survivorship. 
I found evidence for dominance of one species, yellow-breasted chat (Icteria virens), over another, indigo bunting (Passerina cyanea) in competition for nest sites, which illustrates how songbirds competing for nest sites can coexist in sympatry without the dominant species driving subordinate competitors to local extirpation. This dissertation, and related publications, are among the earliest research on wildlife response to intercropping. Forest managers implementing intercropping within pine plantations where vertebrate conservation is a management priority should be aware of potential changes to snag-utilizing species from reductions in green trees and snags. Songbird populations may lag behind traditional management for up to two years following establishment of switchgrass. Intercropping neither positively nor negatively affected songbird nest survival.
