  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Statistical models of TF/DNA interaction

Fouquier d'Herouel, Aymeric January 2008 (has links)
Gene expression is regulated in response to metabolic necessities and environmental changes throughout the life of a cell. A major part of this regulation is governed at the level of transcription, deciding whether messengers to specific genes are produced or not. This decision is triggered by the action of transcription factors, proteins which interact with specific sites on DNA and thus influence the rate of transcription of proximal genes. Mapping the organisation of these transcription factor binding sites sheds light on potential causal relations between genes and is the key to establishing networks of genetic interactions, which determine how the cell adapts to external changes.

In this work I briefly review the basics of genetics and summarise popular approaches to describing transcription factor binding sites, from the most straightforward to a biophysically motivated representation based on estimating the free energies of molecular interactions. Two articles on transcription factors are contained in this thesis, one published (Aurell, Fouquier d'Hérouël, Malmnäs and Vergassola, 2007) and one submitted (Fouquier d'Hérouël, 2008). Both rely strongly on the representation of binding sites by matrices accounting for the affinity of the proteins to specific nucleotides at the different positions of the binding sites. The importance of non-specific binding of transcription factors to DNA is briefly addressed in the text and extensively discussed in the first appended article: in a study of the affinity of yeast transcription factors for their binding sites, we conclude that measured in vivo protein concentrations are only marginally sufficient to guarantee the occupation of functional sites, as opposed to unspecific emplacements on the genomic sequence. Since the inference of binding site motifs is a common task, the most widely used statistical method is reviewed in detail; on this foundation I construct an alternative, biophysically motivated approach, exemplified in the second appended article.
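The matrix representation of binding sites described above can be sketched as a position weight matrix: per-position nucleotide counts are converted to log-odds scores against a background distribution, and candidate sites are scored additively. A minimal illustration in Python, with purely hypothetical count data (not taken from the thesis):

```python
import math

# Hypothetical nucleotide counts at each position of an aligned set
# of binding sites (one dict per position) -- illustrative values only.
counts = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 8, "T": 1},
]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def pwm_from_counts(counts, pseudocount=1.0):
    """Log-odds weight matrix; the pseudocount avoids log(0)."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        pwm.append({b: math.log(((col[b] + pseudocount) / total) / background[b])
                    for b in "ACGT"})
    return pwm

def score(pwm, site):
    """Additive log-odds score of a candidate site."""
    return sum(col[base] for col, base in zip(pwm, site))

pwm = pwm_from_counts(counts)
# The consensus sequence scores higher than a mismatched one.
print(score(pwm, "ACG") > score(pwm, "TTT"))
```

In the biophysical reading discussed in the thesis, such matrix elements are interpreted (up to scale) as position-specific binding free-energy contributions rather than purely statistical log-odds.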
52

Capturing patterns of spatial and temporal autocorrelation in ordered response data : a case study of land use and air quality changes in Austin, Texas

Wang, Xiaokun, 1979- 05 May 2015 (has links)
Many databases involve ordered discrete responses in a temporal and spatial context, including, for example, land development intensity levels, vehicle ownership, and pavement conditions. An appreciation of such behaviors requires rigorous statistical methods, recognizing spatial effects and dynamic processes. This dissertation develops a dynamic spatial ordered probit (DSOP) model in order to capture patterns of spatial and temporal autocorrelation in ordered categorical response data. This model is estimated in a Bayesian framework using Gibbs sampling and data augmentation, in order to generate all autocorrelated latent variables. The specifications, methodologies, and applications undertaken here advance the field of spatial econometrics while enhancing our understanding of land use and air quality changes. The proposed DSOP model incorporates spatial effects in an ordered probit model by allowing for inter-regional spatial interactions and heteroskedasticity, along with random effects across regions (where "region" describes any cluster of observational units). The model assumes an autoregressive, AR(1), process across latent response values, thereby recognizing time-series dynamics in panel data sets. The model code and estimation approach are first tested on simulated data sets, in order to reproduce known parameter values and provide insights into estimation performance. Root mean squared errors (RMSE) are used to evaluate the accuracy of estimates, and the deviance information criterion (DIC) is used for model comparisons. It is found that the DSOP model yields much more accurate estimates than standard, non-spatial techniques. As for model selection, even considering the penalty for using more parameters, the DSOP model is clearly preferred to standard OP, dynamic OP and spatial OP models. The model and methods are then used to analyze both land use and air quality (ozone) dynamics in Austin, Texas.
In analyzing Austin's land use intensity patterns over a 4-point panel, the observational units are 300 m × 300 m grid cells derived from satellite images (at 30 m resolution). The sample contains 2,771 such grid cells, spread among 57 clusters (zip code regions), covering about 10% of the overall study area. In this analysis, temporal and spatial autocorrelation effects are found to be significantly positive. In addition, increases in travel times to the region's central business district (CBD) are estimated to substantially reduce land development intensity. The observational units for the ozone variation analysis are 4 km × 4 km grid cells, and all 132 observations falling in the study area are used. While variations in ozone concentration levels are found to exhibit strong patterns of temporal autocorrelation, they appear strikingly random in a spatial context (after controlling for local land cover, transportation, and temperature conditions). While transportation and land cover conditions appear to influence ozone levels, their effects are neither as instantaneous nor as practically significant as the impact of temperature. The proposed and tested DSOP model is felt to be a significant contribution to the field of spatial econometrics, where binary applications (for discrete response data) have been seen as the cutting edge. The Bayesian framework and Gibbs sampling techniques used here permit such complexity in a world of two-dimensional autocorrelation.
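The data-augmentation step at the heart of an ordered probit Gibbs sampler can be sketched simply: given current cutpoints and linear predictor values, each latent continuous response is drawn from a normal distribution truncated to the interval matching the observed ordinal category. The sketch below is a generic illustration with made-up values, not the dissertation's DSOP code:

```python
import math
import random

def truncated_normal(mu, lo, hi, rng):
    """Draw from N(mu, 1) restricted to (lo, hi) by simple rejection."""
    while True:
        z = rng.gauss(mu, 1.0)
        if lo < z < hi:
            return z

# Three ordinal categories separated by cutpoints (values illustrative).
cutpoints = [-math.inf, 0.0, 1.5, math.inf]
rng = random.Random(42)
observed = [1, 2, 3, 2]       # observed ordinal responses
xb = [0.2, 0.8, 2.0, 1.0]     # current linear-predictor values

latent = [truncated_normal(mu, cutpoints[y - 1], cutpoints[y], rng)
          for y, mu in zip(observed, xb)]
# Each latent draw respects the interval of its observed category.
print(all(cutpoints[y - 1] < z < cutpoints[y]
          for y, z in zip(observed, latent)))
```

In the full DSOP model the latent draws are additionally autocorrelated over space and time, which is what the Gibbs sampler's conditional distributions must account for; rejection sampling would also be replaced by a more efficient truncated-normal sampler in practice.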
53

A review of the Bayes factor with application to mixture models.

Missão, Érica Cristina Marins 11 March 2004 (has links)
Universidade Federal de São Carlos / The Bayes factor is a tool used in model selection. In this work we present a comprehensive review of several aspects of the Bayes factor. We also present the solutions currently available for the problems related to improper prior distributions, such as the intrinsic Bayes factor and the fractional Bayes factor. Simulation results are presented in which the Bayes factor is used for model selection, together with an application to a real data set. In these simulations and in the application we use both the Bayes factor and the fractional Bayes factor.
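The basic Bayes factor idea reviewed above is a ratio of marginal likelihoods. A minimal numerical sketch in Python, using a toy binomial comparison (a point null against a uniform prior) that is purely illustrative and not drawn from the dissertation:

```python
from math import comb

# Toy Bayes factor: H0 fixes the binomial success probability at 0.5,
# H1 places a Uniform(0,1) prior on it. Data: n trials, k successes.
def bayes_factor_01(k, n):
    m0 = comb(n, k) * 0.5 ** n   # marginal likelihood under H0 (point null)
    m1 = 1.0 / (n + 1)           # marginal under the uniform prior on p
    return m0 / m1

# Balanced data favour the point null; lopsided data favour H1.
print(bayes_factor_01(5, 10) > 1, bayes_factor_01(9, 10) < 1)
```

The improper-prior difficulty motivating the intrinsic and fractional variants arises exactly because m1 is not well defined when the prior does not integrate to one; the uniform prior here sidesteps that issue for illustration.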
54

Bayesian inference for diagnostic tests.

Saraiva, Karolina Felcar 05 March 2004 (has links)
Financiadora de Estudos e Projetos / The use of simpler screening tests in place of more elaborate ones to detect disease usually carries a risk of incorrect diagnosis. However, such tests are only useful when the risks of misclassification are known and acceptably low. With the purpose of gathering information on the properties of screening tests, as well as measuring their error rates, a Bayesian procedure was formulated using a simulation technique (Gibbs sampling with latent variables) for estimating the parameters of interest in the absence of a gold standard. Two applications to real data are explored. The first concerns the detection of infection caused by the Strongyloides parasite in 162 refugees from Cambodia who arrived in Montreal, Canada, between July 1982 and February 1983, using data from a serologic test and a stool examination. The second aims to detect obesity rates among male and female school pupils using the information supplied by the Must and Cole anthropometric criteria.
55

A multinomial mixture model applied to the identification of similar proteins.

Coimbra, Ricardo Galante 24 February 2005 (has links)
Proteins are important molecules in cells, taking part in everything from the construction of cellular structures to the transmission of genetic information between generations. A protein can be characterized by its function, and its function is determined by the sequence of amino acids that determines its structure. Determining a protein's function is important, for instance, when researching the cure of diseases or searching for new drugs. In this work we use a Bayesian statistical methodology with a mixture of multinomial distributions and latent variables to identify proteins with similar functions. We use simulations to verify the performance of the statistical model in identifying similar proteins, and finally apply the model to a real data set.
56

A Bayesian Finite Mixture Model for Network-Telecommunication Data

Manikas, Vasileios January 2016 (has links)
A data modeling procedure called the mixture model is introduced, suited to the characteristics of our data. Mixture models have proved flexible and easy to use, as confirmed by the many papers and books published over the last twenty years. The models are estimated by Bayesian inference through an efficient Markov chain Monte Carlo (MCMC) algorithm known as Gibbs sampling. The focus of the paper is on models for network-telecommunication lab data (not time-dependent data) and on the valid predictions we can accomplish. We categorize our variables (based on their distributions) into three cases: a mixture of Normal distributions with known allocation, a mixture of Negative Binomial distributions with known allocation, and a mixture of Normal distributions with unknown allocation.
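For the unknown-allocation case, the characteristic Gibbs step assigns each observation to a mixture component with probability proportional to component weight times likelihood. A generic sketch for a two-component Normal mixture, with illustrative values rather than the paper's data:

```python
import math
import random

def allocate(x, weights, means, sds, rng):
    """Sample a component index with probability proportional to
    weight * Normal density -- the Gibbs allocation step."""
    dens = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / s
            for w, m, s in zip(weights, means, sds)]
    total = sum(dens)
    u, acc = rng.random() * total, 0.0
    for k, d in enumerate(dens):
        acc += d
        if u <= acc:
            return k
    return len(dens) - 1

rng = random.Random(1)
data = [-2.1, -1.9, 2.0, 2.2]
z = [allocate(x, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5], rng) for x in data]
print(z)  # well-separated points are assigned to the nearer component
```

A full sampler would alternate this step with conditional draws of the component means, variances, and weights given the current allocations.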
57

Accuracy and variability of item parameter estimates from marginal maximum a posteriori estimation and Bayesian inference via Gibbs samplers

Wu, Yi-Fang 01 August 2015 (has links)
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and variability of the item parameter estimates from the marginal maximum a posteriori estimation via an expectation-maximization algorithm (MMAP/EM) and the Markov chain Monte Carlo Gibbs sampling (MCMC/GS) approach. In the study, the various factors which have an impact on the accuracy and variability of the item parameter estimates are discussed, and then further evaluated through a large scale simulation. The factors of interest include the composition and length of tests, the distribution of underlying latent traits, the size of samples, and the prior distributions of discrimination, difficulty, and pseudo-guessing parameters. The results of the two estimation methods are compared to determine the lower limit--in terms of test length, sample size, test characteristics, and prior distributions of item parameters--at which the methods can satisfactorily recover item parameters and efficiently function in reality. For practitioners, the results help to define limits on the appropriate use of the BILOG-MG (which implements MMAP/EM) and also, to assist in deciding the utility of OpenBUGS (which carries out MCMC/GS) for item parameter estimation in practice.
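The three-parameter logistic model referenced above has a compact closed form: the probability of a correct response rises from the pseudo-guessing floor c toward 1 as ability exceeds difficulty. A small sketch with illustrative parameter values (the 1.7 scaling constant is the conventional normal-ogive approximation factor):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response: ability theta,
    discrimination a, difficulty b, pseudo-guessing c."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

# The curve is monotone in ability and floored at c.
print(p_3pl(-3, 1.2, 0.0, 0.2) < p_3pl(0, 1.2, 0.0, 0.2) < p_3pl(3, 1.2, 0.0, 0.2))
```

At theta = b the probability is exactly c + (1 - c)/2, which is the midpoint between the guessing floor and 1; the estimation question studied in the dissertation is how well MMAP/EM and MCMC/GS recover a, b, and c from response data.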
58

Skill Evaluation in Women's Volleyball

Florence, Lindsay Walker 11 March 2008 (has links) (PDF)
The Brigham Young University Women's Volleyball Team recorded and rated all skills (pass, set, attack, etc.) and recorded rally outcomes (point for BYU, rally continues, point for opponent) for the entire 2006 home volleyball season. Only sequences of events occurring on BYU's side of the net were considered. Events followed one of these general patterns: serve-outcome, pass-set-attack-outcome, or block-dig-set-attack-outcome. These sequences of events were assumed to be first-order Markov chains in which the quality of each contact depends only on the quality of the immediately preceding contact, not on contacts further removed in the sequence. We represented these sequences in an extensive matrix of transition probabilities, where the elements of the matrix were the probabilities of moving from one state to another. The count matrix consisted of the number of times play moved from one transition state to another during the season. Data in the count matrix were assumed to have a multinomial distribution. A Dirichlet prior was formulated for each row of the count matrix, so posterior estimates of the transition probabilities were available using Gibbs sampling. The different paths in the transition probability matrix were followed through the possible sequences of events at each step of the MCMC process to compute the posterior probability density that a perfect pass results in a point, that a perfect set results in a point, and so forth. These posterior probability densities are used to address questions about skill performance in BYU women's volleyball.
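The Dirichlet-multinomial update described above is conjugate, so each row of the transition matrix has a Dirichlet(prior + counts) posterior, which can be sampled by normalising independent Gamma draws. A generic sketch with made-up counts (not the BYU data):

```python
import random

def sample_row(counts, prior=1.0, rng=random):
    """One posterior draw of a transition-probability row under a
    symmetric Dirichlet(prior) prior: normalise Gamma(count + prior) draws."""
    draws = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(7)
# Hypothetical counts out of one state, e.g. a "perfect pass":
# times play moved to each of three next states during the season.
row = sample_row([40, 10, 5], rng=rng)
print(abs(sum(row) - 1.0) < 1e-9)
```

Repeating such draws for every row and propagating them along the possible event paths yields the posterior distributions of point probabilities that the thesis reports.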
59

An Adaptive Bayesian Approach to Dose-Response Modeling

Leininger, Thomas J. 04 December 2009 (has links) (PDF)
Clinical drug trials are costly and time-consuming. Bayesian methods alleviate the inefficiencies in the testing process while providing user-friendly probabilistic inference and predictions from the sampled posterior distributions, saving resources, time, and money. We propose a dynamic linear model to estimate the mean response at each dose level, borrowing strength across dose levels. Our model permits nonmonotonicity of the dose-response relationship, facilitating precise modeling of a wider array of dose-response relationships (including the possibility of toxicity). In addition, we incorporate an adaptive approach to the design of the clinical trial, which allows for interim decisions and assignment to doses based on dose-response uncertainty and dose efficacy. The interim decisions we consider are stopping early for success and stopping early for futility, allowing for patient and time savings in the drug development process. These methods complement current clinical trial design research.
60

A Bayesian Hierarchical Model for Studying Inter-Occasion and Inter-Subject Variability in Pharmacokinetics

Li, Xia 19 April 2011 (has links)
No description available.
