41 |
STOCHASTIC DYNAMICS OF GENE TRANSCRIPTION
Xie, Yan, 01 January 2011
Gene transcription in individual living cells is inevitably a stochastic and dynamic process. Little is known about how cells and organisms learn to balance the fidelity of transcriptional control against the stochasticity of transcription dynamics. In an effort to elucidate the contribution of environmental signals to this intricate balance, a Three State Model was recently proposed, in which the transcription system is assumed to transition randomly among three different functional states.
In this work, we employ this model to demonstrate how the stochastic dynamics of gene transcription can be characterized by the three transition parameters. We compute the probability distribution of a zero-transcript event and its conjugate, the distribution of the durations of gene on and gene off periods, the transition frequency between system states, and the transcriptional bursting frequency. We also illustrate the mathematical results with experimental data on prokaryotic and eukaryotic transcription.
The analysis reveals that no promoter is certain to be turned on and transcribe within a finite time period, no matter how strong the applied induction signals are or how abundant the available activators. Although stronger extrinsic signals can enhance the promoter activation rate, the promoter imposes an intrinsic ceiling that no signal can cross in finite time. Consequently, among a large population of isogenic cells, only a portion of the cells, not the whole population, can be induced by environmental signals to express a particular gene within a finite time period. We prove that the gene on duration follows an exponential distribution, while the gene off intervals show a local maximum that is best described by assuming two sequential exponential processes. The transition frequencies are determined by a system of stochastic differential equations, or equivalently, an iterative scheme of integral operators. We prove that for each positive integer n there is a unique associated time, called the peak instant, at which the nth transcript synthesis cycle since time zero is most likely to proceed. These moments constitute a time series preserving the natural order of n.
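As a concrete illustration of this structure (not the authors' code), the sketch below simulates a promoter cycling through two sequential off states and one on state, so that off intervals are sums of two exponentials, matching the peaked off-time distribution described above, while on durations are exponential. The state layout and all rate constants are assumptions for exposition, not fitted values.

```python
import random

# Minimal sketch, assuming two sequential "off" states (0 -> 1) and one
# "on" state (2); rate constants are illustrative, not the paper's.
def simulate_three_state(t_max, k01=0.5, k12=1.0, k20=0.2, k_tx=3.0):
    t, state, transcripts = 0.0, 0, 0
    exit_rate = {0: k01, 1: k12, 2: k20}
    while t < t_max:
        total = exit_rate[state] + (k_tx if state == 2 else 0.0)
        t += random.expovariate(total)        # exponential waiting time
        if state == 2 and random.random() < k_tx / total:
            transcripts += 1                  # transcription while "on"
        else:
            state = (state + 1) % 3           # move to the next state
    return transcripts
```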
|
42 |
Blind image deconvolution: nonstationary Bayesian approaches to restoring blurred photos
Bishop, Tom E., January 2009
High quality digital images have become pervasive in modern scientific and everyday life — in areas from photography to astronomy, CCTV, microscopy, and medical imaging. However, there are always limits to the quality of these images due to uncertainty and imprecision in the measurement systems. Modern signal processing methods offer the promise of overcoming some of these problems by postprocessing these blurred and noisy images. In this thesis, novel methods using nonstationary statistical models are developed for the removal of blurs from out-of-focus and other types of degraded photographic images. The work tackles the fundamental problem of blind image deconvolution (BID); its goal is to restore a sharp image from a blurred observation when the blur itself is completely unknown. This is a "doubly ill-posed" problem — extreme lack of information must be countered by strong prior constraints about sensible types of solution. In this work, the hierarchical Bayesian methodology is used as a robust and versatile framework to impart the required prior knowledge.

The thesis is arranged in two parts. In the first part, the BID problem is reviewed, along with techniques and models for its solution. Observation models are developed, with an emphasis on photographic restoration, concluding with a discussion of how these are reduced to the common linear spatially-invariant (LSI) convolutional model. Classical methods for the solution of ill-posed problems are summarised to provide a foundation for the main theoretical ideas that will be used under the Bayesian framework. This is followed by an in-depth review and discussion of the various prior image and blur models appearing in the literature, and then their applications to solving the problem with both Bayesian and non-Bayesian techniques.

The second part covers novel restoration methods, making use of the theory presented in Part I. Firstly, two new nonstationary image models are presented. The first models local variance in the image, and the second extends this with locally adaptive noncausal autoregressive (AR) texture estimation and local mean components. These models allow for recovery of image details including edges and texture, whilst preserving smooth regions. Most existing methods do not model the boundary conditions correctly for deblurring of natural photographs, and a chapter is devoted to exploring Bayesian solutions to this topic. Due to the complexity of the models used and the problem itself, there are many challenges which must be overcome for tractable inference. Using the new models, three different inference strategies are investigated: first, the Bayesian maximum marginalised a posteriori (MMAP) method with deterministic optimisation; then variational Bayesian (VB) distribution approximation; and finally stochastic simulation of the posterior distribution using the Gibbs sampler. Of these, we find the Gibbs sampler to be the most effective way to deal with a variety of different types of unknown blur. Along the way, details are given of the numerical strategies developed to give accurate results and to accelerate performance. Finally, the thesis demonstrates state-of-the-art results in blind restoration of synthetic and real degraded images, such as recovering details in out-of-focus photographs.
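For concreteness, here is a minimal, hypothetical sketch of the LSI convolutional observation model that Part I reduces to: a sharp image is blurred by an assumed Gaussian kernel and corrupted by additive noise. The circular boundary used here is exactly the kind of convenient simplification that the chapter on boundary conditions revisits; this is not code from the thesis.

```python
import numpy as np
from scipy.signal import convolve2d

# Illustrative sketch of the LSI observation model y = k * x + n. The
# Gaussian kernel, boundary handling, and noise level are assumptions.
def lsi_observe(x, kernel_size=9, sigma_blur=2.0, sigma_noise=0.01, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    ax = np.arange(kernel_size) - kernel_size // 2
    k = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma_blur ** 2))
    k /= k.sum()                                  # normalise the blur kernel
    y = convolve2d(x, k, mode='same', boundary='wrap')  # circular boundaries
    return y + sigma_noise * rng.standard_normal(y.shape)
```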
|
43 |
NFL Betting Market: Using Adjusted Statistics to Test Market Efficiency and Build a Betting Model
Donnelly, James P, 01 January 2013
The use of statistical analysis has been prevalent in the sports gambling industry for years. More recently, we have seen the emergence of "adjusted statistics", a more sophisticated way to examine each play and each result (explained further below). And while adjusted statistics have become commonplace for professional and recreational bettors alike, little research has been done to justify their use. In this paper, the effectiveness of these data is tested on the most heavily wagered sport in the world – the National Football League (NFL). The results are studied with two central questions in mind: Does the market account for the information provided by adjusted statistics? And can these data be interpreted to create a profitable betting strategy? First, the Efficient Market Hypothesis is introduced and tested using these new variables. Then, a betting model is built and tested.
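One common way to test the Efficient Market Hypothesis in point-spread markets, sketched below under assumed variable names (the paper's exact specification may differ), is to regress actual game margins on closing spreads and jointly test for a zero intercept and unit slope.

```python
import numpy as np
import statsmodels.api as sm

# Minimal sketch: margin = a + b * closing spread + error.
# Under market efficiency, a = 0 and b = 1 jointly.
def efficiency_test(margins, spreads):
    X = sm.add_constant(np.asarray(spreads, float))
    fit = sm.OLS(np.asarray(margins, float), X).fit()
    return fit.f_test("const = 0, x1 = 1")   # joint test of (a, b) = (0, 1)
```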
|
44 |
A Multi-Method Exploration of the Genetic and Environmental Risks Contributing to Tobacco Use Behaviors in Young Adulthood
Do, Elizabeth K., 01 January 2017
Tobacco use remains the leading preventable cause of morbidity and mortality in both the United States and worldwide. Twin and family studies have demonstrated that both genetic and environmental factors are important contributors to tobacco use behaviors. Understanding how genes, the environment, and their interactions shape these behaviors is critical to the development of public health interventions that focus on the reduction of tobacco-related morbidity and mortality. However, few studies have examined the transition from adolescence to young adulthood – the time when many individuals are experimenting with and developing patterns of tobacco use. This dissertation provides a comprehensive set of studies examining risk for tobacco use behaviors and nicotine dependence using samples of young adults. The first aim is to examine the joint contributions of genetic liability and environmental contexts to tobacco use in adolescence and young adulthood using classical twin study methodologies. The second is to identify genetic variants, quantify genetic risk for tobacco use in young adulthood, and examine their interaction with environmental context across development. Accordingly, the thesis is divided into the following sections: i) reviews of existing literature on genes, environment, and tobacco use; ii) twin studies of genetic and environmental influences on tobacco use behavior phenotypes; iii) prevalence, correlates, and predictors of tobacco use behaviors; iv) genetic analyses of tobacco use behaviors; v) a commentary on the emergence of alternative nicotine delivery systems and their public health impacts; and vi) plans for an internet-based educational intervention seeking to reduce tobacco use (and nicotine dependence) by providing university students with information on genetic and environmental risk factors for nicotine dependence.
|
45 |
Classification by Neural Network and Statistical Models in Tandem: Does Integration Enhance Performance?
Mitchell, David, 12 1900
The purpose of the current research is twofold. The first aim is to present a composite approach to the general classification problem, using outputs from various parametric statistical procedures and neural networks. The second is to compare several parametric and neural network models on a transportation-planning classification problem and five simulated classification problems.
|
46 |
Dynamic Bayesian Approaches to the Statistical Calibration Problem
Rivers, Derick Lorenzo, 01 January 2014
The problem of statistically calibrating a measuring instrument can be framed in both a statistical and an engineering context. In the first, the problem is dealt with by distinguishing between the "classical" approach and the "inverse" regression approach. Both of these models are static and are used to estimate "exact" measurements from measurements that are affected by error. In the engineering context, the variables of interest are indexed by the time at which each measurement is observed. The Bayesian time series method of Dynamic Linear Models (DLM) can be used to monitor the evolution of the measures, thus introducing a dynamic approach to statistical calibration. The research presented here employs Bayesian methodology to perform statistical calibration, using the DLM framework to capture parameters that may change or drift over time. Dynamic approaches to the linear, nonlinear, and multivariate calibration problems are presented in this dissertation. Simulation studies are conducted in which the dynamic models are compared to some well-known "static" calibration approaches in the literature, from both the frequentist and Bayesian perspectives. Applications to microwave radiometry are given.
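As a rough sketch of the dynamic idea, the following assumes the standard DLM forward-filtering recursions with a random-walk evolution for a time-varying calibration intercept and slope; the notation, priors, and variances are illustrative, not the dissertation's.

```python
import numpy as np

# Minimal sketch: forward filter for the DLM
#   y_t = [1, x_t] . theta_t + v_t,   theta_t = theta_{t-1} + w_t,
# tracking a drifting calibration intercept and slope over time.
def dlm_filter(y, x, v_var=1.0, w_var=0.01, m0=(0.0, 1.0), c0=1e3):
    m = np.array(m0, float)            # posterior mean of theta
    C = np.eye(2) * c0                 # posterior covariance of theta
    W = np.eye(2) * w_var              # evolution (drift) covariance
    means = []
    for yt, xt in zip(y, x):
        F = np.array([1.0, xt])
        R = C + W                      # prior covariance after drift step
        f = F @ m                      # one-step-ahead forecast of y_t
        Q = F @ R @ F + v_var          # forecast variance
        A = R @ F / Q                  # adaptive (Kalman-style) gain
        m = m + A * (yt - f)           # update the mean
        C = R - np.outer(A, A) * Q     # update the covariance
        means.append(m.copy())
    return np.array(means)             # filtered intercept/slope per time step
```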
|
47 |
The Interacting Multiple Models Algorithm with State-Dependent Value Assignment
Rastgoufard, Rastin, 18 May 2012
The value of a state is a measure of its worth: for example, waypoints have high value and regions inside obstacles have very small value. We propose two methods of incorporating world information as state-dependent modifications to the interacting multiple models (IMM) algorithm, and then we use a game's player-controlled trajectories as ground truths to compare the normal IMM algorithm to versions with our proposed modifications. The two methods modify the model probabilities in the update step and the transition probability matrix in the mixing step, respectively, based on the assigned values of different target states. The state-dependent value assignment modifications are shown experimentally to perform better than the normal IMM algorithm at both estimating the target's current state and predicting the target's next state.
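A rough sketch of the second modification's flavor, under the simplifying assumption that each IMM model can be scored by the value of the state it predicts; the weighting rule and names are illustrative, not the paper's exact scheme.

```python
import numpy as np

# Hypothetical helper: reweight the IMM transition probability matrix by
# per-model state values before the mixing step, then re-normalise rows.
def value_weighted_tpm(base_tpm, values):
    weighted = np.asarray(base_tpm, float) * np.asarray(values, float)[None, :]
    return weighted / weighted.sum(axis=1, keepdims=True)  # rows sum to 1

# e.g. bias transitions toward a model headed for a high-value waypoint
print(value_weighted_tpm([[0.9, 0.1], [0.1, 0.9]], [1.0, 5.0]))
```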
|
48 |
Automated Sea State Classification from Parameterization of Survey Observations and Wave-Generated Displacement Data
Teichman, Jason A, 13 May 2016
Sea state is a subjective quantity whose accuracy depends on an observer's ability to translate local wind waves into numerical scales. It provides an analytical tool for estimating the impact of the sea on data quality and operational safety, and tasks that depend on the characteristics of local sea surface conditions often require accurate and immediate assessment. Here, an attempt is made to automate sea state classification using eleven years of ship motion and sea state observation data, through parametric modeling of distribution-based confidence and tolerance intervals and through a probabilistic model based on sea state frequencies. Models utilizing distribution intervals are not able to convert ship motion data into the various sea state scales with significant accuracy. Model averages compared against sea state tolerances do provide improved statistical accuracy, but the results are limited to trend assessment. The probabilistic model provides better predictive potential than the interval-based models, but it is spatially and temporally dependent.
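In the spirit of the frequency-based probabilistic model (the thesis's actual model may differ), a minimal sketch: score each sea state by its historical frequency times an assumed Gaussian likelihood of an observed motion statistic, and pick the highest-scoring state. All frequencies, means, and spreads here are illustrative placeholders.

```python
import numpy as np

# Hypothetical frequency-weighted classifier for an observed motion value.
def classify_sea_state(motion, freqs, means, sds):
    freqs, means, sds = (np.asarray(a, float) for a in (freqs, means, sds))
    lik = np.exp(-0.5 * ((motion - means) / sds) ** 2) / sds  # Gaussian term
    post = freqs * lik                    # prior frequency times likelihood
    return int(np.argmax(post))           # index of the most probable state
```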
|
49 |
Modelos para teste de estresse do sistema financeiro no Brasil / Stress test models for the Brazilian financial system
Zaniboni, Natália Cordeiro, 05 June 2018
The literature on stress testing financial systems has grown substantially in recent years, owing to the importance of these exercises, highlighted by the subprime financial crisis, the sequence of bank failures in many countries, and the Brazilian economic crisis. This work proposes a stress test methodology, focused on credit risk, for the Brazilian financial system. After the scope is defined, the second step of a stress test is identifying the vulnerabilities of the financial system, capturing the relations between macroeconomic factors and credit defaults. Most studies use a limited set of macroeconomic factors; this work uses more than 300 variables and a factor analysis to obtain macroeconomic factors, so that an ARIMAX model can draw on a more comprehensive set of variables. In addition, studies commonly employ panel data models, VAR, time series, or linear regression models. However, a change in one variable rarely affects another instantaneously: the effect is distributed over time. This work proposes the polynomial distributed lag model, which accounts for this effect by constraining the lagged parameters to follow a second-degree polynomial. The models were built using March 2007 to August 2016 as the modeling period and September 2016 to August 2017 as an out-of-time validation period; over the validation months, the proposed models yielded a smaller sum of squared errors.

The third step is the calibration of an adverse yet plausible stress scenario, which can be obtained by historical, hypothetical, or probabilistic methods. This work fills a gap in the Brazilian literature, which lacks hypothetical and historical scenarios (covering the 2002 crisis, the 2008 subprime crisis, and the 2015/2017 crisis) for Brazil. Historical shocks were found to generate more severe values than hypothetical ones, and some variables are more sensitive to particular types of economic crisis. When the impact of the resulting scenario on institutions was assessed, the estimated default rate under stress was 6.38%, an increase of 68% over the base scenario. This increase was similar to, though somewhat more severe than, the shocks obtained in the Brazilian literature and in the Financial Stability Report published by the Central Bank of Brazil, which estimates that the banking system is prepared to absorb a macroeconomic stress scenario.
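To make the second-degree polynomial distributed lag concrete, a minimal sketch (a hypothetical helper, not the thesis code): with lag weights constrained to beta_i = a0 + a1*i + a2*i^2, the L+1 lagged copies of a regressor collapse into three transformed columns, and an ordinary regression on those columns recovers a0, a1, and a2.

```python
import numpy as np

# Build the Almon (polynomial distributed lag) design for one regressor.
def almon_design(x, n_lags):
    x = np.asarray(x, float)
    T = len(x) - n_lags
    lags = np.column_stack([x[n_lags - i : n_lags - i + T]
                            for i in range(n_lags + 1)])  # x_t, ..., x_{t-L}
    i = np.arange(n_lags + 1)
    P = np.column_stack([np.ones_like(i), i, i ** 2])     # second-degree basis
    return lags @ P   # regress y[n_lags:] on these 3 columns for a0, a1, a2
```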
|
50 |
Padrões temporais de casos de tuberculose em municípios do estado de São Paulo, no período de 2009 a 2013 / Temporal patterns of tuberculosis cases in municipalities of São Paulo state, 2009 to 2013
Angela Achcar, 23 September 2016
Tuberculosis remains a public health problem despite scientific advances, improved treatment, and vaccine discovery. This study describes the temporal distribution of new cases of the disease reported in municipalities with more than 200,000 inhabitants in the state of São Paulo (regional centers) from January 2009 to December 2013, identifies socioeconomic, demographic, and health covariates that may be associated with the disease, and uses these variables in different statistical models for the monthly forecasting of new cases reported in SINAN (the Brazilian notifiable diseases information system). Several statistical models were used in the data analysis: multiple linear regression models for the count data, pooled across all cities or for individual cities, transformed to a logarithmic scale or expressed as rates of new tuberculosis cases over the 60-month monitoring period; and Poisson regression models for the count data. Inferences of interest were obtained under both classical and Bayesian approaches, and under different structures for the covariates and for random effects capturing the possible dependence of the tuberculosis counts across the 60 months of follow-up. The forecasts obtained from the models are compared with the observed data in the search for models that better fit the data.
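As a minimal sketch of the Poisson count-regression component (covariate names are placeholders, not the study's variables):

```python
import numpy as np
import statsmodels.api as sm

# Fit a Poisson GLM to monthly counts of new cases with a log link;
# coefficients act on the log of the expected count.
def fit_poisson_counts(counts, covariates):
    X = sm.add_constant(np.asarray(covariates, float))
    model = sm.GLM(np.asarray(counts, float), X, family=sm.families.Poisson())
    return model.fit()
```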
|