141

Novel applications for hierarchical natural move Monte Carlo simulations : from proteins to nucleic acids

Demharter, Samuel January 2016 (has links)
Biological molecules often undergo large structural changes to perform their function. Computational methods can provide a fine-grained description of these changes at the atomistic scale. Without sufficient approximations to accelerate the simulations, however, the time-scales on which functional motions occur are out of reach for many traditional methods. Natural Move Monte Carlo belongs to a class of methods introduced to bridge this gap. I present three novel applications of Natural Move Monte Carlo: two on proteins and one on DNA epigenetics. In the second part of this thesis I introduce a new protocol, customised Natural Move Monte Carlo, for testing hypotheses about the functional motions of biological systems. Two case studies are presented to demonstrate the feasibility of customised Natural Move Monte Carlo.
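The abstract describes Natural Move Monte Carlo only at a high level. The sketch below is a generic Metropolis acceptance step for a move-based sampler, offered purely as an illustration of the sampling idea; it is not the thesis's hierarchical implementation, and the energy function and move set are placeholder assumptions.

```python
import math
import random

def metropolis_step(state, energy, propose_move, kT=1.0):
    """One generic Metropolis step: propose a move, accept with the Boltzmann criterion.

    `energy` and `propose_move` are placeholder callables standing in for a real
    force field and a (natural) move set; they are not part of the thesis.
    """
    candidate = propose_move(state)
    delta = energy(candidate) - energy(state)
    if delta <= 0 or random.random() < math.exp(-delta / kT):
        return candidate, True   # move accepted
    return state, False          # move rejected

# Toy usage: a 1-D particle in a harmonic well, random displacement moves.
energy = lambda x: 0.5 * x * x
propose = lambda x: x + random.uniform(-0.5, 0.5)
x = 2.0
for _ in range(1000):
    x, _ = metropolis_step(x, energy, propose)
```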
142

Essays on regime switching and DSGE models with applications to U.S. business cycle

Zhuo, Fan 09 November 2016 (has links)
This dissertation studies various issues related to regime switching and DSGE models, and uses the methods developed to study U.S. business cycles. Chapter one derives the limit distributions of likelihood ratio based tests for Markov regime switching in multiple parameters in the context of a general class of nonlinear models. The analysis simultaneously addresses three difficulties: (1) some nuisance parameters are unidentified under the null hypothesis, (2) the null hypothesis yields a local optimum, and (3) the conditional regime probabilities follow stochastic processes that can only be represented recursively. When applied to U.S. quarterly real GDP growth rates, the tests suggest strong evidence favoring the regime switching specification over a range of sample periods. Chapter two develops a modified likelihood ratio (MLR) test to detect regime switching in state space models. I apply the filtering algorithm introduced in Gordon and Smith (1988) to construct a modified likelihood function under the alternative hypothesis of two regimes, and I extend the analysis in chapter one to establish the asymptotic distribution of the MLR statistic under the null hypothesis of a single regime. I also apply the test to a simple model of the U.S. unemployment rate. This contribution is the first to develop a test based on the likelihood ratio principle to detect regime switching in state space models. The final chapter estimates a search and matching model of the aggregate labor market with sticky prices and staggered wage negotiation. It starts with a partial equilibrium search and matching model and expands it into a general equilibrium model with sticky prices and staggered wages. I study the quantitative implications of the model. The results show that (1) price stickiness and the staggered wage structure are quantitatively important for the search and matching model of the aggregate labor market; (2) relatively high outside option payments to workers, such as unemployment insurance payments, are needed to match the data; and (3) workers have lower bargaining power than firms, which contrasts with the common assumption in the literature that workers and firms share equally the surplus generated from their employment relationship.
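The recursive representation of the conditional regime probabilities mentioned in chapter one is, in the standard Markov-switching setting, computed with a Hamilton-type filter. The sketch below is a textbook two-regime Gaussian filter, included only to illustrate that recursion; the parameter values are arbitrary assumptions and the simulated data are placeholders, not estimates or results from the dissertation.

```python
import numpy as np
from scipy.stats import norm

def hamilton_filter(y, mu, sigma, P):
    """Filtered probabilities Pr(s_t = j | y_1..y_t) for a 2-regime Gaussian
    switching model, plus the log-likelihood (Hamilton-style recursion)."""
    n = len(y)
    filt = np.zeros((n, 2))
    prob = np.array([0.5, 0.5])   # simplification: uniform prior over regimes
    loglik = 0.0
    for t in range(n):
        pred = P.T @ prob                          # predict: Pr(s_t = j | y_1..y_{t-1})
        dens = norm.pdf(y[t], loc=mu, scale=sigma) # regime-specific densities
        joint = pred * dens
        lik_t = joint.sum()
        loglik += np.log(lik_t)
        prob = joint / lik_t                       # update
        filt[t] = prob
    return filt, loglik

# Toy usage with arbitrary (assumed) parameters on simulated growth-rate-like data.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.8, 0.5, 60), rng.normal(-0.5, 0.8, 20)])
P = np.array([[0.95, 0.05], [0.10, 0.90]])   # P[i, j] = Pr(s_t = j | s_{t-1} = i)
filt, ll = hamilton_filter(y, mu=np.array([0.8, -0.5]), sigma=np.array([0.5, 0.8]), P=P)
```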
143

Testes de hipoteses para dados funcionais baseados em distancias : um estudo usando splines / Distances approach to test hypothesis for functional data

Souza, Camila Pedroso Estevam de 25 April 2008 (has links)
Advisor: Ronaldo Dias / Master's dissertation - Universidade Estadual de Campinas, Instituto de Matematica, Estatistica e Computação Cientifica / Abstract: Advances in modern technology have facilitated the collection and analysis of high-dimensional data, or data that are repeated measurements of the same subject. When the data are recorded densely over time, often by machine, they are typically termed functional or curve data, with one observed curve (or function) per subject. The statistical analysis of a sample of n such curves is commonly termed functional data analysis, or FDA. Conceptually, functional data are continuously defined. Of course, in practice they are usually observed at discrete points. There is no general requirement that the data be smooth, but often smoothness or other regularity will be a key aspect of the analysis; in some cases derivatives of the observed functions will be important. In this dissertation, different smoothing techniques are presented and discussed, mainly those based on spline functions... Note: The complete abstract is available in the full electronic version of the thesis / Master's degree / Nonparametric Statistics / Master in Statistics
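As a rough illustration of the spline-based smoothing the abstract refers to (our own sketch, not code from the dissertation), the example below fits a smoothing cubic B-spline to one noisy, discretely observed curve using SciPy. The smoothing parameter and the simulated curve are arbitrary assumptions.

```python
import numpy as np
from scipy.interpolate import splrep, splev

# One noisy functional observation: a smooth curve sampled at discrete points.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * t) + rng.normal(scale=0.15, size=t.size)

# Fit a cubic smoothing B-spline; `s` controls the smoothness/fidelity trade-off
# (its value here is an arbitrary choice, not one taken from the dissertation).
tck = splrep(t, y, k=3, s=1.0)
y_smooth = splev(t, tck)          # reconstructed smooth curve
dy_smooth = splev(t, tck, der=1)  # first derivative, often of interest in FDA
```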
144

Monotonicidade em testes de hipóteses / Monotonicity in hypothesis tests

Gustavo Miranda da Silva 09 March 2010 (has links)
Most of the texts in the literature on hypothesis testing deal with optimality criteria for a single decision problem. However, there are, to a lesser extent, texts on the problem of simultaneous hypothesis testing and the logical consistency of the optimal solutions of such procedures. For instance, the following property should be observed in simultaneous hypothesis testing: if a hypothesis H1 implies a hypothesis H0, then, on the basis of the same sample observation, the rejection of the hypothesis H0 should necessarily imply the rejection of the hypothesis H1. Here, this property is called monotonicity. To investigate this property from a more general point of view, this work first defines the notion of a class of hypothesis tests, which extends the test function to a sigma-field of possible null hypotheses, and then introduces a formal definition of monotonicity. It is also shown, through some simple examples, that for a fixed significance level the class of Generalized Likelihood Ratio tests (GLR) does not satisfy monotonicity, as opposed to tests developed under the Bayesian perspective, such as Bayes tests based on posterior probabilities, Lindley's test and the Full Bayesian Significance Test (FBST). Finally, sufficient conditions for a class of hypothesis tests to have monotonicity are determined, when possible, under a decision-theoretic approach.
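To make the monotonicity property concrete, the sketch below (our own illustration, not code from the dissertation) uses a discretized normal-mean problem and a test that rejects a hypothesis when its posterior probability falls below a threshold. Because a nested hypothesis can never have a larger posterior probability than the hypothesis containing it, rejecting the larger hypothesis forces rejection of the nested one, which is exactly the monotonicity described above. The prior, observation, and threshold are arbitrary assumptions.

```python
import numpy as np
from scipy.stats import norm

# Discretized normal-mean problem: theta on a grid, flat prior, one observation x.
theta = np.linspace(-3, 3, 601)
prior = np.full(theta.size, 1.0 / theta.size)
x = 3.0
posterior = prior * norm.pdf(x, loc=theta, scale=1.0)
posterior /= posterior.sum()

def post_prob(hypothesis):          # hypothesis = boolean mask over the grid
    return posterior[hypothesis].sum()

def reject(hypothesis, c=0.05):     # Bayes-type test: reject if posterior prob < c
    return post_prob(hypothesis) < c

H0 = np.abs(theta) <= 1.0           # H0: |theta| <= 1
H1 = np.abs(theta) <= 0.5           # H1: |theta| <= 0.5, so H1 implies H0
# Since H1 is a subset of H0, post_prob(H1) <= post_prob(H0); hence rejecting H0
# necessarily entails rejecting H1 for this class of tests: monotonicity holds.
print(post_prob(H0), post_prob(H1), reject(H0), reject(H1))
```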
145

The application of frequency domain methods to two statistical problems

Potgieter, Gert Diedericks Johannes 10 September 2012 (has links)
D.Phil. / We propose solutions to two statistical problems using the frequency domain approach to time series analysis. In both problems the data at hand can be described by the well-known signal-plus-noise model. The first problem addressed is the estimation of the underlying variance of a process, for use in a Shewhart or CUSUM control chart, when the mean of the process may be changing. We propose an estimator for the underlying variance based on the periodogram of the observed data. Such estimators have properties which make them superior to some estimators currently used in statistical quality control. We also present a CUSUM chart for monitoring the variance which is based on the periodogram-based estimator of the variance. The second problem, motivated by a specific problem in variable star astronomy, is to test whether or not the mean of a bivariate time series is constant over the span of observations. We consider two periodogram-based tests for constancy of the mean, derive their asymptotic distributions under the null hypothesis and under local alternatives, and show how consistent estimators for the unknown parameters in the proposed model can be found.
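As a hedged sketch of the idea described (not the thesis's exact estimator), the code below estimates the noise variance of a "slowly changing mean plus white noise" series by averaging the periodogram over high frequencies, where the contribution of a smooth mean is small. The frequency cut-off and the simulated series are arbitrary assumptions.

```python
import numpy as np

def periodogram_variance(y, lo_frac=0.5):
    """Estimate the noise variance of a 'smooth mean + white noise' series by
    averaging the periodogram over the upper frequencies, where a slowly
    changing mean contributes little. Normalisation: I_j = |DFT_j|^2 / n, so
    E[I_j] = sigma^2 for white noise. The cut-off `lo_frac` is an assumption."""
    y = np.asarray(y, dtype=float)
    n = y.size
    d = np.fft.rfft(y)
    I = (np.abs(d) ** 2) / n          # periodogram ordinates
    j = np.arange(I.size)
    hi = j >= lo_frac * (I.size - 1)  # keep only the high-frequency half
    hi[0] = False                     # always drop the zero frequency (the mean)
    return I[hi].mean()

# Toy check: drifting mean plus N(0, 0.5^2) noise; the estimate targets 0.25.
rng = np.random.default_rng(2)
t = np.arange(500)
y = 0.01 * t + np.sin(t / 50.0) + rng.normal(scale=0.5, size=t.size)
print(periodogram_variance(y))
```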
146

More accurate two sample comparisons for skewed populations

Tong, Bo January 1900 (has links)
Doctor of Philosophy / Department of Statistics / Haiyan Wang / Various tests have been created to compare the means of two populations in many scenarios and applications. The two-sample t-test, the Wilcoxon rank-sum test and the bootstrap-t test are commonly used methods. However, methods for skewed two-sample data are not well studied. In this dissertation, several existing two-sample tests were evaluated and four new tests were proposed to improve test accuracy under moderate sample size and high population skewness. The proposed work starts with the derivation of a first-order Edgeworth expansion for the test statistic of the two-sample t-test. Using this result, new two-sample tests based on the Cornish-Fisher expansion (TCF tests) were created for both the common variance and unequal variance cases. These tests account for population skewness and give more accurate test results. We also developed three new tests based on three transformations (T_i tests, i = 1, 2, 3) for the pooled case, which can be used to eliminate the skewness of the studentized statistic. In this dissertation, some theoretical properties of the newly proposed tests are presented. In particular, we derived the order of type I error rate accuracy of the pooled two-sample t-test based on the normal approximation (TN test), and of the TCF and T_i tests. We proved that these tests give the same theoretical type I error rate under skewness. In addition, we derived the power function of the TCF and TN tests as a function of the population parameters. We also provided the detailed conditions under which the theoretical power of the two-sample TCF test is higher than that of the two-sample TN test. Results from extensive simulation studies and real data analysis are also presented in this dissertation. The empirical results further confirm our theoretical results. Compared with commonly used two-sample parametric and nonparametric tests, our new tests (TCF and T_i) provide the same empirical type I error rate but higher power.
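The TCF and T_i tests themselves rely on the Edgeworth and Cornish-Fisher machinery derived in the dissertation; as a simpler, hedged point of reference, the sketch below implements the bootstrap-t two-sample test that the abstract lists among commonly used methods (Welch statistic, unequal variances). The lognormal toy data and bootstrap settings are assumptions for illustration only.

```python
import numpy as np

def welch_t(x, y):
    nx, ny = len(x), len(y)
    se = np.sqrt(x.var(ddof=1) / nx + y.var(ddof=1) / ny)
    return (x.mean() - y.mean()) / se

def bootstrap_t_pvalue(x, y, B=4999, seed=0):
    """Two-sided bootstrap-t test of equal means. Each sample is centred at its
    own mean before resampling, so the bootstrap world satisfies the null."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    t_obs = welch_t(x, y)
    xc, yc = x - x.mean(), y - y.mean()
    t_star = np.empty(B)
    for b in range(B):
        xb = rng.choice(xc, size=len(x), replace=True)
        yb = rng.choice(yc, size=len(y), replace=True)
        t_star[b] = welch_t(xb, yb)
    return np.mean(np.abs(t_star) >= abs(t_obs))

# Toy usage on skewed (lognormal) samples with equal population means.
rng = np.random.default_rng(3)
x = rng.lognormal(mean=0.0, sigma=1.0, size=30)
y = rng.lognormal(mean=0.0, sigma=1.0, size=40)
print(bootstrap_t_pvalue(x, y))
```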
147

Maximization of power in randomized clinical trials using the minimization treatment allocation technique

Marange, Chioneso Show January 2010 (has links)
Generally, the primary goal of randomized clinical trials (RCTs) is to compare two or more treatments; clinical investigators therefore require the most appropriate treatment allocation procedure to yield reliable results, regardless of whether the ultimate data suggest a clinically important difference between the treatments being studied. Although recommended by many researchers, the use of minimization has seldom been reported in randomized trials, mainly because of the controversy surrounding its statistical efficiency in detecting treatment effects and its complexity of implementation. Methods: A SAS simulation code was designed for allocating patients into two different treatment groups. Categorical prognostic factors were used together with multi-level response variables, and a demonstration of how simulation of data can help to determine the power of the minimization technique was carried out using ordinal logistic regression models. Results: Several scenarios were simulated in this study. Within the selected scenarios, increasing the sample size significantly increased the power to detect the treatment effect. This was contrary to the case when the probability of allocation was decreased. Power did not change when the probability of allocation under balanced treatment groups was increased. The probability of allocation, P_k, was seen to be the only parameter with a significant effect on treatment balance. Conclusion: Maximum power can be achieved with a sample of size 300, although a smaller sample of size 200 can be adequate to attain at least 80% power. In order to have maximum power, the probability of allocation should be fixed at 0.75, and set to 0.5 if the treatment groups are equally balanced.
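To make the allocation mechanism concrete, here is a generic Pocock-Simon-style minimization rule for two arms with categorical prognostic factors, sketched in Python rather than the SAS code used in the study. The biased-coin probability p (e.g. 0.75) plays the role of the "probability of allocation" discussed above; the factor names in the usage example are hypothetical.

```python
import random

def minimization_assign(new_patient, allocations, patients, factors, p=0.75):
    """Assign a new patient to arm 0 or 1 by minimizing marginal imbalance.

    new_patient : dict mapping factor name -> level, e.g. {"sex": "F", "age": "60+"}
    allocations : list of previous arm assignments (0/1)
    patients    : list of previous patients' factor dicts
    factors     : list of factor names used for balancing
    p           : probability of allocating to the arm that minimizes imbalance
    """
    imbalance = [0, 0]
    for arm in (0, 1):
        for f in factors:
            # Count previous patients sharing this factor level, per arm,
            # as if the new patient were placed in `arm`.
            counts = [0, 0]
            for a, pat in zip(allocations, patients):
                if pat[f] == new_patient[f]:
                    counts[a] += 1
            counts[arm] += 1
            imbalance[arm] += abs(counts[0] - counts[1])
    if imbalance[0] == imbalance[1]:
        return random.randint(0, 1)            # tie: allocate at random
    preferred = 0 if imbalance[0] < imbalance[1] else 1
    return preferred if random.random() < p else 1 - preferred

# Toy usage with hypothetical prognostic factors.
factors = ["sex", "age"]
patients, allocations = [], []
for pat in [{"sex": "F", "age": "60+"}, {"sex": "M", "age": "<60"}, {"sex": "F", "age": "<60"}]:
    arm = minimization_assign(pat, allocations, patients, factors, p=0.75)
    patients.append(pat)
    allocations.append(arm)
```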
148

Hypothesis testing and feature selection in semi-supervised data

Sechidis, Konstantinos January 2015 (has links)
A characteristic of most real-world problems is that collecting unlabelled examples is easier and cheaper than collecting labelled ones. As a result, learning from partially labelled data is a crucial and demanding area of machine learning, and extending techniques from fully to partially supervised scenarios is a challenging problem. Our work focuses on two types of partially labelled data that can occur in binary problems: semi-supervised data, where the labelled set contains both positive and negative examples, and positive-unlabelled data, a more restricted version of partial supervision where the labelled set consists of only positive examples. In both settings, it is very important to explore a large number of features in order to derive useful and interpretable information about our classification task, and to select a subset of features that contains most of the useful information. In this thesis, we address three fundamental and tightly coupled questions concerning feature selection in partially labelled data; all three relate to the highly controversial issue of when additional unlabelled data improves performance in partially labelled learning environments and when it does not. The first question is: what are the properties of statistical hypothesis testing in such data? Second, given the widespread criticism of significance testing, what can we do in terms of effect size estimation, that is, quantifying how strong the dependency is between a feature X and the partially observed label Y? Finally, in the context of feature selection, how well can features be ranked by estimated measures when the population values are unknown? The answers to these questions provide a comprehensive picture of feature selection in partially labelled data. Interesting applications include estimation of mutual information quantities, structure learning in Bayesian networks, and investigation of how human-provided prior knowledge can overcome the restrictions of partial labelling. One direct contribution of our work is to enable valid statistical hypothesis testing and estimation in positive-unlabelled data. Focusing on a generalised likelihood ratio test and on estimating mutual information, we provide five key contributions. (1) We prove that assuming all unlabelled examples are negative cases is sufficient for independence testing, but not for power analysis activities. (2) We suggest a new methodology that compensates for this and enables power analysis, allowing sample size determination for observing an effect with a desired power by incorporating the user's prior knowledge of the prevalence of positive examples. (3) We show a new capability, supervision determination, which can determine a priori the number of labelled examples the user must collect before being able to observe a desired statistical effect. (4) We derive an estimator of the mutual information in positive-unlabelled data, and its asymptotic distribution. (5) Finally, we show how to rank features with and without prior knowledge. We also derive extensions of these results to semi-supervised data. In a further extension, we investigate how our results can be used for Markov blanket discovery in partially labelled data. While there are many different algorithms for deriving the Markov blanket of fully supervised nodes, the partially labelled problem is far more challenging, and there is a lack of principled approaches in the literature.
Our work constitutes a generalization of the conditional tests of independence to partially labelled binary target variables, handling the two main partially labelled scenarios: positive-unlabelled and semi-supervised. The result is a significantly deeper understanding of how to control false negative errors in Markov blanket discovery procedures and of how unlabelled data can help. Finally, we show how our results can be used for information-theoretic feature selection in partially labelled data. Our work naturally extends feature selection criteria suggested for fully supervised data to partially labelled scenarios. These criteria can capture both the relevancy and redundancy of the features and can be used for semi-supervised and positive-unlabelled data.
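As a minimal, hedged illustration of contribution (1) above — testing independence in positive-unlabelled data by treating all unlabelled examples as if they were negative — the sketch below computes the plug-in mutual information between a binary feature and the "labelled-positive versus unlabelled" indicator, and applies the standard G-test (G = 2·N·Î in nats, chi-squared with one degree of freedom for two binary variables). This is a generic construction, not the thesis's estimator, and the simulated data are placeholders.

```python
import numpy as np
from scipy.stats import chi2

def g_test_mi(x, s):
    """Plug-in mutual information (nats) between binary vectors x and s, and the
    G-test p-value for independence (df = 1 for two binary variables)."""
    x, s = np.asarray(x), np.asarray(s)
    n = x.size
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (s == b))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(x == a) * np.mean(s == b)))
    g = 2.0 * n * mi
    return mi, chi2.sf(g, df=1)

# Positive-unlabelled toy data: s = 1 only for a labelled subset of true positives.
rng = np.random.default_rng(4)
y = rng.binomial(1, 0.3, size=2000)                        # true (hidden) labels
x = rng.binomial(1, 0.2 + 0.5 * y)                         # feature related to y
s = np.where(y == 1, rng.binomial(1, 0.4, size=2000), 0)   # only positives get labelled
mi_hat, pval = g_test_mi(x, s)   # treats unlabelled (s = 0) as negative
```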
149

Statistické zpracování dat z reálného výrobního procesu / Statistical analysis of real manufacturing process data

Kučerová, Barbora January 2012 (has links)
The topic of this master's thesis is statistical process control of a manufacturing process. The aim was to analyse data from a real technological process of a rotary injection moulding press. The analysis was carried out using statistical hypothesis testing, analysis of variance, a general linear model, and process capability analysis. The data analysis was performed in the statistical software Minitab 16.
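As a hedged illustration of the process-capability part of such an analysis (the thesis itself used Minitab 16, not Python), the sketch below computes the standard Cp and Cpk indices from sample measurements; the specification limits and data are made-up values.

```python
import numpy as np

def capability_indices(x, lsl, usl):
    """Standard short-term capability indices:
    Cp  = (USL - LSL) / (6 * sigma)
    Cpk = min(USL - mean, mean - LSL) / (3 * sigma)"""
    x = np.asarray(x, float)
    mu, sigma = x.mean(), x.std(ddof=1)
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk

# Toy usage with assumed specification limits for some moulded-part dimension.
rng = np.random.default_rng(5)
measurements = rng.normal(10.02, 0.03, size=200)
print(capability_indices(measurements, lsl=9.90, usl=10.10))
```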
150

Employing mHealth Applications for the Self-Assessment of Selected Eye Functions and Prediction of Chronic Major Eye Diseases among the Aging Population

Abdualiyeva, Gulnara 24 May 2019 (has links)
In the epoch of advanced mHealth (mobile health) use in ophthalmology, there is a scientific call for regulating the validity and reliability of eye-related apps. For a positive health outcome that works towards enhancing mobile-application-guided diagnosis in joint decision-making between eye specialists and individuals, the aging population should be provided with a reliable and valid tool for assessing their eye status outside the physician's office. This interdisciplinary study aims to determine, through hypothesis testing, the validity and reliability of a limited set of five mHealth apps (mHAs) and, through binary logistic regression, the potential of the investigated apps to rule out four major eye diseases in this demographic population. The study showed that 189 aging adults (45-86 years old) who completed the mHAs' tests were able to produce reliable results for selected eye function tests with four out of five mHAs, measuring visual acuity, contrast sensitivity, red desaturation, visual field and the Amsler grid, in comparison with the "gold standard" of a comprehensive eye examination. A subset of participants was also surveyed to assess the Quality of Experience with the mobile apps. Understanding the current reliability of existing eye-related mHAs will support the creation of an ideal mobile-application self-assessment protocol predicting the timely need for clinical assessment and treatment of age-related macular degeneration, diabetic retinopathy, glaucoma and cataract. Detecting the level of eye-function impairment with mHAs is cost-effective and can contribute to research methodology in eye-disease prediction by expanding the set of clear criteria created specifically for mobile applications, returning significant value in preventive ophthalmology.
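As a hedged sketch of the binary logistic regression step mentioned above (not the study's actual model or data), the example below fits a logistic classifier to entirely simulated, hypothetical app-based test scores and a simulated examination outcome, and reports an AUC; every feature name, coefficient and sample is a placeholder assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical per-participant features: app-based test scores and age.
rng = np.random.default_rng(6)
n = 189
X = np.column_stack([
    rng.normal(0.3, 0.3, n),   # logMAR visual acuity from the app (assumed)
    rng.normal(1.6, 0.2, n),   # contrast sensitivity, log units (assumed)
    rng.normal(-2.0, 2.0, n),  # visual field index deviation (assumed)
    rng.integers(0, 2, n),     # Amsler grid abnormality, 0/1 (assumed)
    rng.integers(45, 87, n),   # age in years
])
# Hypothetical binary outcome: disease present on comprehensive eye examination.
logit = -4 + 3 * X[:, 0] - 1.5 * (X[:, 1] - 1.6) + 0.05 * (X[:, 4] - 45)
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```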
