Global ETD Search

1	On two-sample data analysis by exponential model
Choi, Sujung, 01 November 2005
We discuss two-sample problems and the implementation of a new two-sample data analysis procedure. The proposed procedure is based on the concepts of mid-distribution, design of score functions, components, comparison distribution, comparison density and exponential model. Assume that we have a random sample $X_1, \ldots, X_m$ from a continuous distribution $F(y) = P(X_i \le y)$, $i = 1, \ldots, m$, and a random sample $Y_1, \ldots, Y_n$ from a continuous distribution $G(y) = P(Y_i \le y)$, $i = 1, \ldots, n$. Also assume that the two samples are independent. The two-sample problem tests the homogeneity of two samples and can be stated formally as $H_0 : F = G$. To solve the two-sample problem, statisticians have proposed a number of tests in various contexts. Two typical tests are the two-sample t-test and the Wilcoxon rank-sum test. However, since they test for differences in location, they do not extract as much information from the data as a test of the homogeneity of the distribution functions. Even though the Kolmogorov-Smirnov and Anderson-Darling test statistics can be used to test $H_0 : F = G$, they give no indication of the actual relation of F to G when $H_0$ is rejected. Our goal is to learn why it was rejected. Our approach answers this question with graphical tools, which is its main feature. Our approach is functional in the sense that the parameters to be estimated are probability density functions. Compared with other statistical tools for two-sample problems, such as the t-test or the Wilcoxon rank-sum test, density estimation gives a fuller understanding of the data, which is essential in data analysis. Our approach to density estimation also works with small sample sizes. In addition, our methodology makes almost no assumptions about the two continuous distributions F and G. In that sense, our approach is nonparametric. Our approach brings graphical tools to the two-sample problem, where few are typically available. Furthermore, our procedure helps researchers conclude why two populations differ when $H_0$ is rejected and describe the relation between F and G graphically.
Keywords: two-sample problem; exponential model; data analysis procedure; comparison distribution function; comparison density function
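To make the comparison-density idea in entry 1 concrete, here is a minimal sketch of one way such a quantity can be estimated. It is not the procedure from the thesis: it assumes the comparison is made against the pooled sample, uses pooled mid-ranks as a stand-in for the mid-distribution, and replaces the exponential-model fit with a plain histogram. The function names and the `bins` parameter are hypothetical. Under H0 : F = G the bar heights should all be close to 1, and departures from 1 show where the X sample is over- or under-represented.

```python
from bisect import bisect_left, bisect_right


def pooled_mid_ranks(x, y):
    """Mid-distribution value of each x within the pooled sample.

    For a value v this is (#pooled < v + 0.5 * #pooled == v) / N,
    so every result lies strictly between 0 and 1.
    """
    pooled = sorted(x + y)
    n_pooled = len(pooled)
    values = []
    for xi in x:
        below = bisect_left(pooled, xi)           # pooled values < xi
        equal = bisect_right(pooled, xi) - below  # pooled values == xi
        values.append((below + 0.5 * equal) / n_pooled)
    return values


def comparison_density_histogram(x, y, bins=5):
    """Histogram estimate of the density of the pooled mid-ranks of x.

    Heights are scaled so that they average to 1 across the bins.
    """
    u = pooled_mid_ranks(x, y)
    counts = [0] * bins
    for ui in u:
        counts[min(int(ui * bins), bins - 1)] += 1
    return [c * bins / len(u) for c in counts]
```

With fully separated samples, for example x = 0, ..., 9 and y = 10, ..., 19 with `bins=2`, all of the x mass falls in the lower half of the pooled sample and the heights are `[2.0, 0.0]`.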
2	A Bayesian nonparametric approach for the two-sample problem / Uma abordagem bayesiana não paramétrica para o problema de duas amostras
Console, Rafael de Carvalho Ceregatti de, 19 November 2018
In this work, we discuss the so-called two-sample problem (Pearson and Neyman, 1930) from a nonparametric Bayesian perspective. Given two independent i.i.d. samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ generated from $P_1$ and $P_2$, respectively, the two-sample problem consists in deciding whether $P_1$ and $P_2$ are equal. Assuming a nonparametric prior, we propose an evidence index for the null hypothesis $H_0 : P_1 = P_2$ based on the posterior distribution of the distance $d(P_1, P_2)$ between $P_1$ and $P_2$. This evidence index is easy to compute, has an intuitive interpretation, and can also be justified in the Bayesian decision-theoretic context. Further, in a Monte Carlo simulation study, our method performed well when compared with the well-known Kolmogorov-Smirnov test, the Wilcoxon test, and a recent testing procedure based on the Polya tree process proposed by Holmes et al. (2015). Finally, we applied our method to a data set of scale measurements from three groups of patients who completed a questionnaire used in the diagnosis of Alzheimer's disease.
Keywords: Bayesian nonparametrics; Dirichlet process; Hypothesis testing; Two-sample problem
3	Power Studies of Multivariate Two-Sample Tests of Comparison
Siluyele, Ian John, January 2007
Masters of Science
Multivariate two-sample tests provide a means to test the match between two multivariate distributions. Although many tests exist in the literature, relatively little is known about the relative power of these procedures. The studies reported in the thesis contrast the effectiveness, in terms of power, of seven such tests in a Monte Carlo study. The relative power of the tests was investigated against location, scale, and correlation alternatives. Samples were drawn from bivariate exponential, normal and uniform populations. Results from the power studies show that no single test is the most powerful in all situations. The use of particular test statistics is recommended for specific alternatives. A supplementary non-parametric graphical procedure, such as the Depth-Depth plot, can be recommended for diagnosing possible differences between the multivariate samples if the null hypothesis is rejected. As an example of the utility of the procedures for real data, the multivariate two-sample tests were applied to photometric data of twenty galactic globular clusters. The results from the analyses support the recommendations associated with specific test statistics.
Keywords: Multivariate two-sample problem; Non-parametric tests; Multivariate two-sample test; Permutation method; Data depth; Euclidean distance; Interpoint distance distributions; Nearest neighbour tests; Power
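The kind of Monte Carlo power study described in entry 3 can be sketched as follows. This is only an illustration under stated assumptions: it uses a single stand-in statistic (the Euclidean distance between the two sample mean vectors) with a permutation test, not any of the seven tests compared in the thesis, and it covers only a location alternative for bivariate normal samples. The function names, sample sizes and replication counts are hypothetical choices.

```python
import math
import random


def mean_distance(a, b):
    """Euclidean distance between the mean vectors of two bivariate samples."""
    ma = [sum(p[k] for p in a) / len(a) for k in range(2)]
    mb = [sum(p[k] for p in b) / len(b) for k in range(2)]
    return math.hypot(ma[0] - mb[0], ma[1] - mb[1])


def permutation_p_value(a, b, rng, n_perm=99):
    """Permutation p-value for the mean-distance statistic."""
    observed = mean_distance(a, b)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled points
        if mean_distance(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimated_power(shift, n=15, reps=40, alpha=0.05, seed=0):
    """Fraction of simulated data sets in which H0 is rejected.

    The second sample is a bivariate standard normal shifted by `shift`
    in both coordinates; shift=0 estimates the size of the test.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        b = [(rng.gauss(shift, 1), rng.gauss(shift, 1)) for _ in range(n)]
        if permutation_p_value(a, b, rng) <= alpha:
            rejections += 1
    return rejections / reps
```

Calling `estimated_power` over a grid of shifts gives a rough power curve. Comparing several statistics, as the thesis does, would mean swapping `mean_distance` for each candidate while keeping the simulated data sets fixed.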
4	Nonparametric Statistical Inference for Entropy-type Functionals / Icke-parametrisk statistisk inferens för entropirelaterade funktionaler
Källberg, David, January 2013
In this thesis, we study statistical inference for entropy, divergence, and related functionals of one or two probability distributions. Asymptotic properties of particular nonparametric estimators of such functionals are investigated. We consider estimation from both independent and dependent observations. The thesis consists of an introductory survey of the subject and some related theory and four papers (A-D).
In Paper A, we consider a general class of entropy-type functionals which includes, for example, integer order Rényi entropy and certain Bregman divergences. We propose U-statistic estimators of these functionals based on the coincident or epsilon-close vector observations in the corresponding independent and identically distributed samples. We prove some asymptotic properties of the estimators such as consistency and asymptotic normality. Applications of the obtained results related to entropy maximizing distributions, stochastic databases, and image matching are discussed.
In Paper B, we provide some important generalizations of the results for continuous distributions in Paper A. The consistency of the estimators is obtained under weaker density assumptions. Moreover, we introduce a class of functionals of quadratic order, including both entropy and divergence, and prove normal limit results for the corresponding estimators which are valid even for densities of low smoothness. The asymptotic properties of a divergence-based two-sample test are also derived.
In Paper C, we consider estimation of the quadratic Rényi entropy and some related functionals for the marginal distribution of a stationary m-dependent sequence. We investigate asymptotic properties of the U-statistic estimators for these functionals introduced in Papers A and B when they are based on a sample from such a sequence. We prove consistency, asymptotic normality, and Poisson convergence under mild assumptions for the stationary m-dependent sequence. Applications of the results to time-series databases and entropy-based testing for dependent samples are discussed.
In Paper D, we further develop the approach for estimation of quadratic functionals with m-dependent observations introduced in Paper C. We consider quadratic functionals for one or two distributions. The consistency and rate of convergence of the corresponding U-statistic estimators are obtained under weak conditions on the stationary m-dependent sequences. Additionally, we propose estimators based on incomplete U-statistics and show their consistency properties under more general assumptions.
Keywords: entropy estimation; Rényi entropy; divergence estimation; quadratic density functional; U-statistics; consistency; asymptotic normality; Poisson convergence; stationary m-dependent sequence; inter-point distances; entropy maximizing distribution; two-sample problem; approximate matching
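As a rough illustration of a U-statistic estimator built from epsilon-close observations, as mentioned in entry 4, the sketch below estimates the quadratic functional $\int f(x)^2\,dx$ for a one-dimensional i.i.d. sample and converts it to the quadratic Rényi entropy $-\log \int f(x)^2\,dx$. It assumes the estimator takes the form "fraction of epsilon-close pairs divided by the length $2\varepsilon$ of the epsilon-ball"; the exact estimators and conditions in the thesis may differ, and the function names are hypothetical.

```python
import math
from itertools import combinations


def quadratic_functional(sample, eps):
    """Estimate the integral of f(x)^2 from epsilon-close pairs (1-D sample).

    The fraction of pairs with |x_i - x_j| <= eps approximates
    2 * eps * integral of f^2 when eps is small, so dividing by 2 * eps
    gives the estimate.
    """
    n = len(sample)
    close = sum(1 for a, b in combinations(sample, 2) if abs(a - b) <= eps)
    n_pairs = n * (n - 1) // 2
    return close / (n_pairs * 2.0 * eps)


def renyi2_entropy(sample, eps):
    """Quadratic Renyi entropy estimate, -log of the quadratic functional.

    Raises ValueError if no pair is epsilon-close (log of zero).
    """
    return -math.log(quadratic_functional(sample, eps))
```

For a sample spread evenly over (0, 1) the quadratic functional should be close to 1 and the entropy close to 0, with a small downward bias in the functional caused by points near the boundary. The choice of `eps` trades bias against variance and is left to the user here.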
5	Estimation adaptative avec des données transformées ou incomplètes. Application à des modèles de survie / Adaptive estimation with warped or incomplete data. Application to survival analysis
Chagny, Gaëlle, 05 July 2013
This thesis presents various problems of adaptive functional estimation, using projection and kernel methods and criteria inspired by both model selection and Lepski's methods. The common thread of the statistical settings studied is that they deal with transformed and/or incomplete data. The first part proposes an estimation method based on a "warping" device, whose relevance is illustrated on the estimation of functions such as additive and multiplicative regression, conditional density, hazard rate from randomly right-censored data, and cumulative distribution function from current-status (interval-censored) data. The aim is to estimate a function from a sample of random pairs $(X, Y)$. We use the warped data $(\varphi(X), Y)$ to propose adaptive estimators, where $\varphi$ is a one-to-one function that we also estimate (e.g. the cumulative distribution function of $X$). The interest is twofold. From the theoretical point of view, the estimators are optimal in the oracle sense. From the practical point of view, they have explicit expressions and are numerically stable. The second part deals with a two-sample problem: we compare the distributions of two variables $X$ and $X_0$ by studying the relative density, defined as the density of $F_0(X)$, where $F_0$ is the c.d.f. of $X_0$. We build adaptive estimators from a double data sample, possibly censored. Non-asymptotic risk bounds are proved, and convergence rates are derived from them.
Keywords: Adaptive estimation; Model selection; Lepski's method; Warped bases and kernels; Regression; Censored data; Two-sample problem; 510
