11 |
Comparing the Structural Components Variance Estimator and U-Statistics Variance Estimator When Assessing the Difference Between Correlated AUCs with Finite Samples
Bosse, Anna L 01 January 2017
Introduction: The structural components variance estimator proposed by DeLong et al. (1988) is a popular approach used when comparing two correlated AUCs. However, this variance estimator is biased and could be problematic with small sample sizes.
Methods: A U-statistics based variance estimator approach is presented and compared with the structural components variance estimator through a large-scale simulation study under different finite-sample size configurations.
Results: The U-statistics variance estimator was unbiased for the true variance of the difference between correlated AUCs regardless of the sample size and had lower RMSE than the structural components variance estimator, providing better type 1 error control and larger power. The structural components variance estimator provided increasingly biased variance estimates as the correlation between biomarkers increased.
Discussion: When comparing two correlated AUCs, it is recommended that the U-Statistics variance estimator be used whenever possible, especially for finite sample sizes and highly correlated biomarkers.
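As a rough illustration of the structural components estimator discussed in this abstract, the following Python sketch computes the empirical AUC of each marker and the DeLong et al. (1988) variance estimate for the difference between two correlated AUCs. The function names and the paired-list data layout are assumptions made for this example and are not taken from the thesis; the U-statistics variance estimator that the thesis proposes is not reproduced here.

```python
from statistics import mean


def psi(x, y):
    """Mann-Whitney kernel: 1 if the case scores higher, 0.5 on a tie, else 0."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def auc_and_components(cases, controls):
    """Return the empirical AUC and the structural components V10 and V01."""
    m, n = len(cases), len(controls)
    v10 = [sum(psi(x, y) for y in controls) / n for x in cases]
    v01 = [sum(psi(x, y) for x in cases) / m for y in controls]
    return mean(v10), v10, v01


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def delong_difference(cases_a, controls_a, cases_b, controls_b):
    """AUC difference and its structural components variance estimate.

    Assumed layout: cases_a[i] and cases_b[i] are the two marker values of
    the same diseased subject; likewise for the control lists.
    """
    auc_a, v10_a, v01_a = auc_and_components(cases_a, controls_a)
    auc_b, v10_b, v01_b = auc_and_components(cases_b, controls_b)
    m, n = len(cases_a), len(controls_a)
    var_cases = (sample_cov(v10_a, v10_a) + sample_cov(v10_b, v10_b)
                 - 2.0 * sample_cov(v10_a, v10_b))
    var_controls = (sample_cov(v01_a, v01_a) + sample_cov(v01_b, v01_b)
                    - 2.0 * sample_cov(v01_a, v01_b))
    return auc_a - auc_b, var_cases / m + var_controls / n


# Hypothetical toy data: two markers measured on 4 cases and 4 controls.
diff, var = delong_difference([0.9, 0.8, 0.7, 0.6], [0.5, 0.4, 0.65, 0.3],
                              [0.2, 0.9, 0.4, 0.7], [0.5, 0.1, 0.6, 0.3])
print(diff, var)
```

With samples this small the estimate is only illustrative; the abstract's point is precisely that this estimator can be biased in finite samples.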
|
12 |
Jackknife Emperical Likelihood Method and its Applications
Yang, Hanfang 01 August 2012
In this dissertation, we investigate jackknife empirical likelihood methods motivated by recent research in statistics and related fields. The computational cost of empirical likelihood can be significantly reduced by using jackknife empirical likelihood methods without losing computational accuracy and stability. We demonstrate that the proposed jackknife empirical likelihood methods are able to handle several challenging and open problems, with elegant asymptotic properties and accurate simulation results in finite samples. These problems include ROC curves with missing data, the difference of two ROC curves in two-dimensional correlated data, a novel inference for the partial AUC, and the difference of two quantiles with one or two samples. In addition, empirical likelihood methodology can be successfully applied to the linear transformation model using adjusted estimation equations. Comprehensive simulation studies of coverage probabilities and average lengths for these topics demonstrate that the proposed jackknife empirical likelihood methods perform well in finite samples under various settings. Moreover, some related real problems are studied to support our conclusions. Finally, we provide an extensive discussion of some interesting and feasible ideas for future studies based on our jackknife EL procedures.
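The general idea of jackknife empirical likelihood can be sketched as follows: a U-statistic is turned into jackknife pseudo-values, and ordinary empirical likelihood for a mean is then applied to those pseudo-values. The Python sketch below is a minimal illustration under stated assumptions, not the dissertation's procedure: the function names, the variance kernel used in the example, and the bisection solver for the Lagrange multiplier are all choices made here.

```python
import math
from itertools import combinations


def u_statistic(sample, kernel):
    """Degree-2 U-statistic: average of the kernel over all unordered pairs."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(a, b) for a, b in pairs) / len(pairs)


def jackknife_pseudo_values(sample, kernel):
    """Pseudo-values n*U_n - (n-1)*U_{n-1}^{(-i)}, one per observation."""
    n = len(sample)
    full = u_statistic(sample, kernel)
    return [n * full - (n - 1) * u_statistic(sample[:i] + sample[i + 1:], kernel)
            for i in range(n)]


def neg2_log_el_ratio(values, theta, iterations=200):
    """-2 log empirical likelihood ratio for the hypothesis mean(values) = theta."""
    z = [v - theta for v in values]
    if min(z) >= 0 or max(z) <= 0:
        return math.inf  # theta lies outside the convex hull of the values
    # The multiplier must keep every weight positive: 1 + lam * z_i > 0.
    lo, hi = -1.0 / max(z), -1.0 / min(z)
    margin = 1e-10 * (hi - lo)
    lo, hi = lo + margin, hi - margin
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        score = sum(zi / (1.0 + mid * zi) for zi in z)
        if score > 0:  # the score is decreasing in the multiplier
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2.0
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)


# Hypothetical example: the kernel (a - b)^2 / 2 gives the sample variance.
data = [1.2, 0.7, 3.1, 2.4, 1.9, 0.3]
pseudo = jackknife_pseudo_values(data, lambda a, b: (a - b) ** 2 / 2.0)
print(neg2_log_el_ratio(pseudo, 1.0))
```

The appeal of this construction is that the constraint is linear in the pseudo-values, so only a one-dimensional root search is needed even though the underlying statistic is nonlinear in the data.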
|
13 |
Almost sure behavior for increments of U-statistics / Beschreibung der Fluktuation von Zuwächsen für U-Statistiken
Abujarad, Mohammed 18 January 2007
No description available.
|
14 |
Tests d’hypothèses statistiquement et algorithmiquement efficaces de similarité et de dépendance / Statistically and computationally efficient hypothesis tests for similarity and dependency
Bounliphone, Wacha 30 January 2017
The dissertation presents novel statistically and computationally efficient hypothesis tests for relative similarity and dependency, and for precision matrix estimation. The key methodology adopted in this thesis is the class of U-statistic estimators. The class of U-statistics results in a minimum-variance unbiased estimation of a parameter.
The first part of the thesis focuses on relative similarity tests applied to the problem of model selection. Probabilistic generative models provide a powerful framework for representing data. Model selection in this generative setting can be challenging. To address this issue, we provide a novel non-parametric hypothesis test of relative similarity and test whether a first candidate model generates a data sample significantly closer to a reference validation set.
The second part of the thesis focuses on developing a novel non-parametric statistical hypothesis test for relative dependency. Tests of dependence are important tools in statistical analysis, and several canonical tests for the existence of dependence have been developed in the literature. However, the question of whether dependencies exist is secondary. Determining whether one dependence is stronger than another is frequently necessary for decision making. We present a statistical test that determines whether one variable is significantly more dependent on a first target variable or on a second.
Finally, a novel method for structure discovery in a graphical model is proposed. Making use of the result that zeros of a precision matrix can encode conditional independencies, we develop a test that estimates and bounds an entry of the precision matrix. Methods for structure discovery in the literature typically make restrictive distributional (e.g. Gaussian) or sparsity assumptions that may not apply to a data sample of interest. Consequently, we derive a new test that makes use of results for U-statistics and applies them to the covariance matrix, which then implies a bound on the precision matrix.
|
15 |
Testes de hipóteses para componentes de variância utilizando estatísticas U / U-tests for variance components in linear mixed models
Nobre, Juvencio Santos 09 August 2007
We consider decompositions of U-statistics to obtain tests for null variance components in linear mixed models. Their asymptotic distributions under the null hypothesis are obtained assuming only the existence of the first four moments of the conditional error distribution and of the first two moments of the random effects distribution. Thus, the proposed U-tests may be employed in a large class of models. Under the additional assumption that the fourth moment of the random effects distribution exists, we also obtain the asymptotic distribution of the U-tests under a sequence of local alternative hypotheses. We compare their efficiency with that of classical tests derived under the assumption of normality, through simulation studies. The proposed tests are more efficient in situations where the sample size is moderate or large, independently of the distribution of the sources of variation; they also perform better in situations where the underlying distributions are far from normal.
|
17 |
Théorèmes limites pour des processus à longue mémoire saisonnière / Limit theorems for processes with seasonal long memory
Ould Mohamed Abdel Haye, Mohamedou 30 December 2001
We study the asymptotic behavior of statistics or functionals related to processes with seasonal long memory. We focus on Donsker lines and on the empirical process. The sequences considered are of the form $G(X_n)$, where $(X_n)$ is a Gaussian or linear process. We show that the results obtained by Taqqu and Dobrushin for long-memory processes whose covariance is regularly varying at infinity can fail in the presence of seasonal effects. The differences concern both the normalization coefficient and the nature of the limit process. Notably, we show that the limit of the doubly indexed empirical process, although it remains degenerate, is no longer determined by the Hermite rank of the distribution function of the data. In particular, when this rank equals 1, the limit is no longer necessarily Gaussian. For example, one can obtain a combination of independent Rosenblatt processes. These results are applied to several statistical problems, such as the asymptotic behavior of U-statistics, density estimation, and change-point detection.
|
18 |
General conditional linear models with time-dependent coefficients under censoring and truncation
Teodorescu, Bianca 19 December 2008
In survival analysis interest often lies in the relationship between the survival function and a certain number of covariates. It usually happens that for some individuals we cannot observe the event of interest, due to the presence of right censoring and/or left truncation. A typical example is given by a retrospective medical study, in which one is interested in the time interval between birth and death due to a certain disease. Patients who die of the disease at early age will rarely have entered the study before death and are therefore left truncated. On the other hand, for patients who are alive at the end of the study, only a lower bound of the true survival time is known and these patients are hence right censored.
In the case of censored and/or truncated responses, many models exist in the literature that describe the relationship between the survival function and the covariates (proportional hazards model or Cox model, log-logistic model, accelerated failure time model, additive risks model, etc.). In these models, the regression coefficients are usually assumed to be constant over time. In practice, however, the structure of the data might be more complex, and it might therefore be better to consider coefficients that can vary over time. In the previous examples, certain covariates (e.g. age at diagnosis, type of surgery, extension of tumor, etc.) can have a relatively high impact on early age survival, but a lower influence at higher age. This motivated a number of authors to extend the Cox model to allow for time-dependent coefficients, or to consider other types of time-dependent coefficient models such as the additive hazards model.
In practice it is of great use to have at hand a method to check the validity of the above mentioned models.
First we consider a very general model, which includes as special cases the above mentioned models (Cox model, additive model, log-logistic model, linear transformation models, etc.) with time-dependent coefficients and study the parameter estimation by means of a least squares approach. The response is allowed to be subject to right censoring and/or left truncation.
Secondly, we propose an omnibus goodness-of-fit test of whether the general time-dependent model considered above fits the data. A bootstrap version, used to approximate the critical values of the test, is also proposed.
In this dissertation, for each proposed method, the finite sample performance is evaluated in a simulation study and then applied to a real data set.
|
19 |
Some properties of measures of disagreement and disorder in paired ordinal data
Högberg, Hans January 2010
The measures studied in this thesis were a measure of disorder, D, and a measure of the individual part of the disagreement, the measure of relative rank variance, RV, proposed by Svensson in 1993. The measure of disorder is a useful measure of order consistency in paired assessments of scales with a different number of possible values. The measure of relative rank variance is useful for evaluating reliability and for evaluating change in qualitative outcome variables. Paper I gives an overview of methods used in the analysis of dependent ordinal data and compares the methods with regard to their assumptions, specifications, applicability, and implications for use. Paper II applies some standard models, tests, and measures to two different research problems and compares the results. The sampling distribution of the measure of disorder was studied both analytically and by a simulation experiment in Paper III. Asymptotic normality was shown by the theory of U-statistics, and the simulation experiments for finite sample sizes and various amounts of disorder showed that the sampling distribution was approximately normal for sample sizes of about 40 to 60 for moderate sizes of D and for smaller sample sizes for substantial sizes of D. The sampling distribution of the relative rank variance was studied in a simulation experiment in Paper IV. The simulation experiment showed that the sampling distribution was approximately normal for sample sizes of 60-100 for moderate sizes of RV, and for smaller sample sizes for substantial sizes of RV. In Paper V a procedure for inference regarding relative rank variances from two or more samples was proposed. Pair-wise comparison using the jackknife technique for variance estimation, with the normal distribution as an approximation in inference for parameters in independent samples based on the results in Paper IV, was demonstrated. Moreover, applications of the Kruskal-Wallis test for independent samples and Friedman’s test for dependent samples were conducted. / Statistical methods for ordinal data
|
20 |
Nonparametric Statistical Inference for Entropy-type Functionals / Icke-parametrisk statistisk inferens för entropirelaterade funktionaler
Källberg, David January 2013
In this thesis, we study statistical inference for entropy, divergence, and related functionals of one or two probability distributions. Asymptotic properties of particular nonparametric estimators of such functionals are investigated. We consider estimation from both independent and dependent observations. The thesis consists of an introductory survey of the subject and some related theory and four papers (A-D). In Paper A, we consider a general class of entropy-type functionals which includes, for example, integer order Rényi entropy and certain Bregman divergences. We propose U-statistic estimators of these functionals based on the coincident or epsilon-close vector observations in the corresponding independent and identically distributed samples. We prove some asymptotic properties of the estimators such as consistency and asymptotic normality. Applications of the obtained results related to entropy maximizing distributions, stochastic databases, and image matching are discussed. In Paper B, we provide some important generalizations of the results for continuous distributions in Paper A. The consistency of the estimators is obtained under weaker density assumptions. Moreover, we introduce a class of functionals of quadratic order, including both entropy and divergence, and prove normal limit results for the corresponding estimators which are valid even for densities of low smoothness. The asymptotic properties of a divergence-based two-sample test are also derived. In Paper C, we consider estimation of the quadratic Rényi entropy and some related functionals for the marginal distribution of a stationary m-dependent sequence. We investigate asymptotic properties of the U-statistic estimators for these functionals introduced in Papers A and B when they are based on a sample from such a sequence. We prove consistency, asymptotic normality, and Poisson convergence under mild assumptions for the stationary m-dependent sequence. 
Applications of the results to time-series databases and entropy-based testing for dependent samples are discussed. In Paper D, we further develop the approach for estimation of quadratic functionals with m-dependent observations introduced in Paper C. We consider quadratic functionals for one or two distributions. The consistency and rate of convergence of the corresponding U-statistic estimators are obtained under weak conditions on the stationary m-dependent sequences. Additionally, we propose estimators based on incomplete U-statistics and show their consistency properties under more general assumptions.
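The idea of estimating a quadratic functional from epsilon-close observations can be sketched with a short Python example. It counts the pairs of one-dimensional observations that lie within a distance `eps` of each other and normalizes by the number of pairs and the length of the epsilon-interval, which gives a U-statistic estimate of the integral of the squared density; minus its logarithm estimates the quadratic Rényi entropy. The function names, the restriction to one dimension, and the fixed `eps` are assumptions made for this illustration; the abstract does not specify the thesis's exact normalization or its choice of epsilon.

```python
import math
from itertools import combinations


def quadratic_functional(sample, eps):
    """Estimate the integral of f(x)^2 from a 1-D sample.

    Counts unordered pairs with |a - b| <= eps and divides by the number of
    pairs times 2 * eps, the length of the interval of radius eps.
    """
    n = len(sample)
    close = sum(1 for a, b in combinations(sample, 2) if abs(a - b) <= eps)
    n_pairs = n * (n - 1) // 2
    return close / (n_pairs * 2.0 * eps)


def quadratic_renyi_entropy(sample, eps):
    """Plug-in estimate of the quadratic Renyi entropy, -log of the functional."""
    q = quadratic_functional(sample, eps)
    if q == 0.0:
        raise ValueError("no eps-close pairs; increase eps or the sample size")
    return -math.log(q)


# Hypothetical example: an evenly spaced grid on [0, 1), for which the
# integral of the squared uniform density is 1.
grid = [i / 100 for i in range(100)]
print(quadratic_functional(grid, 0.105), quadratic_renyi_entropy(grid, 0.105))
```

The estimate on the grid falls somewhat below 1 because points near the boundary have fewer close neighbors; in practice `eps` would shrink with the sample size.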
|