• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • 1
  • 1
  • Tagged with
  • 5
  • 5
  • 3
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Estimating the Variance of the Sample Median

Price, Robert M., Bonett, Douglas G. 01 January 2001 (has links)
The small-sample bias and root mean squared error of several distribution-free estimators of the variance of the sample median are examined. A new estimator is proposed that is easy to compute and tends to have the smallest bias and root mean squared error.
2

Mixture Modeling and Outlier Detection in Microarray Data Analysis

George, Nysia I. 16 January 2010 (has links)
Microarray technology has become a dynamic tool in gene expression analysis because it allows for the simultaneous measurement of thousands of gene expressions. Uniqueness in experimental units and microarray data platforms, coupled with how gene expressions are obtained, make the field open for interesting research questions. In this dissertation, we present our investigations of two independent studies related to microarray data analysis. First, we study a recent platform in biology and bioinformatics that compares the quality of genetic information from exfoliated colonocytes in fecal matter with genetic material from mucosa cells within the colon. Using the intraclass correlation coe�cient (ICC) as a measure of reproducibility, we assess the reliability of density estimation obtained from preliminary analysis of fecal and mucosa data sets. Numerical findings clearly show that the distribution is comprised of two components. For measurements between 0 and 1, it is natural to assume that the data points are from a beta-mixture distribution. We explore whether ICC values should be modeled with a beta mixture or transformed first and fit with a normal mixture. We find that the use of mixture of normals in the inverse-probit transformed scale is less sensitive toward model mis-specification; otherwise a biased conclusion could be reached. By using the normal mixture approach to compare the ICC distributions of fecal and mucosa samples, we observe the quality of reproducible genes in fecal array data to be comparable with that in mucosa arrays. For microarray data, within-gene variance estimation is often challenging due to the high frequency of low replication studies. Several methodologies have been developed to strengthen variance terms by borrowing information across genes. However, even with such accommodations, variance may be initiated by the presence of outliers. For our second study, we propose a robust modification of optimal shrinkage variance estimation to improve outlier detection. In order to increase power, we suggest grouping standardized data so that information shared across genes is similar in distribution. Simulation studies and analysis of real colon cancer microarray data reveal that our methodology provides a technique which is insensitive to outliers, free of distributional assumptions, effective for small sample size, and data adaptive.
3

Calibration Adjustment for Nonresponse in Sample Surveys

Rota, Bernardo João January 2016 (has links)
In this thesis, we discuss calibration estimation in the presence of nonresponse with a focus on the linear calibration estimator and the propensity calibration estimator, along with the use of different levels of auxiliary information, that is, sample and population levels. This is a fourpapers- based thesis, two of which discuss estimation in two steps. The two-step-type estimator here suggested is an improved compromise of both the linear calibration and the propensity calibration estimators mentioned above. Assuming that the functional form of the response model is known, it is estimated in the first step using calibration approach. In the second step the linear calibration estimator is constructed replacing the design weights by products of these with the inverse of the estimated response probabilities in the first step. The first step of estimation uses sample level of auxiliary information and we demonstrate that this results in more efficient estimated response probabilities than using population-level as earlier suggested. The variance expression for the two-step estimator is derived and an estimator of this is suggested. Two other papers address the use of auxiliary variables in estimation. One of which introduces the use of principal components theory in the calibration for nonresponse adjustment and suggests a selection of components using a theory of canonical correlation. Principal components are used as a mean to accounting the problem of estimation in presence of large sets of candidate auxiliary variables. In addition to the use of auxiliary variables, the last paper also discusses the use of explicit models representing the true response behavior. Usually simple models such as logistic, probit, linear or log-linear are used for this purpose. However, given a possible complexity on the structure of the true response probability, it may raise a question whether these simple models are effective. We use an example of telephone-based survey data collection process and demonstrate that the logistic model is generally not appropriate.
4

Inférence doublement robuste en présence de données imputées dans les enquêtes

Picard, Frédéric 02 1900 (has links)
L'imputation est souvent utilisée dans les enquêtes pour traiter la non-réponse partielle. Il est bien connu que traiter les valeurs imputées comme des valeurs observées entraîne une sous-estimation importante de la variance des estimateurs ponctuels. Pour remédier à ce problème, plusieurs méthodes d'estimation de la variance ont été proposées dans la littérature, dont des méthodes adaptées de rééchantillonnage telles que le Bootstrap et le Jackknife. Nous définissons le concept de double-robustesse pour l'estimation ponctuelle et de variance sous l'approche par modèle de non-réponse et l'approche par modèle d'imputation. Nous mettons l'emphase sur l'estimation de la variance à l'aide du Jackknife qui est souvent utilisé dans la pratique. Nous étudions les propriétés de différents estimateurs de la variance à l'aide du Jackknife pour l'imputation par la régression déterministe ainsi qu'aléatoire. Nous nous penchons d'abord sur le cas de l'échantillon aléatoire simple. Les cas de l'échantillonnage stratifié et à probabilités inégales seront aussi étudiés. Une étude de simulation compare plusieurs méthodes d'estimation de variance à l'aide du Jackknife en terme de biais et de stabilité relative quand la fraction de sondage n'est pas négligeable. Finalement, nous établissons la normalité asymptotique des estimateurs imputés pour l'imputation par régression déterministe et aléatoire. / Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including appropriate versions of resampling methods such as the jackknife and the bootstrap. We define the concept of doubly robust point and variance estimation under the so-called nonresponse and imputation model approaches. We focus on jackknife variance estimation, which is widely used in practice. We study the properties of several jackknife variance estimators under both deterministic and random regression imputation. We first consider the case of simple random sampling without replacement. The case of stratified simple random sampling and unequal probability sampling is also considered. A limited simulation study compares various jackknife variance estimators in terms of bias and relative stability when the sampling fraction is not negligible. Finally, the asymptotic normality of imputed estimator is established under both deterministic and random regression imputation.
5

Inférence doublement robuste en présence de données imputées dans les enquêtes

Picard, Frédéric 02 1900 (has links)
L'imputation est souvent utilisée dans les enquêtes pour traiter la non-réponse partielle. Il est bien connu que traiter les valeurs imputées comme des valeurs observées entraîne une sous-estimation importante de la variance des estimateurs ponctuels. Pour remédier à ce problème, plusieurs méthodes d'estimation de la variance ont été proposées dans la littérature, dont des méthodes adaptées de rééchantillonnage telles que le Bootstrap et le Jackknife. Nous définissons le concept de double-robustesse pour l'estimation ponctuelle et de variance sous l'approche par modèle de non-réponse et l'approche par modèle d'imputation. Nous mettons l'emphase sur l'estimation de la variance à l'aide du Jackknife qui est souvent utilisé dans la pratique. Nous étudions les propriétés de différents estimateurs de la variance à l'aide du Jackknife pour l'imputation par la régression déterministe ainsi qu'aléatoire. Nous nous penchons d'abord sur le cas de l'échantillon aléatoire simple. Les cas de l'échantillonnage stratifié et à probabilités inégales seront aussi étudiés. Une étude de simulation compare plusieurs méthodes d'estimation de variance à l'aide du Jackknife en terme de biais et de stabilité relative quand la fraction de sondage n'est pas négligeable. Finalement, nous établissons la normalité asymptotique des estimateurs imputés pour l'imputation par régression déterministe et aléatoire. / Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including appropriate versions of resampling methods such as the jackknife and the bootstrap. We define the concept of doubly robust point and variance estimation under the so-called nonresponse and imputation model approaches. We focus on jackknife variance estimation, which is widely used in practice. We study the properties of several jackknife variance estimators under both deterministic and random regression imputation. We first consider the case of simple random sampling without replacement. The case of stratified simple random sampling and unequal probability sampling is also considered. A limited simulation study compares various jackknife variance estimators in terms of bias and relative stability when the sampling fraction is not negligible. Finally, the asymptotic normality of imputed estimator is established under both deterministic and random regression imputation.

Page generated in 0.1094 seconds