201 | Improving the Efficiency of Tests and Estimators of Treatment Effect with Auxiliary Covariates in the Presence of Censoring
Lu, Xiaomin. 30 May 2007.
In most randomized clinical trials, the primary response variable, for example the survival time, is not observed immediately after patients enroll in the study but only after some period of time (lag time). Such a response is often missing for some patients because of censoring: administrative censoring occurs when the study ends before all patients have had the opportunity to observe their response, and censoring may also result from patient dropout. Censoring is often assumed to occur at random, which is referred to as noninformative censoring; in many cases, however, such an assumption may not be reasonable. If the missing data are not analyzed properly, the estimate or test of the treatment effect may be biased. In this paper, we consider two situations. In the first situation, we consider only the special case where the censoring time is noninformative and the survival time itself is the time-lagged response. We use semiparametric theory to derive a class of consistent and asymptotically normal estimators for the unconditional log-hazard ratio parameter. Prognostic auxiliary covariates are used to derive estimators that are more efficient than the traditional maximum partial likelihood estimator, and the corresponding Wald tests are more powerful than the logrank test. In the second situation, we extend these results to the general case where the censoring time can be informative and the time-lagged response can be of any type. We again use semiparametric theory to derive a class of consistent and asymptotically normal estimators of the treatment effect. Prognostic baseline auxiliary covariates and post-treatment auxiliary covariates, which may be time-dependent, are used to derive estimators that both account for informative censoring and are more efficient than estimators that do not use the auxiliary covariates.
202 | Sparse Estimation and Inference for Censored Median Regression
Shows, Justin Hall. 20 July 2009.
Censored median regression models have been shown to be useful for analyzing a variety of censored survival data because of their robustness. We study sparse estimation and inference for censored median regression. The new method minimizes an inverse-censoring-probability-weighted least absolute deviation criterion subject to the adaptive LASSO penalty. We show that, with a proper choice of the tuning parameter, the proposed estimator has desirable theoretical properties such as root-n consistency and asymptotic normality, and that it identifies the underlying sparse model consistently. We propose a resampling method to estimate the variance of the proposed estimator. The new procedure also enjoys substantial computational advantages, since its entire solution path can be obtained efficiently, and the method extends to multivariate survival data with a natural or artificial clustering structure. The performance of our estimator is evaluated through extensive simulations and two real-data applications.
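To make the objective concrete, here is a minimal Python sketch of an adaptive-LASSO-penalized, inverse-censoring-probability-weighted least absolute deviation fit. The weights are taken as given (in practice they would come from, e.g., a Kaplan-Meier estimate of the censoring distribution), the derivative-free optimizer and all variable names are illustrative choices, and none of this reflects the dissertation's actual implementation or its efficient solution-path algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def ipcw_adaptive_lad(X, y, weights, beta_init, lam):
    """Adaptive-LASSO-penalized, IPCW-weighted least absolute deviation (illustrative).

    X         : (n, p) covariate matrix
    y         : (n,) observed responses (censored rows carry zero weight)
    weights   : (n,) inverse-censoring-probability weights
    beta_init : (p,) root-n-consistent initial estimate (e.g., unpenalized fit)
    lam       : tuning parameter
    """
    adaptive = lam / np.maximum(np.abs(beta_init), 1e-8)   # adaptive penalty weights

    def objective(beta):
        resid = y - X @ beta
        return np.sum(weights * np.abs(resid)) + np.sum(adaptive * np.abs(beta))

    # Derivative-free optimizer because the objective is nonsmooth.
    fit = minimize(objective, beta_init, method="Powell")
    return fit.x

# Toy usage with synthetic, heavy-tailed data and no censoring (all weights one)
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.standard_cauchy(size=n)
w = np.ones(n)
beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
print(ipcw_adaptive_lad(X, y, w, beta0, lam=1.0))
```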
203 | Variable Selection in Linear Mixed Model for Longitudinal Data
Lan, Lan. 19 May 2006.
Fan and Li (JASA, 2001) proposed a family of variable selection procedures for certain parametric models via a nonconcave penalized likelihood approach, in which selection of significant variables and parameter estimation are performed simultaneously, and the procedures were shown to have the oracle property. In this work, we extend the nonconcave penalized likelihood approach to linear mixed models for longitudinal data. Two new approaches are proposed to select significant covariates and to estimate fixed-effect parameters and variance components. In particular, we show that the new approaches also possess the oracle property when the tuning parameter is chosen appropriately. We assess the performance of the proposed approaches via simulation and apply the procedures to data from the Multicenter AIDS Cohort Study.
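For context on the penalty being extended, the sketch below implements only the univariate SCAD thresholding rule of Fan and Li (2001) for an orthonormal design; it is not the mixed-model procedure proposed here, and a = 3.7 is simply the value recommended in the original paper.

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD solution (Fan and Li, 2001) under an orthonormal design.

    z   : ordinary least squares coefficient
    lam : tuning parameter
    a   : shape parameter (3.7 recommended by Fan and Li)
    """
    az = np.abs(z)
    if az <= 2 * lam:
        # soft-thresholding region: small coefficients are shrunk toward zero
        return np.sign(z) * max(az - lam, 0.0)
    elif az <= a * lam:
        # transition region of the nonconcave penalty
        return ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    else:
        # large coefficients are left unpenalized (oracle-like behaviour)
        return z

for z in [0.3, 1.5, 4.0]:
    print(z, "->", scad_threshold(z, lam=1.0))
```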
204 | A Stationary Stochastic Approximation Algorithm for Estimation in the GLMM
Chang, Sheng-Mao. 18 May 2007.
Estimation in generalized linear mixed models is challenging because the marginal likelihood is an integral without a closed form. In many of the leading approaches, such as the Laplace approximation and Monte Carlo integration, the marginal likelihood is approximated, so the maximum likelihood estimate (MLE) can only be reached with error. An alternative, the simultaneous perturbation stochastic approximation (SPSA) algorithm, is designed to maximize an integral and can be employed to find the exact MLE under the same circumstances. However, SPSA does not directly provide an error estimate if the algorithm is stopped after a finite number of steps. In order to estimate the MLE properly with a statistical error bound, we propose the stationary SPSA (SSPSA) algorithm. Assuming that the marginal likelihood, the objective function, is quadratic around the MLE, the SSPSA iterates take the form of a random-coefficient vector autoregressive process. Under mild conditions, the algorithm yields a strictly stationary sequence whose mean is asymptotically unbiased for the MLE and has a closed-form variance. The SSPSA sequence is also ergodic, given certain constraints on the step size, a parameter of the algorithm, and on the mechanism that directs the algorithm to search the parameter space. Sufficient conditions for stationarity and ergodicity are provided as a guideline for choosing the step size. Several implementation issues are addressed in the thesis: pairing numerical derivatives, scaling, and importance sampling. Following a simulation study, we apply the SSPSA to several GLMMs: epilepsy seizure data, lung cancer data, and salamander mating data. For the first two cases, SSPSA estimates are similar to published results, whereas for the salamander data our solution differs substantially from others.
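For reference, a bare-bones version of the standard SPSA iteration that the SSPSA modifies might look like the following sketch; the gain sequences, decay exponents, and toy objective are illustrative defaults, and the stationary modification itself is not shown.

```python
import numpy as np

def spsa_maximize(f, theta0, iters=2000, a=0.1, c=0.1, alpha=0.602, gamma=0.101, seed=0):
    """Basic SPSA ascent on an objective f using two evaluations per iteration."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for k in range(1, iters + 1):
        ak = a / k**alpha                                    # step-size (gain) sequence
        ck = c / k**gamma                                    # perturbation size
        delta = rng.choice([-1.0, 1.0], size=theta.shape)    # Rademacher perturbation
        g_hat = (f(theta + ck * delta) - f(theta - ck * delta)) / (2 * ck * delta)
        theta = theta + ak * g_hat                           # ascend the estimated gradient
    return theta

# Toy objective: a concave quadratic with maximum at (1, -2)
f = lambda t: -(t[0] - 1.0) ** 2 - 2.0 * (t[1] + 2.0) ** 2
print(spsa_maximize(f, theta0=[0.0, 0.0]))
```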
205 | Statistical inference for correlated data based on censored observations
Park, Jung Wook. 14 June 2005.
Many physical quantities measured over time and space are observed with data irregularities such as truncation (detection limits) or censoring. Practitioners often disregard censored cases, which may result in inefficient estimates; on the other hand, treating censored data as observed values leads to biased estimates. For instance, a monitoring device may have a specific detection limit and record the limit itself, or a constant exceeding it, whenever the true value exceeds the limit. We present an attractive remedy for handling censored or truncated data collected over time or space. Our method produces (asymptotically) unbiased estimates that are more efficient than estimates based on treating censored observations as completely observed. In particular, we introduce an imputation method well suited for fitting statistical models to correlated observations in the presence of censored data. The proposed imputation method involves generating random samples from the conditional distribution of the censored data given the (completely) observed data and the current estimates of the parameters. The parameter estimates are then updated based on the imputed and completely observed data, and the two steps are repeated until convergence. Under Gaussian processes, this conditional distribution turns out to be a truncated multivariate normal distribution, and we use a Gibbs sampling method to generate samples from it. We demonstrate the effectiveness of the technique for a problem common to many correlated data sets and describe its application to several other frequently encountered situations. First, we discuss the use of the imputation technique for stationary time series data under an autoregressive moving average model. We then relax the model assumption and discuss how the imputation method works with a nonparametric estimate of the covariance matrix. The use of the imputation method is not limited to time series models and can be applied to other types of correlated data, such as spatial data; a lattice model is discussed as another field of application. For pedagogic purposes, our illustration of the approach in a simulation study is limited to some simple models, such as a first-order autoregressive time series model, a first-order moving average time series model, and a first-order simultaneous autoregressive error model, with left or right censoring, but the method can easily be extended to more complicated models. We also derive the Fisher information matrix for an AR(1) process containing censored observations and use its trace to explain the effect of censoring on the efficiency gain of the estimates.
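As a simplified illustration of the imputation idea, the sketch below alternates between drawing left-censored values of a mean-zero AR(1) series from their truncated-normal full conditionals and re-estimating the autoregressive parameter by least squares; the single-site updates, the toy detection limit, and the handling of only interior time points are illustrative simplifications, not the full procedure described above.

```python
import numpy as np
from scipy.stats import truncnorm

def impute_censored_ar1(x, censored, limit, n_iter=50, seed=0):
    """Alternate Gibbs imputation of left-censored values and AR(1) re-estimation.

    x        : series with censored entries set to the detection limit
    censored : boolean mask of censored positions (interior points only, for brevity)
    limit    : detection limit (censored values are known only to lie below it)
    """
    rng = np.random.default_rng(seed)
    x = x.astype(float).copy()
    phi, sigma = 0.0, np.std(x)
    for _ in range(n_iter):
        # Full conditional of x_t given its neighbours in an AR(1) is normal;
        # truncate it to (-inf, limit] for censored points.
        cond_sd = sigma / np.sqrt(1.0 + phi**2)
        for t in np.flatnonzero(censored):
            cond_mean = phi * (x[t - 1] + x[t + 1]) / (1.0 + phi**2)
            upper = (limit - cond_mean) / cond_sd
            x[t] = truncnorm.rvs(-np.inf, upper, loc=cond_mean, scale=cond_sd,
                                 random_state=rng)
        # Update parameters from the completed series (least squares for phi)
        phi = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
        sigma = np.std(x[1:] - phi * x[:-1])
    return phi, sigma, x

# Toy example: simulate an AR(1), censor values below the 20th percentile, impute
rng = np.random.default_rng(1)
n, phi_true = 300, 0.6
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi_true * z[t - 1] + rng.normal()
limit = np.quantile(z, 0.2)
mask = (z < limit) & (np.arange(n) > 0) & (np.arange(n) < n - 1)
obs = np.where(mask, limit, z)
print(impute_censored_ar1(obs, mask, limit)[:2])
```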
206 | Multivariate Spatial Temporal Statistical Models for Applications in Coastal Ocean Prediction
Foley, Kristen Madsen. 06 July 2006.
Estimating the spatial and temporal variation of surface wind fields plays an important role in modeling atmospheric and oceanic processes. This is particularly true for hurricane forecasting, where numerical ocean models are used to predict the height of the storm surge and the degree of coastal flooding. We use multivariate spatial-temporal statistical methods to improve coastal storm surge prediction using disparate sources of observational data. An Ensemble Kalman Filter is used to assimilate water elevation into a three-dimensional primitive-equation ocean model. We find that data assimilation improves the estimates of water elevation in a case study of Hurricane Charley of 2004. In addition, we investigate the impact of inaccuracies in the wind field inputs, which are the main forcing of the numerical model in storm surge applications. A new multivariate spatial statistical framework is developed to improve the estimation of these wind inputs. A spatial linear model of coregionalization (LMC) is used to account for the cross-dependency between the two orthogonal wind components. A Bayesian approach is used to estimate the parameters of the multivariate spatial model and of a physically based wind model while accounting for potential additive and multiplicative bias in the observed wind data. This spatial model consistently improves parameter estimation and prediction of surface winds for the Hurricane Charley case study compared to the original physical wind model. These methods are also shown to improve storm surge estimates when used as the forcing fields for the coastal ocean model. Finally, we describe a new framework for estimating multivariate nonstationary spatial-temporal processes based on an extension of the LMC model. We compare this approach to other multivariate spatial models and describe an application to surface wind fields from Hurricane Floyd of 1999.
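A single perturbed-observation Ensemble Kalman Filter analysis step, of the general kind used to assimilate water elevations, can be sketched as follows; the tiny state dimension, observation operator, and error covariance are placeholders rather than anything resembling the ocean model configuration used here.

```python
import numpy as np

def enkf_analysis(X_f, y, H, R, rng):
    """Perturbed-observation EnKF update (illustrative sketch).

    X_f : (n_state, n_ens) forecast ensemble
    y   : (n_obs,) observation vector (e.g., water elevations)
    H   : (n_obs, n_state) observation operator
    R   : (n_obs, n_obs) observation error covariance
    """
    n_ens = X_f.shape[1]
    A = X_f - X_f.mean(axis=1, keepdims=True)          # ensemble anomalies
    P_f = A @ A.T / (n_ens - 1)                        # sample forecast covariance
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)   # Kalman gain
    # Each member assimilates a perturbed copy of the observations
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=n_ens).T
    return X_f + K @ (Y - H @ X_f)

rng = np.random.default_rng(0)
X_f = rng.normal(size=(4, 20))                         # 4 state variables, 20 members
H = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]])         # observe the first two states
R = 0.1 * np.eye(2)
y = np.array([0.5, -0.2])
print(enkf_analysis(X_f, y, H, R, rng).mean(axis=1))   # analysis ensemble mean
```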
207 | Numerical Differentiation Using Statistical Design
Bodily, Chris H. 18 July 2002.
Derivatives are frequently required by numerical procedures across many disciplines, and numerical differentiation can be useful for approximating them. This dissertation introduces computational differentiation (the process by which derivatives are obtained with a computer), focusing on statistical response surface methodology (RSM) designs for approximating derivatives. The RSM designs are compared with two competing numerical methods: a rival saturated statistical design approach and a method employing finite differencing. A covariance model incorporating function curvature and computer round-off error is proposed for estimating the variances of the derivative approximations. These variances, together with the computational workload each method requires, become the basis for comparing the derivative approximations. A diagnostic test for variable scaling errors is also described.
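As a point of comparison for the finite-differencing competitor, the sketch below contrasts forward and central differences with step sizes set near the usual truncation/round-off trade-off; the rule-of-thumb constants are textbook choices, not values from the dissertation.

```python
import numpy as np

def forward_diff(f, x, h=None):
    """First-order forward difference; truncation error O(h), round-off about eps/h."""
    h = h or np.sqrt(np.finfo(float).eps) * max(abs(x), 1.0)
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h=None):
    """Second-order central difference; truncation error O(h^2), round-off about eps/h."""
    h = h or np.cbrt(np.finfo(float).eps) * max(abs(x), 1.0)
    return (f(x + h) - f(x - h)) / (2 * h)

f, df = np.exp, np.exp           # derivative of exp is exp itself
x = 1.0
print("forward error:", abs(forward_diff(f, x) - df(x)))
print("central error:", abs(central_diff(f, x) - df(x)))
```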
208 | Quantifying local creation and regional transport using a hierarchical space-time model of ozone as a function of observed NOx, a latent space-time VOC process, emissions, and meteorology
Nail, Amy Jeanette. 20 August 2007.
We explore the ability of a space-time model to decompose the 8-hour ozone concentration on a given day at a given site into the parts attributable to local emissions and to regional transport, and ultimately to assess the efficacy of past and future emission control programs. We model ozone as created ozone plus transported ozone plus an error term with a seasonally varying spatial covariance. The created component uses atmospheric chemistry results to express ozone created on a given day at a given site as a function of the observed NOx concentration, the latent VOC concentration, and temperature. The ozone transported to a given site on a given day is expressed as a weighted average of the ozone observed at all sites on the previous day, where the weights are a function of wind speed and direction that appropriately distributes weight across redundant information. The latent VOC process model has a mean trend that includes emissions from various source types, temperature, and a workday indicator variable, plus an error term with a seasonally varying spatial covariance. We fit the model using likelihood methods and compare our predictions to observations from a withheld dataset and to the predictions of CMAQ, the deterministic model used by the EPA to assess emission control programs. We find that model predictions based on the mean trend and the random deviations from this mean outperform CMAQ predictions according to multiple criteria, but predictions based on the mean trend alone underperform CMAQ.
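The transported-ozone component can be pictured with the following sketch, in which the previous day's ozone at all sites is averaged with weights based on how well each site lines up with the wind displacement; the Gaussian-kernel weight function, bandwidth, and coordinates are purely illustrative stand-ins for the weighting actually developed in the model.

```python
import numpy as np

def transported_ozone(prev_ozone, site_xy, wind_uv, hours=24.0, bandwidth=50.0):
    """Illustrative wind-weighted average of the previous day's ozone at all sites.

    prev_ozone : (n_sites,) ozone observed at all sites on the previous day
    site_xy    : (n_sites, 2) site coordinates in km; site 0 is the target site
    wind_uv    : (2,) mean wind vector (u, v) in km/h at the target site
    The Gaussian kernel on the mismatch between each site's location and the
    wind-implied origin is an illustrative weight choice only.
    """
    target = site_xy[0]
    expected_origin = target - np.asarray(wind_uv) * hours      # where yesterday's air came from
    mismatch = np.linalg.norm(site_xy - expected_origin, axis=1)
    w = np.exp(-0.5 * (mismatch / bandwidth) ** 2)
    w /= w.sum()                                                 # weights sum to one
    return w @ prev_ozone

rng = np.random.default_rng(0)
sites = rng.uniform(0, 300, size=(10, 2))            # 10 monitoring sites
ozone_yesterday = rng.uniform(30, 90, size=10)       # ppb
print(transported_ozone(ozone_yesterday, sites, wind_uv=(-8.0, 3.0)))
```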
209 | Robust Estimation via Measurement Error Modeling
Wang, Qiong. 16 August 2005.
We introduce a new method for robustifying inference that can be applied in any situation where a parametric likelihood is available. The key feature is that data from the postulated parametric model are assumed to be measured with error, where the measurement error distribution is chosen to produce the occasional gross errors found in real data. We show that the tails of the error-contamination model control the properties (boundedness, redescendingness) of the resulting influence functions, with heavier contamination tails producing more robust estimators. In the application to location-scale models with independent and identically distributed data, the resulting analytically intractable likelihoods are approximated via Monte Carlo integration. In the application to time series models, we propose a Bayesian approach to the robust estimation of time series parameters, using Markov chain Monte Carlo (MCMC) to estimate both the parameters of interest and the gross errors; the latter serve as outlier diagnostics.
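A minimal sketch of the Monte Carlo likelihood idea for a location-scale model follows; the Student-t contamination distribution, its scale, and the optimizer are illustrative assumptions rather than the specific error-contamination model studied here.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def mc_contaminated_negloglik(params, y, n_mc=2000, df=2.0, tau=1.0, seed=0):
    """Monte Carlo approximation of a normal likelihood contaminated by heavy-tailed error.

    The postulated model is y_i = z_i + u_i with z_i ~ N(mu, sigma^2) and u_i drawn
    from a heavy-tailed contamination distribution; a Student-t for u is an
    illustrative choice here.
    """
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    rng = np.random.default_rng(seed)                  # common random numbers across calls
    u = tau * rng.standard_t(df, size=n_mc)
    # Average the conditional density over the contamination draws for each observation
    dens = stats.norm.pdf(y[:, None] - u[None, :], loc=mu, scale=sigma).mean(axis=1)
    return -np.sum(np.log(dens + 1e-300))

# Toy data: mostly clean normal observations plus a few gross errors
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(5.0, 1.0, size=95), [25.0, -30.0, 40.0, 22.0, -18.0]])
fit = minimize(mc_contaminated_negloglik, x0=[np.median(y), 0.0], args=(y,),
               method="Nelder-Mead")
print("robust location and scale:", fit.x[0], np.exp(fit.x[1]))
```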
210 | Orthology-Based Multilevel Modeling of Differentially Expressed Mouse and Human Gene Pairs
Ogorek, Benjamin Alexander. 21 August 2008.
There is great interest in finding human genes whose expression changes under pharmaceutical intervention, thus opening a genomic window into the benefit and side-effect profiles of a drug. Human insight gained from FDA-required animal experiments has historically been limited, but in the case of gene expression measurements, proposed biological orthologies between mouse and human genes provide a foothold for animal-to-human extrapolation. We investigate a five-component, multilevel, bivariate normal mixture model that incorporates both mouse and human gene expression data. The goal is twofold: to increase the power to detect differentially expressed human genes, and to find a subclass of gene pairs for which there is a direct, exploitable relationship between animal and human genes. In simulation studies, the dual-species model showed substantial gains in differential gene-finding power over a related marginal model using only human data. Bias in parameter estimation was problematic, however, and occasionally led to failures in control of the false discovery rate. Although finding species-extrapolative gene pairs proved considerably more difficult than finding differentially expressed human genes, simulation experiments showed it to be possible, especially under hypothetical parameter configurations and when traditional false discovery rate controls are relaxed.
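A schematic, two-component reduction of the idea, using a bivariate normal mixture on simulated mouse/human ortholog statistics, is sketched below; the model described above has five components and a multilevel structure, so this sklearn-based example is only meant to show how posterior component probabilities yield a differential-expression score for each gene pair.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Simulated paired statistics (e.g., test statistics) for mouse/human orthologs:
# a null cluster near the origin and a jointly differentially expressed cluster.
rng = np.random.default_rng(0)
null_pairs = rng.multivariate_normal([0, 0], [[1, 0.1], [0.1, 1]], size=900)
de_pairs = rng.multivariate_normal([3, 3], [[1, 0.6], [0.6, 1]], size=100)
pairs = np.vstack([null_pairs, de_pairs])

# Two-component bivariate normal mixture; a schematic reduction of the full model.
gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(pairs)
post = gm.predict_proba(pairs)

# Posterior probability of the component with the larger mean acts as a
# differential-expression score for each mouse/human gene pair.
de_component = np.argmax(gm.means_.sum(axis=1))
scores = post[:, de_component]
print("pairs flagged at 0.95 posterior:", int((scores > 0.95).sum()))
```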