Optimal designs for statistical inferences in nonlinear models with bivariate response variables
Hsu, Hsiang-Ling, 27 January 2011
Bivariate or multivariate correlated data are collected on a sample of units in many applications. When experimenters are concerned with the failure times of two related subjects, for example paired organs or two chronic diseases, bivariate binary data are often acquired. This type of data consists of an observation point x and indicators that record whether each failure time occurred before or after the observation point. In this work, the observed bivariate data are written in the form {x, δ1 = I(X1 ≤ x), δ2 = I(X2 ≤ x)}. The corresponding optimal design problems for parameter estimation under this type of bivariate data are discussed.
For multivariate responses of this kind with explanatory variables, the marginal distributions may come from different families. A copula model is one way to formulate the relationship among the responses and the association between pairs of responses. Copula models for bivariate binary data are considered useful in practice because of their flexibility. In this dissertation, the marginal distributions for bivariate binary data are assumed to be exponential or Weibull, and two assumptions about the joint distribution of the variables, independent or correlated, are considered. When the bivariate binary data are assumed to be correlated, the Clayton copula model is used as the joint cumulative distribution function.
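As a minimal sketch of the data model described above, the following computes the four cell probabilities of (δ1, δ2) at an observation point x, assuming exponential margins and the standard Clayton copula form C(u, v) = (u^(-θ) + v^(-θ) - 1)^(-1/θ) with θ > 0 applied to the marginal distribution functions. The function and parameter names (`rate1`, `rate2`, `theta`) are illustrative and do not come from the dissertation.

```python
import math


def exp_cdf(x, rate):
    """Exponential marginal CDF F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x)


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0  # the copula is zero on the lower boundary
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def cell_probabilities(x, rate1, rate2, theta):
    """Probabilities of (delta1, delta2) with delta_i = I(X_i <= x)."""
    f1 = exp_cdf(x, rate1)
    f2 = exp_cdf(x, rate2)
    p11 = clayton_cdf(f1, f2, theta)  # both failures occurred by x
    return {
        (1, 1): p11,
        (1, 0): f1 - p11,              # only the first failure by x
        (0, 1): f2 - p11,              # only the second failure by x
        (0, 0): 1.0 - f1 - f2 + p11,   # neither failure by x
    }


if __name__ == "__main__":
    for cell, prob in cell_probabilities(1.0, 1.0, 0.5, 2.0).items():
        print(cell, round(prob, 4))
```

As θ approaches 0 the Clayton copula approaches the independence copula, so the (1, 1) probability approaches the product of the two marginal probabilities, which corresponds to the independent case mentioned above.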
Few works have addressed the optimal design problems for bivariate binary data with copula models. D-optimal designs aim to minimize the volume of the confidence ellipsoid for the unknown parameters, including the association parameter in bivariate copula models, and are used to determine the best observation points. Ds-optimal designs, in turn, are mainly used for estimating the important association parameter in the Clayton model.
The D- and Ds-optimal designs for the above copula model are found through the general equivalence theorem together with a numerical algorithm. Under the different model assumptions, the numerical results show that the number of support points of a D-optimal design is at most the number of model parameters. When both the difference between the marginal distributions and the association are substantial, the association becomes an influential factor that increases the number of support points.
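The abstract does not name the numerical algorithm used. As a hedged illustration only, the sketch below applies a standard multiplicative weight-update algorithm to the simplest setting mentioned here: two independent exponential margins with rates `rate1` and `rate2` (hypothetical names), where the information matrix is diagonal. It then checks the general equivalence theorem condition, namely that the sensitivity function d(x, ξ) = tr(M(ξ)^(-1) I(x)) does not exceed the number of parameters (here 2) on the candidate grid.

```python
import math


def info(x, rate):
    """Fisher information about `rate` from one binary response I(X <= x),
    with X exponential: x^2 * exp(-rate*x) / (1 - exp(-rate*x))."""
    return x * x / math.expm1(rate * x)


def sensitivity(grid, weights, rate1, rate2):
    """d(x, xi) = I1(x)/M1 + I2(x)/M2 for a diagonal information matrix."""
    m1 = sum(w * info(x, rate1) for x, w in zip(grid, weights))
    m2 = sum(w * info(x, rate2) for x, w in zip(grid, weights))
    return [info(x, rate1) / m1 + info(x, rate2) / m2 for x in grid]


def d_optimal_weights(grid, rate1, rate2, iterations=2000):
    """Multiplicative algorithm: w_i <- w_i * d(x_i, xi) / p with p = 2."""
    weights = [1.0 / len(grid)] * len(grid)
    for _ in range(iterations):
        d = sensitivity(grid, weights, rate1, rate2)
        weights = [w * di / 2.0 for w, di in zip(weights, d)]
    return weights


if __name__ == "__main__":
    grid = [0.05 * k for k in range(1, 101)]  # candidate points in (0, 5]
    weights = d_optimal_weights(grid, 1.0, 3.0)
    support = [(round(x, 2), round(w, 3)) for x, w in zip(grid, weights) if w > 0.01]
    print("points carrying weight:", support)
    print("max sensitivity:", max(sensitivity(grid, weights, 1.0, 3.0)))
```

The update keeps the weights summing to one because the weighted average of the sensitivity function always equals the number of parameters. The correlated Clayton case would need the full, non-diagonal information matrix, which is not attempted here.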
Simulation studies show that estimation based on the optimal designs performs reasonably well. In survival experiments, the experimenter customarily takes trials at specific points such as the 25th, 50th and 75th percentiles of the distributions. Hence, we consider the design efficiencies when the design points are at three or four particular percentiles. Although it is common in practice to take trials at several quantile positions, the allocation of the sample size among them also has a great influence on the experimental results.
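To make the notion of design efficiency concrete, here is a simplified sketch for a single exponential failure time with one unknown rate, not the bivariate copula model of the dissertation. With one parameter, the D-efficiency of a design is the ratio of its information to that of the locally optimal one-point design. The function names and the chosen allocations are illustrative assumptions.

```python
import math


def info(x, rate):
    """Fisher information about `rate` from one binary response I(X <= x)."""
    return x * x / math.expm1(rate * x)


def optimal_point(rate):
    """Maximizer of info(x, rate): solve 2 * (1 - exp(-t)) = t for t = rate * x."""
    lo, hi = 1.0, 2.0  # the nonzero root lies in this bracket
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if 2.0 * (1.0 - math.exp(-mid)) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / rate


def quantile(q, rate):
    """q-th quantile of the exponential distribution with the given rate."""
    return -math.log(1.0 - q) / rate


def efficiency(quantiles, weights, rate):
    """Information of a percentile design relative to the optimal one-point design."""
    m = sum(w * info(quantile(q, rate), rate) for q, w in zip(quantiles, weights))
    return m / info(optimal_point(rate), rate)


if __name__ == "__main__":
    qs = [0.25, 0.50, 0.75]
    print("equal allocation:  ", efficiency(qs, [1 / 3, 1 / 3, 1 / 3], 2.0))
    print("skewed allocation: ", efficiency(qs, [0.1, 0.2, 0.7], 2.0))
```

Comparing the two allocations shows how the proportion of the sample placed at each percentile changes the efficiency even when the observation points themselves are fixed.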
To use a locally optimal design in practice, prior information about the models or parameters is needed. When there is not enough prior knowledge about the models or parameters, it is more flexible to use sequential experiments that gather information in several stages. Hence, with robustness in mind, a sequential procedure is proposed that combines D- and Ds-optimal designs under independent or correlated distributions in different stages of the experiment. The simulation results based on the sequential procedure are compared with those of the one-step procedures. When the optimal designs are obtained from incorrect prior parameter values or distributions, the results may have poor efficiencies. The sample means of the estimators and the corresponding optimal designs obtained from the sequential procedure are close to the true values, and the corresponding efficiencies are close to 1.
Huster (1989) analyzed the corresponding modeling problems for paired survival data and applied them to the Diabetic Retinopathy Study, considering the exponential and Weibull distributions as possible marginal distributions and the Clayton model as the joint function. The study was conducted by the National Eye Institute to assess the effectiveness of laser photocoagulation in delaying the onset of blindness in patients with diabetic retinopathy. It can be viewed as a prior experiment that provides the experimenter with useful guidelines for collecting data in future studies. As an application to the Diabetic Retinopathy Study, we develop optimal designs for collecting suitable data and information to estimate the unknown model parameters.
In the second part of this work, optimal design problems for parameter estimation are considered for proportional data. The nonlinear model considered in this research, based on Jorgensen (1997) and called the dispersion model, provides a flexible class of non-normal distributions. It can be applied to binary or count responses as well as proportional outcomes. For continuous proportional data, where the responses are confined to the interval (0,1), the simplex dispersion model is considered here. D-optimal designs are obtained through the corresponding equivalence theorem, and numerical results are presented. In the development of classical optimal design theory, weighted polynomial regression models with variance functions that depend on the explanatory variable have played an important role. The problem of constructing locally D-optimal designs for the simplex dispersion model can be viewed as a design problem for a weighted polynomial regression model with a specific variance function. Because the weight function in the information matrix has a complex form, it is approximated by a rational function, and the corresponding optimal designs are obtained for different parameter values. These optimal designs are compared with those obtained using the original weight function.
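The abstract does not write out the simplex dispersion model. As a sketch under that caveat, the code below evaluates the simplex density in the form commonly given in the dispersion-model literature, f(y; μ, σ²) = [2πσ²(y(1-y))³]^(-1/2) exp(-d(y; μ)/(2σ²)) with unit deviance d(y; μ) = (y-μ)² / (y(1-y)μ²(1-μ)²), and checks numerically that it integrates to one over (0,1). The names `mu` and `sigma2` are illustrative.

```python
import math


def unit_deviance(y, mu):
    """Unit deviance of the simplex distribution for y, mu in (0, 1)."""
    return (y - mu) ** 2 / (y * (1.0 - y) * mu ** 2 * (1.0 - mu) ** 2)


def simplex_pdf(y, mu, sigma2):
    """Simplex density with position mu in (0, 1) and dispersion sigma2 > 0."""
    norm = math.sqrt(2.0 * math.pi * sigma2 * (y * (1.0 - y)) ** 3)
    return math.exp(-unit_deviance(y, mu) / (2.0 * sigma2)) / norm


def integrate_unit_interval(f, n=20000):
    """Midpoint rule on (0, 1); the endpoints are never evaluated."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


if __name__ == "__main__":
    total = integrate_unit_interval(lambda y: simplex_pdf(y, 0.4, 1.0))
    print("integral of the density over (0, 1):", total)
```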