1 |
Optimal Design and Inference for Correlated Bernoulli Variables using a Simplified Cox ModelBruce, Daniel January 2008 (has links)
<p>This thesis proposes a simplification of the model for dependent Bernoulli variables presented in Cox and Snell (1989). The simplified model, referred to as the simplified Cox model, is developed for identically distributed and dependent Bernoulli variables.</p><p>Properties of the model are presented, including expressions for the loglikelihood function and the Fisher information. The special case of a bivariate symmetric model is studied in detail. For this particular model, it is found that the number of design points in a locally D-optimal design is determined by the log-odds ratio between the variables. Under mutual independence, both a general expression for the restrictions of the parameters and an analytical expression for locally D-optimal designs are derived.</p><p>Focusing on the bivariate case, score tests and likelihood ratio tests are derived to test for independence. Numerical illustrations of these test statistics are presented in three examples. In connection to testing for independence, an E-optimal design for maximizing the local asymptotic power of the score test is proposed.</p><p>The simplified Cox model is applied to a dental data. Based on the estimates of the model, optimal designs are derived. The analysis shows that these optimal designs yield considerably more precise parameter estimates compared to the original design. The original design is also compared against the E-optimal design with respect to the power of the score test. For most alternative hypotheses the E-optimal design provides a larger power compared to the original design.</p>
|
2 |
Optimal Design and Inference for Correlated Bernoulli Variables using a Simplified Cox ModelBruce, Daniel January 2008 (has links)
This thesis proposes a simplification of the model for dependent Bernoulli variables presented in Cox and Snell (1989). The simplified model, referred to as the simplified Cox model, is developed for identically distributed and dependent Bernoulli variables. Properties of the model are presented, including expressions for the loglikelihood function and the Fisher information. The special case of a bivariate symmetric model is studied in detail. For this particular model, it is found that the number of design points in a locally D-optimal design is determined by the log-odds ratio between the variables. Under mutual independence, both a general expression for the restrictions of the parameters and an analytical expression for locally D-optimal designs are derived. Focusing on the bivariate case, score tests and likelihood ratio tests are derived to test for independence. Numerical illustrations of these test statistics are presented in three examples. In connection to testing for independence, an E-optimal design for maximizing the local asymptotic power of the score test is proposed. The simplified Cox model is applied to a dental data. Based on the estimates of the model, optimal designs are derived. The analysis shows that these optimal designs yield considerably more precise parameter estimates compared to the original design. The original design is also compared against the E-optimal design with respect to the power of the score test. For most alternative hypotheses the E-optimal design provides a larger power compared to the original design.
|
3 |
Models for fitting correlated non-identical bernoulli random variables with applications to an airline data problemPerez Romo Leroux, Andres January 2021 (has links)
Our research deals with the problem of devising models for fitting non- identical dependent Bernoulli variables and using these models to predict fu- ture Bernoulli trials.We focus on modelling and predicting random Bernoulli response variables which meet all of the following conditions:
1. Each observed as well as future response corresponds to a Bernoulli trial
2. The trials are non-identical, having possibly different probabilities of occurrence
3. The trials are mutually correlated, with an underlying complex trial cluster correlation structure. Also allowing for the possible partitioning of trials within clusters into groups. Within cluster - group level correlation is reflected in the correlation structure.
4. The probability of occurrence and correlation structure for both ob- served and future trials can depend on a set of observed covariates.
A number of proposed approaches meeting some of the above conditions are present in the current literature. Our research expands on existing statistical and machine learning methods.
We propose three extensions to existing models that make use of the above conditions. Each proposed method brings specific advantages for dealing with
correlated binary data. The proposed models allow for within cluster trial grouping to be reflected in the correlation structure. We partition sets of trials into groups either explicitly estimated or implicitly inferred. Explicit groups arise from the determination of common covariates; inferred groups arise via imposing mixture models. The main motivation of our research is in modelling and further understanding the potential of introducing binary trial group level correlations. In a number of applications, it can be beneficial to use models that allow for these types of trial groupings, both for improved predictions and better understanding of behavior of trials.
The first model extension builds on the Multivariate Probit model. This model makes use of covariates and other information from former trials to determine explicit trial groupings and predict the occurrence of future trials. We call this the Explicit Groups model.
The second model extension uses mixtures of univariate Probit models. This model predicts the occurrence of current trials using estimators of pa- rameters supporting mixture models for the observed trials. We call this the Inferred Groups model.
Our third methods extends on a gradient descent based boosting algorithm which allows for correlation of binary outcomes called WL2Boost. We refer to our extension of this algorithm as GWL2Boost.
Bernoulli trials are divided into observed and future trials; with all trials having associated known covariate information. We apply our methodology to the problem of predicting the set and total number of passengers who will not show up on commercial flights using covariate information and past passenger data.
The models and algorithms are evaluated with regards to their capac- ity to predict future Bernoulli responses. We compare the models proposed against a set of competing existing models and algorithms using available air- line passenger no-show data. We show that our proposed algorithm extension GWL2Boost outperforms top existing algorithms and models that assume in- dependence of binary outcomes in various prediction metrics. / Statistics
|
Page generated in 0.0819 seconds