11

Monitoring Markov Dependent Binary Observations with a Log-Likelihood Ratio Based CUSUM Control Chart

Modarres-Mousavi, Shabnam 04 April 2006 (has links)
Our objective is to monitor changes in a proportion with correlated binary observations. All of the published work on this subject used a first-order Markov chain model for the data. Increasing the order of dependence above one by extending a standard Markov chain model entails an exponential increase in both the number of parameters and the dimension of the transition probability matrix. In this dissertation, we develop a particular Markov chain structure, the Multilevel Model (MLM), to model the correlation between binary data. The basic idea is to assign a lower probability to observing a 1 when all previous correlated observations are 0's, and a higher probability to observing a 1 as the last observed 1 gets closer to the current observation. We refer to each of the distinct situations of observing a 1 as a "level". For a given order of dependence, only a limited number of distinct conditional probabilities of observing a 1 can be assigned, so the number of levels is bounded accordingly. Compared to a direct extension of the first-order Markov model to higher orders, our model is considerably more parsimonious: the number of parameters for the MLM is only one plus the number of levels, and the transition probability matrix remains correspondingly small. We construct a CUSUM control chart for monitoring a proportion with correlated binary observations. First, we use the probability structure of a first-order Markov chain to derive a log-likelihood ratio based CUSUM control statistic. Then, we model this CUSUM statistic itself as a Markov chain, which in turn allows for designing a control chart with specified statistical properties: the Markov Binary CUSUM (MBCUSUM) chart. We generalize the MBCUSUM to any order of dependence between binary observations by applying the MLM to the data and to our CUSUM control statistic. We verify that the MBCUSUM performs better than a curtailed Shewhart chart. 
Also, we show that, except for extremely large changes in the proportion of interest, the MBCUSUM control chart detects changes faster than the Bernoulli CUSUM control chart, which is designed for independent observations. / Ph. D.
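As a rough illustration of the log-likelihood ratio CUSUM idea this abstract builds on, the following sketch implements the simpler Bernoulli CUSUM mentioned as the comparison chart (for independent observations), not the MBCUSUM itself; the function name, parameter values, and threshold are illustrative assumptions.

```python
from math import log

def bernoulli_cusum(observations, p0, p1, threshold):
    """Log-likelihood ratio CUSUM for a shift in a Bernoulli proportion.

    p0: in-control probability of a 1; p1: out-of-control probability.
    Signals at the first index where the statistic crosses `threshold`.
    """
    # Per-observation log-likelihood ratio increments for x = 1 and x = 0.
    inc1 = log(p1 / p0)
    inc0 = log((1 - p1) / (1 - p0))
    c = 0.0
    for t, x in enumerate(observations):
        # Classic CUSUM recursion: accumulate evidence, reset at zero.
        c = max(0.0, c + (inc1 if x == 1 else inc0))
        if c >= threshold:
            return t, c  # signal time and statistic value
    return None, c  # no signal
```

A stream of 0's keeps the statistic at zero under a small in-control p0, while a run of 1's drives it across the threshold within a few observations.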
12

Generalized Principal Component Analysis: Dimensionality Reduction through the Projection of Natural Parameters

Landgraf, Andrew J. 15 October 2015 (has links)
No description available.
13

Bayesian Probit Regression Models for Spatially-Dependent Categorical Data

Berrett, Candace 02 November 2010 (has links)
No description available.
14

Analysis of Binary Data via Spatial-Temporal Autologistic Regression Models

Wang, Zilong 01 January 2012 (has links)
Spatial-temporal autologistic models are useful for binary data measured repeatedly over time on a spatial lattice. They can account for the effects of potential covariates and for spatial-temporal statistical dependence among the data. However, the traditional parametrization of the spatial-temporal autologistic model makes it difficult to interpret model parameters across varying levels of statistical dependence, because its non-negative autocovariates can bias the realizations toward 1. In order to achieve interpretable parameters, a centered spatial-temporal autologistic regression model has been developed. Two efficient statistical inference approaches are proposed: an expectation-maximization pseudo-likelihood approach (EMPL) and a Monte Carlo expectation-maximization likelihood approach (MCEML). Bayesian inference is also considered and studied. Moreover, the performance and efficiency of these three inference approaches are studied across various sizes of sampling lattices and numbers of sampling time points, through both a simulation study and a real data example. In addition, we consider the imputation of missing values for spatial-temporal autologistic regression models. Most existing imputation methods are not suitable for imputing spatial-temporal missing values, because they can disrupt the inherent structure of the data and lead to serious bias during inference or to computational efficiency issues. Two imputation methods, iteration-KNN imputation and maximum entropy imputation, are proposed; both are relatively simple and yield reasonable results. In summary, the main contributions of this dissertation are the development of a spatial-temporal autologistic regression model with centered parameterization, and the proposal of EMPL, MCEML, and Bayesian inference to estimate the model parameters. 
Also, the iteration-KNN and maximum entropy imputation methods generate reliable imputed values for spatial-temporal missing data with reasonably efficient imputation times.
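The centering idea above can be sketched as follows: the autocovariate sums centered neighbor values so that the dependence term vanishes under independence. This is a minimal sketch of one conditional probability, with illustrative function names; the dissertation's full spatial-temporal model and its inference procedures are not reproduced here.

```python
from math import exp

def independence_expectation(xbeta):
    """mu_j: expectation of y_j under independence (dependence parameter = 0)."""
    return 1.0 / (1.0 + exp(-xbeta))

def conditional_prob(xbeta_i, neighbor_ys, neighbor_xbetas, eta):
    """P(y_i = 1 | neighbors) under a centered autologistic parameterization.

    The autocovariate sums *centered* neighbor values y_j - mu_j, so that
    eta = 0 recovers ordinary logistic regression and positive dependence
    does not bias the realizations toward 1.
    """
    centered = sum(y - independence_expectation(xb)
                   for y, xb in zip(neighbor_ys, neighbor_xbetas))
    z = xbeta_i + eta * centered
    return 1.0 / (1.0 + exp(-z))
```

With eta = 0 the neighbor term drops out entirely, which is exactly the interpretability property the centered parameterization is designed to give.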
15

Linear programming algorithms for detecting separated data in binary logistic regression models

Konis, Kjell Peter January 2007 (has links)
This thesis is a study of the detection of separation among the sample points in binary logistic regression models. We propose a new algorithm for detecting separation and demonstrate empirically that it can be computed fast enough to be used routinely as part of the fitting process for logistic regression models. The parameter estimates of a binary logistic regression model fit using the method of maximum likelihood sometimes do not converge to finite values. This phenomenon (also known as monotone likelihood or infinite parameters) occurs because of a condition among the sample points known as separation. There are two classes of separation. When complete separation is present among the sample points, iterative procedures for maximizing the likelihood tend to break down, making it clear that there is a problem with the model. However, when quasicomplete separation is present among the sample points, the iterative procedures for maximizing the likelihood tend to satisfy their convergence criterion before revealing any indication of separation. The new algorithm is based on a linear program with a nonnegative objective function that has a positive optimal value when separation is present among the sample points. We compare several approaches for solving this linear program and find that a method based on determining the feasibility of the dual to this linear program provides a numerically reliable test for separation among the sample points. A simulation study shows that this test can be computed in a similar amount of time as fitting the binary logistic regression model using the method of iteratively reweighted least squares; hence the test is fast enough to be used routinely as part of the fitting procedure. An implementation of our algorithm (as well as the other methods described in this thesis) is available in the R package safeBinaryRegression.
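The core linear-programming idea can be sketched directly: maximize the sum of the signed linear predictors subject to each being nonnegative, and declare separation when the optimum is positive. This is a simplified sketch (in Python with SciPy rather than the thesis's R implementation); the boundedness trick, tolerance, and test data are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def detect_separation(X, y, tol=1e-8):
    """Detect (complete or quasicomplete) separation in binary regression data.

    X: (n, p) design matrix (include an intercept column); y: 0/1 responses.
    Maximizes sum_i (2*y_i - 1) * x_i' beta subject to each term being
    nonnegative, with beta bounded to keep the LP finite; a positive
    optimum indicates separation.
    """
    A = (2 * np.asarray(y).reshape(-1, 1) - 1) * np.asarray(X, float)
    c = -A.sum(axis=0)                      # linprog minimizes, so negate
    res = linprog(c, A_ub=-A, b_ub=np.zeros(len(A)),
                  bounds=[(-1, 1)] * A.shape[1], method="highs")
    return res.fun < -tol                   # original objective > 0

# Completely separated data (all 0's below x = 2.5, all 1's above) vs.
# overlapping data on the same design.
X = np.column_stack([np.ones(4), [1.0, 2.0, 3.0, 4.0]])
print(detect_separation(X, [0, 0, 1, 1]))  # separated
print(detect_separation(X, [0, 1, 0, 1]))  # overlapping
```

For the overlapping responses the only feasible direction is beta = 0, so the optimum is exactly zero and no separation is reported.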
16

THE APPLICATION OF LAST OBSERVATION CARRIED FORWARD (LOCF) IN THE PERSISTENT BINARY CASE

He, Jun 01 January 2014 (has links)
The main purpose of this research was to evaluate the use of Last Observation Carried Forward (LOCF) as an imputation method when persistent binary outcomes are missing in a randomized controlled trial. A simulation study was performed to see the effect of the dropout rate and the type of dropout (random or associated with treatment arm) on Type I error and power. Properties of the estimated event rates, treatment effect, and bias were also assessed. LOCF was also compared to two versions of complete case analysis: Complete1 (excluding all observations with missing data) and Complete2 (only carrying forward observations if the event is observed to occur). LOCF was not recommended because of its bias: Type I error was increased and power was decreased. The other two analyses also had poor properties. LOCF analysis was applied to a mammogram dataset, with results similar to the simulation study.
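The LOCF rule evaluated above is mechanically simple, which is part of its appeal despite the bias the study documents. A minimal sketch, assuming `None` marks a missed visit in one subject's outcome series:

```python
def locf_impute(series):
    """Impute missing persistent binary outcomes by Last Observation
    Carried Forward; None marks a missing visit.

    Values before the first observed outcome are left missing, since
    there is nothing to carry forward.
    """
    imputed, last = [], None
    for v in series:
        if v is not None:
            last = v       # remember the most recent observed outcome
        imputed.append(last)
    return imputed

print(locf_impute([0, 0, 1, None, None]))  # [0, 0, 1, 1, 1]
```

Note that for a *persistent* outcome (one that cannot revert from 1 to 0), carrying a 1 forward is structurally plausible, yet the simulation above still finds the resulting estimates biased.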
17

Modelos bayesianos semi-paramétricos para dados binários / Bayesian semi-parametric models for binary data

Diniz, Márcio Augusto 11 June 2015 (has links)
This work proposes semi-parametric Bayesian models for binary data. The first model is a scale mixture that handles discrepancies related to the kurtosis of the logistic model. It is a relevant extension of the model proposed by Basu and Mukhopadhyay (2000) because it allows the prior distribution of the parameters to be interpreted through odds ratios. The second model combines the scale mixture with the transformation proposed by Yeo and Johnson (2000), so that both kurtosis and asymmetry can be adjusted and an informative asymmetry parameter can be estimated. 
This transformation is much more appropriate for dealing with negative values than the Box and Cox (1964) transformation used by Guerrero and Johnson (1982), and it is simpler than the model proposed by Stukel (1988). Finally, the third model is the most general of the three: a location-scale mixture that can describe kurtosis, asymmetry, and also bimodality. The model proposed by Newton et al. (1996), although quite general, does not allow a tangible interpretation of the prior distribution for applied researchers. The models are evaluated through the Cramér-von Mises, Kolmogorov-Smirnov, and Anderson-Darling probability distance measures, and also through the Conditional Predictive Ordinates.
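The Yeo-Johnson transformation the second model relies on has a closed form that is easy to state; the sketch below implements it directly from the published definition (the function name is illustrative, not from the thesis).

```python
from math import log

def yeo_johnson(x, lam):
    """Yeo-Johnson power transformation (Yeo and Johnson, 2000).

    Unlike the Box-Cox transform, it is defined for negative x, which is
    why the abstract prefers it for modelling asymmetry on the real line.
    """
    if x >= 0:
        # Box-Cox-like branch on [0, inf); limit log(x + 1) at lam = 0.
        return log(x + 1.0) if lam == 0 else ((x + 1.0) ** lam - 1.0) / lam
    # Mirrored branch on (-inf, 0) with exponent 2 - lam.
    return (-log(1.0 - x) if lam == 2
            else -((1.0 - x) ** (2.0 - lam) - 1.0) / (2.0 - lam))
```

At lam = 1 the transformation reduces to the identity on the whole real line, which is a convenient sanity check.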
18

Optimal Design and Inference for Correlated Bernoulli Variables using a Simplified Cox Model

Bruce, Daniel January 2008 (has links)
This thesis proposes a simplification of the model for dependent Bernoulli variables presented in Cox and Snell (1989). The simplified model, referred to as the simplified Cox model, is developed for identically distributed and dependent Bernoulli variables.

Properties of the model are presented, including expressions for the log-likelihood function and the Fisher information. The special case of a bivariate symmetric model is studied in detail. For this particular model, it is found that the number of design points in a locally D-optimal design is determined by the log-odds ratio between the variables. Under mutual independence, both a general expression for the restrictions of the parameters and an analytical expression for locally D-optimal designs are derived.

Focusing on the bivariate case, score tests and likelihood ratio tests are derived to test for independence. Numerical illustrations of these test statistics are presented in three examples. In connection with testing for independence, an E-optimal design for maximizing the local asymptotic power of the score test is proposed.

The simplified Cox model is applied to a dental data set. Based on the estimates of the model, optimal designs are derived. The analysis shows that these optimal designs yield considerably more precise parameter estimates compared to the original design. The original design is also compared against the E-optimal design with respect to the power of the score test. For most alternative hypotheses the E-optimal design provides larger power than the original design.
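The log-odds ratio that governs the number of D-optimal design points here is a standard dependence measure for a pair of Bernoulli variables; a minimal sketch, computing it from the four cell probabilities of an assumed 2x2 joint distribution (not the thesis's own parameterization):

```python
from math import log

def log_odds_ratio(p11, p10, p01, p00):
    """Log-odds ratio of a pair of dependent Bernoulli variables,
    given the four cell probabilities of their joint distribution.

    Zero corresponds to independence; the sign gives the direction
    of the association.
    """
    return log((p11 * p00) / (p10 * p01))
```

Under independence all four cells factor into the marginals, the cross-product ratio is 1, and the log-odds ratio is exactly zero.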
20

Optimal designs for statistical inferences in nonlinear models with bivariate response variables

Hsu, Hsiang-Ling 27 January 2011 (has links)
Bivariate or multivariate correlated data may be collected on a sample of units in many applications. When the experimenters are concerned with the failure times of two related subjects, for example paired organs or two chronic diseases, bivariate binary data are often acquired. This type of data consists of an observation point x and indicators of whether each failure time occurred before or after the observation point. In this work, the observed bivariate data can be written in the form {x, δ₁ = I(X₁ ≤ x), δ₂ = I(X₂ ≤ x)}. The corresponding optimal design problems for parameter estimation under this type of bivariate data are discussed. For such multivariate responses with explanatory variables, the marginal distributions may come from different families. A copula model is a way to formulate the relationship between these responses and the association between pairs of responses. Copula models for bivariate binary data are considered useful in practice due to their flexibility. In this dissertation the marginal functions are assumed to be exponential or Weibull distributions, and two assumptions about the joint function, independence or correlation, are considered. When the bivariate binary data are assumed correlated, the Clayton copula is used as the joint cumulative distribution function. Few works have addressed optimal design problems for bivariate binary data with copula models. The D-optimal designs aim at minimizing the volume of the confidence ellipsoid for estimating the unknown parameters, including the association parameter, in bivariate copula models; they are used to determine the best observation points. Moreover, the Ds-optimal designs are mainly used for estimation of the important association parameter in the Clayton model. The D- and Ds-optimal designs for the above copula models are found through the general equivalence theorem with a numerical algorithm. 
Under the different model assumptions, the numerical results show that the number of support points of a D-optimal design is at most the number of model parameters. When the differences between the marginal distributions and the association are significant, the association becomes an influential factor that increases the number of support points. Simulation studies show that estimation based on the optimal designs performs reasonably well. In survival experiments, the experimenter customarily takes trials at specific points such as the 25th, 50th, and 75th percentiles of the distributions; hence we consider the design efficiencies when the design points are placed at three or four particular percentiles. Although it is common in practice to take trials at several quantile positions, the allocation of the sample-size proportions also has a great influence on the experimental results. To use a locally optimal design in practice, prior information about the model or parameters is needed. When there is not enough prior knowledge, it is more flexible to use sequential experiments that gather information in several stages. Hence, with robustness in mind, a sequential procedure is proposed that combines D- and Ds-optimal designs under the independent or correlated distributions in different stages of the experiment. The simulation results based on the sequential procedure are compared with those of one-step procedures. Optimal designs obtained from incorrect prior parameter values or distributions may have poor efficiencies, whereas the sample means of the estimators and the corresponding optimal designs obtained from the sequential procedure are close to the true values, with efficiencies close to 1. Huster (1989) analyzed the corresponding modeling problems for paired survival data and applied them to the Diabetic Retinopathy Study, 
considering the exponential and Weibull distributions as possible marginal distributions and the Clayton model as the joint function for the Diabetic Retinopathy data. That study was conducted by the National Eye Institute to assess the effectiveness of laser photocoagulation in delaying the onset of blindness in patients with diabetic retinopathy; it can be viewed as a prior experiment that provides the experimenter useful guidelines for collecting data in future studies. As an application to the Diabetic Retinopathy Study, we develop optimal designs to collect suitable data and information for estimating the unknown model parameters. In the second part of this work, optimal design problems for parameter estimation are considered for proportional data. The dispersion model, based on Jorgensen (1997), provides a flexible class of non-normal distributions and is considered in this research; it can be applied to binary or count responses as well as proportional outcomes. For continuous proportional data, where responses are confined to the interval (0,1), the simplex dispersion model is considered here. D-optimal designs obtained through the corresponding equivalence theorem and the numerical results are presented. In the development of classical optimal design theory, weighted polynomial regression models with variance functions that depend on the explanatory variable have played an important role; the problem of constructing locally D-optimal designs for the simplex dispersion model can be viewed as a weighted polynomial regression problem with a specific variance function. Because the weight function in the information matrix takes the complex form of a rational function, an approximation of the weight function is used, and the corresponding optimal designs are obtained for different parameter values. These optimal designs are compared with those using the original weight function.
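The Clayton copula construction described above has a simple closed form, so the joint probability that both binary indicators equal 1 at an observation point x can be sketched directly. The marginal rates and association parameter below are illustrative assumptions, not values from the dissertation.

```python
from math import exp

def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta),
    for association parameter theta > 0 (larger theta, stronger dependence)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def joint_failure_prob(x, rate1, rate2, theta):
    """P(X1 <= x, X2 <= x): probability that both indicator responses
    equal 1 at observation point x, with exponential marginals joined
    by a Clayton copula."""
    u = 1.0 - exp(-rate1 * x)   # marginal CDF of X1 at x
    v = 1.0 - exp(-rate2 * x)   # marginal CDF of X2 at x
    return clayton_cdf(u, v, theta)
```

Since C(1, v) = v for any theta, the copula is consistent with its marginals, and the joint probability is always bounded above by the smaller marginal probability.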
