1 |
Choosing the proper link function for binary data
Li, Jingwei, active 21st century 08 October 2014
Generalized linear models (GLMs) with a binary response variable are widely used in many disciplines, and much effort has gone into constructing well-fitting models. However, little attention has been paid to the link function, which plays a critical role in a GLM. In this article, we compared three link functions and evaluated different model selection methods based on these three link functions. We also provided some suggestions on how to choose the proper link function for binary data.
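The abstract does not name the three link functions it compares. As a minimal sketch, assuming the three most common choices for binary data (logit, probit and complementary log-log), the inverse links below map a linear predictor eta to a success probability. The function names are illustrative and are not taken from the thesis.

```python
import math


def inv_logit(eta):
    """Inverse logit link: symmetric around eta = 0."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse probit link: the standard normal CDF, also symmetric."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse complementary log-log link: asymmetric around eta = 0."""
    return 1.0 - math.exp(-math.exp(eta))


# Assumed set of candidate links; the thesis may have used others.
INVERSE_LINKS = {
    "logit": inv_logit,
    "probit": inv_probit,
    "cloglog": inv_cloglog,
}

if __name__ == "__main__":
    for eta in (-2.0, 0.0, 2.0):
        row = ", ".join(
            f"{name}={fn(eta):.4f}" for name, fn in INVERSE_LINKS.items()
        )
        print(f"eta={eta:+.1f}: {row}")
```

The logit and probit links give a probability of 0.5 at eta = 0, while the complementary log-log link does not, which is one reason the choice of link can matter when the two outcomes are not treated symmetrically.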
|
2 |
Statistical discrimination with disease categories subject to misclassification
Hilliam, Rachel M. January 2000
No description available.
|
3 |
Model robust designs for binary response experiments
Huang, Shih-hao 06 July 2004
Binary response experiments are used in many areas. Many investigations discuss optimal designs under an assumed model, and some discuss optimal designs for discriminating between models. The main goal of this work is to find an optimal design with two support points that minimizes the maximal probability difference between possible models from two types of symmetric location and scale families. It is called the minimum bias two-points design, or the $mB_2$ design for short. The D- and A-efficiencies of the $mB_2$ design are evaluated under an assumed model. Furthermore, when the assumed model is incorrect, the biases and mean square errors in estimating the true probabilities are computed and compared with those obtained using the D- and A-optimal designs for the incorrectly assumed model.
|
4 |
Limited Dependent Variable Correlated Random Coefficient Panel Data Models
Liang, Zhongwen 2012 August 1900
In this dissertation, I consider linear and binary response correlated random coefficient (CRC) panel data models and a truncated CRC panel data model, which are frequently used in economic analysis. I focus on the nonparametric identification and estimation of panel data models with unobserved heterogeneity that is captured by random coefficients, when these random coefficients are correlated with the regressors.
For the analysis of linear CRC models, I give the identification conditions for the average slopes of a linear CRC model with a general nonparametric correlation between regressors and random coefficients. I construct a $\sqrt{n}$-consistent estimator for the average slopes via varying coefficient regression.
The identification of binary response panel data models with unobserved heterogeneity is difficult. I base the identification conditions and estimation on the framework of the model with a special regressor, a major approach proposed by Lewbel (1998, 2000) to solve the heterogeneity and endogeneity problems in binary response models. With the help of the additional information on the special regressor, I can transform a binary response CRC model into a linear moment relation. I also construct a semiparametric estimator for the average slopes and derive the $\sqrt{n}$-normality result.
For the truncated CRC panel data model, I obtain the identification and estimation results based on the special regressor method used in Khan and Lewbel (2007). I construct a $\sqrt{n}$-consistent estimator for the population mean of the random coefficient. I also derive the asymptotic distribution of my estimator.
Simulations are given to show the finite sample advantage of my estimators. Further, I use a linear CRC panel data model to reexamine the return to job training. The results show that my estimation method makes a substantial difference: the return to training estimated by my method is 7 times the one estimated without considering the correlation between the covariates and the random coefficients. On average, the rate of return to job training is 3.16% per 60 hours of training.
|
5 |
Three Essays on Comparative Simulation in Three-level Hierarchical Data Structure
January 2017
Though the likelihood is a useful tool for obtaining estimates of regression parameters, it is not readily available when fitting hierarchical binary data models. The correlated observations rule out a joint likelihood when fitting hierarchical logistic regression models. Inferences for the regression and covariance parameters, as well as for the intraclass correlation coefficients, are usually obtained through conditional likelihood. In those cases, I have resorted to the Laplace approximation and a large sample theory approach for point and interval estimates, such as Wald-type confidence intervals and profile likelihood confidence intervals. These methods rely on distributional assumptions and large sample theory. However, when dealing with small hierarchical datasets they often result in severe bias or non-convergence. I present a generalized quasi-likelihood approach and a generalized method of moments approach; neither relies on any distributional assumptions, only on moments of the response. As an alternative to the typical large sample theory approach, I present bootstrapping of hierarchical logistic regression models, which provides more accurate interval estimates for small binary hierarchical data. These models use computation in place of the traditional Wald-type and profile likelihood confidence intervals. I use a latent variable approach with a new split bootstrap method for estimating intraclass correlation coefficients when analyzing binary data obtained from a three-level hierarchical structure. It is especially useful with small sample sizes and is easily extended to more levels. Comparisons are made to existing approaches through both theoretical justification and simulation studies.
Further, I demonstrate my findings through an analysis of three numerical examples: one based on cancer remission data, one related to China's antibiotic abuse study, and a third related to teacher effectiveness in schools from a state in the southwestern US. / Dissertation/Thesis / Doctoral Dissertation Statistics 2017
|
6 |
A Simulation Study On Marginalized Transition Random Effects Models For Multivariate Longitudinal Binary Data
Yalcinoz, Zerrin 01 May 2008
In this thesis, a simulation study is conducted and a statistical model is fitted to the simulated data. The data are assumed to represent the satisfaction of customers who withdraw their salary from a particular bank. They are longitudinal data with a bivariate binary response, assumed to be collected from 200 individuals at four different time points. In such data sets, two types of dependence are important: the dependence among measurements within a subject and the dependence between responses. Both are considered in the model. The model is a marginalized transition random effects model, which has three levels. The first level measures the effect of covariates on the responses, the second level accounts for temporal changes, and the third level measures the differences between individuals. Markov chain Monte Carlo methods are used to fit the model. In the simulation study, the differences between the estimated values and the true parameters are examined under two conditions: when the model is correctly specified and when it is not. The results suggest that better convergence is obtained with the full model. The third level, which captures individual differences, is more sensitive to model misspecification than the other levels of the model.
|
7 |
Essays on forecast evaluation and financial econometrics
Lund-Jensen, Kasper January 2013
This thesis consists of three papers that make independent contributions to the fields of forecast evaluation and financial econometrics. As such, the papers, Chapters 1-3, can be read independently of each other. In Chapter 1, “Inferring an agent’s loss function based on a term structure of forecasts”, we provide conditions for identification, estimation and inference of an agent’s loss function based on an observed term structure of point forecasts. The loss function specification is flexible, as we allow the preferences to be asymmetric and to vary non-linearly across the forecast horizon. In addition, we introduce a novel forecast rationality test based on the estimated loss function. We employ the approach to analyse the U.S. Government’s preferences over budget surplus forecast errors. Interestingly, we find that it is relatively more costly for the government to underestimate the budget surplus and that this asymmetry is stronger at long forecast horizons. In Chapter 2, “Monitoring Systemic Risk”, we define systemic risk as the conditional probability of a systemic banking crisis. This conditional probability is modelled in a fixed effect binary response panel-model framework that allows for cross-sectional dependence (e.g. due to contagion effects). In the empirical application we identify several risk factors and show that the level of systemic risk contains a predictable component which varies through time. Furthermore, we illustrate how the forecasts of systemic risk map into dynamic policy thresholds in this framework. By conducting a pseudo out-of-sample exercise, we also find that the systemic risk estimates provided reliable early-warning signals ahead of the recent financial crisis for several economies. Finally, in Chapter 3, “Equity Premium Predictability”, we reassess the evidence of out-of-sample equity premium predictability.
The empirical finance literature has identified several financial variables that appear to predict the equity premium in-sample. However, Welch & Goyal (2008) find that none of these variables has any predictive power out-of-sample. We show that the equity premium is predictable out-of-sample once certain shrinkage restrictions are imposed on the model parameters. The approach is motivated by the observation that many of the proposed financial variables can be characterised as ‘weak predictors’, which suggests that a James-Stein type estimator will provide a substantial risk reduction. The out-of-sample explanatory power is small, but we show that it is, in fact, economically meaningful to an investor with time-invariant risk aversion. Using a shrinkage decomposition we also show that standard combination forecast techniques tend to ‘overshrink’ the model parameters, leading to suboptimal model forecasts.
|
8 |
Inferência em um modelo de regressão com resposta binária na presença de sobredispersão e erros de medição / Inference in a regression model with overdispersed binary response and measurement errors
Tieppo, Sandra Maria 15 February 2007
Regression models with binary response are used to solve problems in many areas. In this work we address two problems that are common in certain data sets and that require appropriate techniques to yield satisfactory inference. First, in some applications the same sampling unit is used more than once, producing positively correlated responses, which make the variance of the response variable greater than the binomial distribution allows, a phenomenon known as overdispersion. Second, there are situations in which the explanatory variable contains measurement errors. It is known that techniques which ignore these measurement errors lead to inadequate results (e.g., biased and inconsistent estimators). Considering a model with binary response, we use the beta-binomial distribution to represent the overdispersion. Maximum likelihood, SIMEX, regression calibration and maximum pseudo-likelihood were used to estimate the model parameters, and the methods are compared through a simulation study. The simulation study suggests that the maximum likelihood and regression calibration methods are better at correcting bias, especially for the larger sample sizes (50 and 100). Asymptotic hypothesis tests (likelihood ratio, Wald and score) are used to test hypotheses of interest. We also illustrate the techniques with an application to a real data set.
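To make the overdispersion idea concrete, here is a small sketch that computes the exact mean and variance of a beta-binomial count and compares the variance with the plain binomial one. The parameter values (n = 10, a = 2, b = 3) are arbitrary illustrations and do not come from the thesis. The check relies on the standard identity Var = n p (1 - p) [1 + (n - 1) rho], with p = a / (a + b) and rho = 1 / (a + b + 1).

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function, via log-gamma for stability."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a beta-binomial count with n trials and Beta(a, b) mixing."""
    return math.comb(n, k) * math.exp(log_beta(k + a, n - k + b) - log_beta(a, b))


def beta_binomial_moments(n, a, b):
    """Exact mean and variance obtained by summing over the support 0..n."""
    probs = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


# Illustrative values only (not taken from the thesis).
n, a, b = 10, 2.0, 3.0
p = a / (a + b)                    # marginal success probability
rho = 1.0 / (a + b + 1.0)          # within-unit correlation
mean, var = beta_binomial_moments(n, a, b)
binomial_var = n * p * (1.0 - p)   # variance if responses were independent
inflation = 1.0 + (n - 1) * rho    # overdispersion factor

if __name__ == "__main__":
    print(f"mean={mean:.4f}, beta-binomial var={var:.4f}")
    print(f"binomial var={binomial_var:.4f}, inflation factor={inflation:.4f}")
```

With positive correlation rho the inflation factor exceeds 1, so the beta-binomial variance is larger than the binomial variance for the same n and p.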
|
9 |
Metodologia de previsão de recessões: um estudo econométrico com aplicações de modelos de resposta binária / Recession forecasting methodology: an econometric study with applications of binary response models
Saúde, Arthur Moreira 31 March 2017
This paper aims to develop an econometric model capable of anticipating recessions in the United States economy one year in advance, using not only money market variables, which are leading indicators already widely used by economists, but also capital market variables. Using data from 1959 to 2016, we observe that the spread between long- and short-term interest rates (the yield spread) continues to be an explanatory variable with excellent predictive power over recessions. Evidence also emerged of new variables that have very high statistical significance and that offer valuable contributions to the regressions. Out-of-sample tests suggest that past recessions would have been predicted with substantially higher accuracy if the proposed Probit model had been used instead of the most widespread model in the economic literature. This accuracy is evident not only in the predictive quality, but also in the reduction of the number of false positives and false negatives in the regression, and in the robustness of the out-of-sample tests.
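As a rough illustration of how a probit recession model turns a yield spread into a probability, and of how false positives and false negatives are counted, consider the sketch below. The coefficients, spreads and outcomes are hypothetical placeholders chosen for the example; they are not estimates or data from the dissertation, and the single-regressor form is an assumption.

```python
import math


def probit_probability(spread, intercept, slope):
    """P(recession) = Phi(intercept + slope * spread), Phi = standard normal CDF."""
    z = intercept + slope * spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confusion_counts(probs, outcomes, threshold=0.5):
    """Return (false positives, false negatives) at the given threshold."""
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, outcomes) if p < threshold and y == 1)
    return fp, fn


# Hypothetical coefficients: a negative slope means a lower (or inverted)
# yield spread raises the predicted recession probability.
INTERCEPT, SLOPE = -0.5, -0.8

# Hypothetical yield spreads (percentage points) and recession indicators
# observed one year later.
spreads = [2.0, 1.0, 0.0, -1.0]
outcomes = [0, 0, 1, 1]

probs = [probit_probability(s, INTERCEPT, SLOPE) for s in spreads]
false_pos, false_neg = confusion_counts(probs, outcomes)

if __name__ == "__main__":
    for s, p, y in zip(spreads, probs, outcomes):
        print(f"spread={s:+.1f}  P(recession)={p:.3f}  actual={y}")
    print(f"false positives={false_pos}, false negatives={false_neg}")
```

Comparing such counts across competing specifications, on data held out from estimation, is one simple way to read the abstract's claim about fewer false positives and false negatives.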
|