1 |
Specification, estimation and testing of treatment effects in multinomial outcome models: accommodating endogeneity and inter-category covariance
Tang, Shichao 18 June 2018
Indiana University-Purdue University Indianapolis (IUPUI) / In this dissertation, a potential outcomes (PO) based framework is developed for
causally interpretable treatment effect parameters in the multinomial dependent variable
regression framework. The specification of the relevant data generating process (DGP) is
also derived. This new framework simultaneously accounts for the potential endogeneity
of the treatment and loosens inter-category covariance restrictions on the multinomial
outcome model (e.g., the independence from irrelevant alternatives restriction).
Corresponding consistent estimators for the “deep parameters” of the DGP and the
treatment effect parameters are developed and implemented (in Stata). A novel approach
is proposed for assessing the inter-category covariance flexibility afforded by a particular
multinomial modeling specification [e.g., multinomial logit (MNL), multinomial probit
(MNP), and nested multinomial logit (NMNL)] in the context of our general framework.
This assessment technique can serve as a useful tool for model selection. The new
modeling/estimation approach developed in this dissertation is quite general. I focus here,
however, on the NMNL model because, among the three modeling specifications under
consideration (MNL, MNP and NMNL), it is the only one that is both computationally
feasible and relatively unrestrictive with regard to inter-category covariance. Moreover,
as a logical starting point, I restrict my analyses to the simplest version of the model – the
trinomial (three-category) NMNL with an endogenous treatment (ET) variable conditioned
on individual-specific covariates only. To identify potential computational issues and to
assess the statistical accuracy of my proposed NMNL-ET estimator and its implementation
(in Stata), I conducted a thorough simulation analysis. I found that conventional
optimization techniques are, in this context, generally fraught with convergence problems.
To overcome this, I implemented a systematic line search algorithm, which resolved
the issue. The simulation results suggest that it is important to accommodate both
endogeneity and inter-category covariance simultaneously in model design and estimation.
As an illustration and as a basis for comparing alternative parametric specifications with
respect to ease of implementation, computational efficiency and statistical performance,
the proposed model and estimation method are used to analyze the impact of substance
abuse/dependence on employment status using the National Epidemiologic Survey on
Alcohol and Related Conditions (NESARC) data.
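
The independence from irrelevant alternatives (IIA) restriction mentioned above can be illustrated with a minimal sketch. It is not the dissertation's Stata implementation: the function name `mnl_probs` and the utility values are hypothetical, chosen only to show that under a plain multinomial logit the odds between two categories do not depend on the third.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities: a softmax over systematic utilities."""
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical utilities for a three-category (trinomial) outcome.
base = mnl_probs([1.0, 0.2, -0.5])
# Change only the utility of the third category.
shifted = mnl_probs([1.0, 0.2, 2.0])

# Under MNL the odds of category 0 versus category 1 equal
# exp(u0 - u1), whatever happens to category 2.
odds_base = base[0] / base[1]
odds_shifted = shifted[0] / shifted[1]
print(odds_base, odds_shifted)
```

The two printed odds coincide, which is the rigidity that less restrictive specifications such as the nested multinomial logit are meant to relax.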
|
2 |
Exact Approaches for Bias Detection and Avoidance with Small, Sparse, or Correlated Categorical Data
Schwartz, Sarah E. 01 December 2017
Every day, traditional statistical methods are used worldwide to study a variety of topics and provide insight into countless subjects. Each technique is based on a distinct set of assumptions to ensure valid results. Additionally, many statistical approaches rely on large-sample behavior and may collapse or degenerate in the presence of small, sparse, or correlated data. This dissertation details several advancements to detect these conditions, avoid their consequences, and analyze data in a different way to yield trustworthy results.
One of the most commonly used modeling techniques for outcomes with only two possible categorical values (e.g., live/die, pass/fail, better/worse, etc.) is logistic regression. While some potential complications with this approach are widely known, many investigators are unaware that their particular data do not meet the foundational assumptions, since these are not easy to verify. We have developed a routine for determining whether a researcher should be concerned about potential bias in logistic regression results, so they can take steps to mitigate the bias or use a different procedure altogether to model the data.
Correlated data may arise from common situations such as multi-site medical studies, research on family units, or investigations of student achievement within classrooms. In these circumstances, the associations between cluster members must be included in any statistical analysis testing the hypothesis of a connection between two variables in order for results to be valid.
Previously, investigators had to choose between a method intended for small or sparse data that assumed independence between observations and a method that allowed for correlation between observations but required large samples to be reliable. We present a new method that allows small, clustered samples to be assessed for a relationship between a two-level predictor (e.g., treatment/control) and a categorical outcome (e.g., low/medium/high).
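
For context, the first of the two older options (an exact test for small samples that assumes independent observations) can be sketched as follows. This is not the new clustered method described in the abstract. The function names and the eight-observation data set are hypothetical, and the Pearson chi-square statistic is one assumed choice of test statistic among several.

```python
from itertools import combinations

def chi2_stat(groups, outcomes, levels):
    """Pearson chi-square statistic for the 2 x K table of group by outcome."""
    n = len(groups)
    stat = 0.0
    for g in (0, 1):
        n_g = sum(1 for x in groups if x == g)
        for k in levels:
            n_k = sum(1 for y in outcomes if y == k)
            observed = sum(1 for x, y in zip(groups, outcomes) if x == g and y == k)
            expected = n_g * n_k / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def exact_permutation_pvalue(groups, outcomes):
    """Enumerate every reassignment of the treatment labels (valid only if
    observations are independent) and return the share of reassignments whose
    statistic is at least as large as the observed one."""
    levels = sorted(set(outcomes))
    n = len(groups)
    n_treated = sum(groups)
    observed = chi2_stat(groups, outcomes, levels)
    count = total = 0
    for treated in combinations(range(n), n_treated):
        treated_set = set(treated)
        perm = [1 if i in treated_set else 0 for i in range(n)]
        total += 1
        # Small tolerance guards against floating-point ties.
        if chi2_stat(perm, outcomes, levels) >= observed - 1e-12:
            count += 1
    return count / total

# Hypothetical data: 1 = treatment, 0 = control.
groups = [1, 1, 1, 1, 0, 0, 0, 0]
outcomes = ["high", "high", "high", "medium", "low", "low", "low", "medium"]
p_value = exact_permutation_pvalue(groups, outcomes)
print(p_value)
```

Because the enumeration treats every observation as exchangeable with every other, it ignores within-cluster correlation, which is the gap the abstract says the new method addresses.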
|