1 |
Estimating Individual Causal Effects / Lam, Patrick Kenneth / 18 October 2013
Most empirical work focuses on the estimation of average treatment effects (ATE). In this dissertation, I argue for a different way of thinking about causal inference by estimating individual causal effects (ICEs). I argue that focusing on estimating ICEs allows for a more precise and clear understanding of causal inference, reconciles the difference between what the researcher is interested in and what the researcher estimates, allows the researcher to explore and discover treatment effect heterogeneity, bridges the quantitative-qualitative divide, and allows for easy estimation of any other causal estimand. / Government
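The relation between ICEs and the ATE described above can be illustrated with a minimal sketch. The potential outcomes below are hypothetical numbers chosen for illustration; in real data only one of the two outcomes is observed for each unit, which is why ICEs have to be estimated rather than computed directly.

```python
from statistics import mean

# Hypothetical potential outcomes for four units (illustrative values only).
# In practice, only one of the two outcomes is ever observed per unit.
y_treated = [5.0, 3.0, 4.0, 6.0]  # outcome if the unit receives treatment
y_control = [2.0, 3.0, 5.0, 2.0]  # outcome if the unit does not


def individual_effects(y1, y0):
    """Return the individual causal effect y1 - y0 for each unit."""
    return [a - b for a, b in zip(y1, y0)]


ice = individual_effects(y_treated, y_control)  # [3.0, 0.0, -1.0, 4.0]
ate = mean(ice)                                 # 1.5

print("ICEs:", ice)
print("ATE:", ate)
```

Here the ATE of 1.5 hides the fact that one unit is unaffected and one is harmed, which is the kind of treatment effect heterogeneity the abstract refers to.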
|
2 |
Propensity Score for Causal Inference of Multiple and Multivalued Treatments / Gu, Zirui / 01 January 2016
Propensity score methods (PSM), which have been widely used to reduce selection bias in observational studies, are restricted to a binary treatment. Imai and van Dyk extended PSM to estimate non-binary treatment effects using stratification on the P-Function and generalized inverse treatment probability weighting (GIPTW). However, propensity score (PS) matching methods for multiple treatments have received little attention, and existing generalized PSMs have focused on estimates of main treatment effects while omitting potential interaction effects that are of essential interest in many studies. In this dissertation, I extend Rubin’s PS matching theory to general treatment regimens under the P-Function framework. Moving from theory to practice, I propose an innovative distance measure that can summarize similarities among subjects in multiple treatment groups. Based on this distance measure, I propose four generalized propensity score matching methodologies. The first two methods are extensions of nearest neighbor matching. I implement Monte Carlo simulation studies to compare them with GIPTW and stratification on P-Function methods. The next two methods are extensions of nearest neighbor caliper width matching and variable matching. I define the caliper width as the product of a weighted standard deviation of all possible pairwise distances between two treatment groups. I conduct a series of simulation studies to determine an optimal caliper width by searching for the lowest mean square error of the average causal interaction effect. I further compare the methods using the optimal caliper width with other methods using simulations. Finally, I apply these methods to the National Medical Expenditure Survey data to examine the average causal main effects of duration and frequency of smoking as well as their interaction effect on annual medical expenditures.
Using the proposed methods, researchers can apply regression models with specified interaction terms to the matched data and simultaneously obtain estimates of both main and interaction effects with improved statistical properties.
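For readers unfamiliar with caliper matching, the following is a minimal sketch of the standard binary-treatment case, not the generalized multi-treatment method proposed in the dissertation. It assumes propensity scores have already been estimated; the function name `caliper_match`, the unit ids, and the scores are hypothetical.

```python
def caliper_match(treated, control, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping a unit id to its propensity score.
    caliper: maximum allowed score distance for a valid match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used in a match
    pairs = []
    # Process treated units in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control unit on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
        # Otherwise the treated unit stays unmatched.
    return pairs


# Hypothetical scores for illustration only.
treated_scores = {"t1": 0.30, "t2": 0.70, "t3": 0.95}
control_scores = {"c1": 0.32, "c2": 0.65, "c3": 0.50}

matches = caliper_match(treated_scores, control_scores, caliper=0.1)
print(matches)  # [('t1', 'c1'), ('t2', 'c2')]
```

Unit t3 is left unmatched because its nearest remaining control lies outside the caliper. A narrower caliper discards more units but yields closer matches, which is the trade-off an optimal caliper width is meant to balance.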
|
3 |
Applications of machine learning to agricultural land values: prediction and causal inference / Er, Emrah / January 1900
Doctor of Philosophy / Department of Agricultural Economics / Nathan P. Hendricks / This dissertation focuses on the prediction of agricultural land values and the effects of water rights on land values using machine learning algorithms and hedonic pricing methods. I predict agricultural land values with different machine learning algorithms, including ridge regression, least absolute shrinkage and selection operator, random forests, and extreme gradient boosting methods. To analyze the causal effects of water right seniority on agricultural land values, I use the double-selection LASSO technique.
The second chapter presents the data used in the dissertation. A unique set of parcel sales from the Property Valuation Division of Kansas constitutes the backbone of the data used in the estimation. Along with parcel sales data, I collected detailed basis, water, tax, soil, weather, and urban influence data. This chapter provides a detailed explanation of the various data sources and variable construction processes.
The third chapter presents different machine learning models for irrigated agricultural land price predictions in Kansas. Researchers and policymakers use different models and data sets for price prediction. Recently developed machine learning methods have the power to improve the predictive ability of the models estimated. In this chapter, I estimate several machine learning models for predicting agricultural land values in Kansas. Results indicate that the predictive power of the machine learning methods is stronger than that of standard econometric methods. The median absolute error in the extreme gradient boosting estimation is 0.1312, whereas it is 0.6528 in a simple OLS model.
The fourth chapter examines whether water right seniority is capitalized into irrigated agricultural land values in Kansas. Using a unique data set of irrigated agricultural land sales, I analyze the causal effect of water right seniority on agricultural land values. A possible concern in the estimation of hedonic models is omitted variable bias, so I use double-selection LASSO regression and its variable selection properties to overcome it. I also estimate generalized additive models to analyze the nonlinearities that may exist. Results show that water rights have a positive impact on irrigated land prices in Kansas. An additional year of water right seniority causes irrigated land value to increase by nearly $17 per acre. Further analysis also suggests a nonlinear relationship between seniority and agricultural land prices.
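The third chapter compares models by median absolute error. As a minimal sketch of how that metric is computed, assuming paired lists of observed and predicted values (the numbers below are hypothetical and unrelated to the dissertation's data):

```python
from statistics import median


def median_absolute_error(actual, predicted):
    """Median of the absolute prediction errors |actual - predicted|."""
    return median(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical observed and predicted values, for illustration only.
observed = [10.0, 12.0, 9.0, 11.0, 10.5]
forecast = [10.5, 11.0, 9.0, 13.0, 10.25]

# Absolute errors are 0.5, 1.0, 0.0, 2.0, 0.25, so the median is 0.5.
mae = median_absolute_error(observed, forecast)
print(mae)  # 0.5
```

Unlike the mean squared error, the median is insensitive to a few very large errors, which can matter for sales data with outlying prices.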
|
4 |
Empirical studies of online markets: the impact of product page cues on consumer decisions / Banerjee, Shrabastee / 14 May 2021
The widespread expansion of online markets in the past decade poses several questions for platforms, firms and customers alike. An important dimension to be explored in this domain is the provision of information on e-commerce platforms - given the increasing ease with which product pages can be customized to include a vast variety of content, how do these pieces of information interact? Further, what are the specific channels through which this information eventually influences consumer decision-making? My dissertation is situated in this space, and aims to look at how consumers respond to various “cues” that are being introduced by e-commerce platforms which offer products or services that can be purchased online, and how these cues might eventually influence decision-making. In my first dissertation project, the cue I focus on is user generated content. More specifically, I study how the introduction of the Q&A technology (which enables customers to ask product-specific questions before purchase, and receive answers either from other customers or the platform itself) affects the more widely established reviews and ratings feature on e-commerce platforms. I find that the addition of Q&As leads to better matches between customers and products, higher customer satisfaction, and consequently higher ratings. My second project examines another cue that is common in online markets, which is the advertised reference price. My goal in this project is to examine how users react to a specific variant of such prices, namely the “Starting from...” price, using data from a large scale field experiment conducted on Holidu.com. My results indicate that raising “From” prices gives users a more accurate price estimate, but it negatively impacts outbound clicks and other engagement metrics. Taken together, the two projects aim to shed light on factors that influence consumer decision-making in an e-commerce setting, and the possible mechanisms underlying this influence.
|
5 |
Precision improvement for Mendelian Randomization / Zhu, Yineng / 23 January 2023
Mendelian Randomization (MR) methods use genetic variants as instrumental variables (IVs) to infer causal relationships between an exposure and an outcome, which overcomes the inability to infer such relationships in observational studies due to unobserved confounders. There are several MR methods, including the inverse variance weighted (IVW) method, which has been extended to deal with correlated IVs, and the median method, which provides consistent causal estimates in the presence of pleiotropy when fewer than half of the genetic variants are invalid IVs but assumes independent IVs. In this dissertation, we propose two new methods to improve precision for MR analysis. In the first chapter, we extend the median method to correlated IVs with the quasi-boots median method, which accounts for IV correlation in the standard error estimation using a quasi-bootstrap method. Simulation studies show that this method outperforms existing median methods under the correlated IVs setting with and without the presence of pleiotropic effects. In the second chapter, to overcome the lack of an effective solution to account for sample overlap in current IVW methods, we propose a new overall causal effect estimator by exploring the distribution of the estimator for individual IVs under the independent IVs setting, which we name the IVW-GH method. In the final chapter, we extend the IVW-GH method to correlated IVs. In simulation studies, the IVW-GH method outperforms the existing IVW methods under the one-sample setting for independent IVs and shows reasonable results for other settings. We apply these proposed methods to genome-wide association results from the Framingham Heart Study Offspring Study and the Million Veteran Program to identify potential causal relationships between a number of proteins and lipids. All the proposed methods are able to identify some proteins known to be related to lipids.
In addition, the quasi-boots median method is robust to pleiotropic effects in the real data application. Consequently, the newly proposed quasi-boots median method and IVW-GH method may provide additional insights for identifying causal relationships. / 2025-01-23T00:00:00Z
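As background for the IVW method mentioned above, here is a minimal sketch of the conventional fixed-effect IVW estimator from summary statistics. It assumes independent variants and ignores sample overlap, so it is the baseline the dissertation improves on rather than the proposed IVW-GH or quasi-boots median methods. The function name and the input values are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted causal estimate.

    Each variant j contributes the ratio estimate by_j / bx_j with
    weight bx_j**2 / se_j**2 (independent variants assumed).
    Returns (estimate, standard_error).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error


# Hypothetical summary statistics for three variants (illustration only).
bx = [0.1, 0.2, 0.4]       # variant-exposure associations
by = [0.05, 0.1, 0.2]      # variant-outcome associations
se_y = [0.01, 0.02, 0.04]  # standard errors of the outcome associations

estimate, std_error = ivw_estimate(bx, by, se_y)
print(estimate, std_error)  # estimate is approximately 0.5
```

Every variant here has the same ratio estimate of 0.5, so the weighted average is also about 0.5. With correlated variants or overlapping samples the weights above are no longer appropriate, which is the gap the abstract addresses.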
|
6 |
A conditional view of causality / Weinert, Friedel / January 2007
Causal inference is perhaps the most important form of reasoning in the sciences. A panoply of disciplines, ranging from epidemiology to biology, from econometrics to physics, make use of probability and statistics to infer causal relationships. The social and health sciences analyse population-level data using statistical methods to infer average causal relations. In diagnosis of disease, probabilistic statements are based on population-level causal knowledge combined with knowledge of a particular person's symptoms. For the physical sciences, the Salmon-Dowe account develops an analysis of causation based on the notion of process and interaction. In artificial intelligence, the development of graphical methods has lent impetus to a probabilistic analysis of causality. The biological sciences use probabilistic methods to look for evolutionary causes of the state of a current species and to look for genetic causal factors. This variegated situation raises at least two fundamental philosophical issues: about the relation between causality and probability, and about the interpretation of probability in causal analysis.
In this book we bring philosophers and scientists together to discuss the relation between causality and probability, and the applications of these concepts within the sciences.
|
7 |
Causal Network ANOVA and Tree Model Explainability / Zhongli Jiang (18848698) / 24 June 2024
In this dissertation, we present research results on two independent projects, one on analysis of variance of multiple causal networks and the other on feature-specific coefficients of determination in tree ensembles.
|
8 |
Bayesian Mixture Modeling Approaches for Intermediate Variables and Causal Inference / Schwartz, Scott Lee / January 2010
This thesis examines topics related to causal inference that involve intermediate variables, and uses Bayesian methodologies to advance analysis capabilities in these areas. First, joint modeling of outcome variables with intermediate variables is considered in the context of birthweight and censored gestational age analyses. The proposed methodology provides improved inference capabilities for birthweight and gestational age, avoids the post-treatment selection bias problems associated with analyses conditional on gestational age, and appropriately assesses the uncertainty associated with censored gestational age. Second, principal stratification methodology for settings where causal inference analysis requires appropriate adjustment of intermediate variables is extended to observational settings with binary treatments and binary intermediate variables. This is done by uncovering the structural pathways of unmeasured confounding affecting principal stratification analysis and directly incorporating them into a model-based sensitivity analysis methodology. Demonstration focuses on a study of the efficacy of influenza vaccination in elderly populations. Third, the flexibility, interpretability, and capability of principal stratification analyses for continuous intermediate variables are improved by replacing the current fully parametric methodologies with semiparametric Bayesian alternatives. This presentation is one of the first uses of nonparametric techniques in causal inference analysis, and opens a connection between these two fields. Demonstration focuses on two studies, one involving a cholesterol reduction drug, and one examining the effect of physical activity on cardiovascular disease as it relates to body mass index. / Dissertation
|
9 |
Sensitivity Analysis of Untestable Assumptions in Causal Inference / Lundin, Mathias / January 2011
This thesis contributes to the research field of causal inference, which is concerned with the effect of a treatment on an outcome. Many such effects cannot be estimated through randomised experiments. For example, the effect of higher education on future income needs to be estimated using observational data. In the estimation, assumptions are made to make individuals that get higher education comparable with those not getting higher education, to make the effect estimable. Another assumption often made in causal inference (both in randomised and nonrandomised studies) is that the treatment received by one individual has no effect on the outcome of others. If this assumption is not met, the meaning of the causal effect of the treatment may be unclear. In the first paper the effect of college choice on income is investigated using Swedish register data, by comparing graduates from old and new Swedish universities. A semiparametric method of estimation is used, thereby relaxing functional assumptions for the data. One assumption often made in causal inference in observational studies is that individuals in different treatment groups are comparable, given that a set of pretreatment variables have been adjusted for in the analysis. This so-called unconfoundedness assumption is in principle not possible to test and, therefore, in the second paper we propose a Bayesian sensitivity analysis of the unconfoundedness assumption. This analysis is then performed on the results from the first paper. In the third paper of the thesis, we study profile likelihood as a tool for semiparametric estimation of a causal effect of a treatment. A semiparametric version of the Bayesian sensitivity analysis of the unconfoundedness assumption proposed in Paper II is also performed using profile likelihood.
The last paper of the thesis is concerned with the estimation of direct and indirect causal effects of a treatment where interference between units is present, i.e., where the treatment of one individual affects the outcome of other individuals. We give unbiased estimators of these direct and indirect effects for situations where treatment probabilities vary between individuals. We also illustrate in a simulation study how direct and indirect causal effects can be estimated when treatment probabilities need to be estimated using background information on individuals.
|
10 |
Comparison of Methods for Estimating Longitudinal Indirect Effects / January 2018
Mediation analysis is used to investigate how an independent variable, X, is related to an outcome variable, Y, through a mediator variable, M (MacKinnon, 2008). If X represents a randomized intervention, it is difficult to make a cause-and-effect inference regarding indirect effects without making no-unmeasured-confounding assumptions using the potential outcomes framework (Holland, 1988; MacKinnon, 2008; Robins & Greenland, 1992; VanderWeele, 2015), using longitudinal data to determine the temporal order of M and Y (MacKinnon, 2008), or both. The goals of this dissertation were to (1) define all indirect and direct effects in a three-wave longitudinal mediation model using the causal mediation formula (Pearl, 2012), (2) analytically compare traditional estimators (ANCOVA, difference score, and residualized change score) to the potential outcomes-defined indirect effects, and (3) use a Monte Carlo simulation to compare the performance of regression and potential outcomes-based methods for estimating longitudinal indirect effects and apply the methods to an empirical dataset. The results of the causal mediation formula revealed that the potential outcomes definitions of indirect effects are equivalent to the product-of-coefficients estimators in a three-wave longitudinal mediation model with linear and additive relations. It was demonstrated with analytical comparisons that the ANCOVA, difference score, and residualized change score models’ estimates of two time-specific indirect effects differ as a function of the respective mediator-outcome relations at each time point. The traditional model that performed the best in terms of the evaluation criteria in the Monte Carlo study was the ANCOVA model, and the potential outcomes model that performed the best in terms of the evaluation criteria was sequential G-estimation. Implications and future directions are discussed. / Dissertation/Thesis / Doctoral Dissertation Psychology 2018
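The product-of-coefficients idea can be sketched for the simplest case: a single-mediator, cross-sectional model with linear and additive relations, not the three-wave longitudinal model studied in the dissertation. The coefficient a comes from regressing M on X, b is the coefficient of M when Y is regressed on X and M, and the indirect effect is a times b. The function name and the data are hypothetical and noise-free so that the result is easy to check.

```python
from statistics import mean


def indirect_effect(x, m, y):
    """Product-of-coefficients indirect effect a * b.

    a: OLS slope from regressing M on X.
    b: OLS coefficient of M from regressing Y on X and M.
    Returns (a, b, a * b).
    """
    mx, mm, my = mean(x), mean(m), mean(y)
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    # Two-predictor OLS solution for the coefficient of M.
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a, b, a * b


# Hypothetical data built so that M = 2*X + e and Y = 1*X + 3*M.
x = [0, 1, 0, 1, 0, 1]
m = [1, 1, 0, 2, -1, 3]
y = [3, 4, 0, 7, -3, 10]

a, b, ab = indirect_effect(x, m, y)
print(a, b, ab)  # approximately 2.0, 3.0, 6.0
```

With real data, a and b carry sampling error, and the causal reading of a times b relies on the no-unmeasured-confounding assumptions the abstract describes.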
|