Global ETD Search

1	Estimating Individual Causal Effects Lam, Patrick Kenneth 18 October 2013 Most empirical work focuses on the estimation of average treatment effects (ATE). In this dissertation, I argue for a different way of thinking about causal inference by estimating individual causal effects (ICEs). I argue that focusing on estimating ICEs allows for a more precise and clear understanding of causal inference, reconciles the difference between what the researcher is interested in and what the researcher estimates, allows the researcher to explore and discover treatment effect heterogeneity, bridges the quantitative-qualitative divide, and allows for easy estimation of any other causal estimand. / Government Political Science Statistics causal inference
2	Propensity Score for Causal Inference of Multiple and Multivalued Treatments Gu, Zirui 01 January 2016 Propensity score methods (PSM) that have been widely used to reduce selection bias in observational studies are restricted to a binary treatment. Imai and van Dyk extended PSM to estimate non-binary treatment effects using stratification with P-Function, and generalized inverse treatment probability weighting (GIPTW). However, propensity score (PS) matching methods on multiple treatments have received little attention, and existing generalized PSMs merely focused on estimates of main treatment effects but omitted potential interaction effects that are of essential interest in many studies. In this dissertation, I extend Rubin’s PS matching theory to general treatment regimens under the P-Function framework. From theory to practice, I propose an innovative distance measure that can summarize similarities among subjects in multiple treatment groups. Based on this distance measure I propose four generalized propensity score matching methodologies. The first two methods are extensions of nearest neighbor matching. I implemented Monte Carlo simulation studies to compare them with GIPTW and stratification on P-Function methods. The next two methods are extensions of the nearest neighbor caliper width matching and variable matching. I define the caliper width as the product of a weighted standard deviation of all possible pairwise distances between two treatment groups. I conduct a series of simulation studies to determine an optimal caliper width by searching for the lowest mean squared error of the average causal interaction effect. I further compare the methods using the optimal caliper width with other methods using simulations. Finally, I apply these methods to the National Medical Expenditure Survey data to examine the average causal main effect of duration and frequency of smoking as well as their interaction effect on annual medical expenditures. Using the proposed methods, researchers can apply regression models with specified interaction terms to the matched data and simultaneously obtain estimates of both main and interaction effects with improved statistical properties. Propensity Score Causal Inference Multiple and Multivalued Treatments
3	Applications of machine learning to agricultural land values: prediction and causal inference Er, Emrah January 1900 Doctor of Philosophy / Department of Agricultural Economics / Nathan P. Hendricks / This dissertation focuses on the prediction of agricultural land values and the effects of water rights on land values using machine learning algorithms and hedonic pricing methods. I predict agricultural land values with different machine learning algorithms, including ridge regression, least absolute shrinkage and selection operator, random forests, and extreme gradient boosting methods. To analyze the causal effects of water right seniority on agricultural land values, I use the double-selection LASSO technique. The second chapter presents the data used in the dissertation. A unique set of parcel sales from the Property Valuation Division of Kansas constitutes the backbone of the data used in the estimation. Along with parcel sales data, I collected detailed basis, water, tax, soil, weather, and urban influence data. This chapter provides a detailed explanation of various data sources and variable construction processes. The third chapter presents different machine learning models for irrigated agricultural land price predictions in Kansas. Researchers and policymakers use different models and data sets for price prediction. Recently developed machine learning methods have the power to improve the predictive ability of the models estimated. In this chapter I estimate several machine learning models for predicting the agricultural land values in Kansas. Results indicate that the predictive power of the machine learning methods is stronger than that of standard econometric methods. Median absolute error in extreme gradient boosting estimation is 0.1312 whereas it is 0.6528 in a simple OLS model. The fourth chapter examines whether water right seniority is capitalized into irrigated agricultural land values in Kansas. Using a unique data set of irrigated agricultural land sales, I analyze the causal effect of water right seniority on agricultural land values. A possible concern during the estimation of hedonic models is omitted variable bias, so I use double-selection LASSO regression and its variable selection properties to overcome it. I also estimate generalized additive models to analyze the nonlinearities that may exist. Results show that water rights have a positive impact on irrigated land prices in Kansas. An additional year of water right seniority causes irrigated land value to increase by nearly $17 per acre. Further analysis also suggests a nonlinear relationship between seniority and agricultural land prices. Land Values Machine Learning Prediction Causal Inference
4	Empirical studies of online markets: the impact of product page cues on consumer decisions Banerjee, Shrabastee 14 May 2021 The widespread expansion of online markets in the past decade poses several questions for platforms, firms and customers alike. An important dimension to be explored in this domain is the provision of information on e-commerce platforms - given the increasing ease with which product pages can be customized to include a vast variety of content, how do these pieces of information interact? Further, what are the specific channels through which this information eventually influences consumer decision-making? My dissertation is situated in this space, and aims to look at how consumers respond to various “cues” that are being introduced by e-commerce platforms which offer products or services that can be purchased online, and how these cues might eventually influence decision-making. In my first dissertation project, the cue I focus on is user generated content. More specifically, I study how the introduction of the Q&A technology (which enables customers to ask product-specific questions before purchase, and receive answers either from other customers or the platform itself) affects the more widely established reviews and ratings feature on e-commerce platforms. I find that the addition of Q&As leads to better matches between customers and products, higher customer satisfaction, and consequently higher ratings. My second project examines another cue that is common in online markets, which is the advertised reference price. My goal in this project is to examine how users react to a specific variant of such prices, namely the “Starting from...” price, using data from a large scale field experiment conducted on Holidu.com. My results indicate that raising “From” prices gives users a more accurate price estimate, but it negatively impacts outbound clicks and other engagement metrics. Taken together, the two projects aim to shed light on factors that influence consumer decision-making in an e-commerce setting, and the possible mechanisms underlying this influence. Marketing Causal inference Field experiment Online markets
5	Precision improvement for Mendelian Randomization Zhu, Yineng 23 January 2023 Mendelian Randomization (MR) methods use genetic variants as instrumental variables (IV) to infer causal relationships between an exposure and an outcome, which overcomes the inability to infer such a relationship in observational studies due to unobserved confounders. There are several MR methods, including the inverse variance weighted (IVW) method, which has been extended to deal with correlated IVs, and the median method, which provides consistent causal estimates in the presence of pleiotropy when less than half of the genetic variants are invalid IVs but assumes independent IVs. In this dissertation, we propose two new methods to improve precision for MR analysis. In the first chapter, we extend the median method to correlated IVs: the quasi-boots median method, which accounts for IV correlation in the standard error estimation using a quasi-bootstrap method. Simulation studies show that this method outperforms existing median methods under the correlated IVs setting with and without the presence of pleiotropic effects. In the second chapter, to overcome the lack of an effective solution to account for sample overlap in current IVW methods, we propose a new overall causal effect estimator by exploring the distribution of the estimator for individual IVs under the independent IVs setting, which we name the IVW-GH method. In the final chapter, we extend the IVW-GH method to correlated IVs. In simulation studies, the IVW-GH method outperforms the existing IVW methods under the one-sample setting for independent IVs and shows reasonable results for other settings. We apply these proposed methods to genome-wide association results from the Framingham Heart Study Offspring Study and the Million Veteran Program to identify potential causal relationships between a number of proteins and lipids. All the proposed methods are able to identify some proteins known to be related to lipids. In addition, the quasi-boots median method is robust to pleiotropic effects in the real data application. Consequently, the newly proposed quasi-boots median method and IVW-GH method may provide additional insights for identifying causal relationships. / 2025-01-23T00:00:00Z Biostatistics Causal inference Mendelian Randomization Pleiotropy
6	Causal Network ANOVA and Tree Model Explainability Zhongli Jiang (18848698) 24 June 2024 In this dissertation, we present research results on two independent projects, one on analysis of variance of multiple causal networks and the other on feature-specific coefficients of determination in tree ensembles. Statistics not elsewhere classified Causal Inference interpretability
7	A conditional view of causality Weinert, Friedel January 2007 Causal inference is perhaps the most important form of reasoning in the sciences. A panoply of disciplines, ranging from epidemiology to biology, from econometrics to physics, make use of probability and statistics to infer causal relationships. The social and health sciences analyse population-level data using statistical methods to infer average causal relations. In diagnosis of disease, probabilistic statements are based on population-level causal knowledge combined with knowledge of a particular person's symptoms. For the physical sciences, the Salmon-Dowe account develops an analysis of causation based on the notion of process and interaction. In artificial intelligence, the development of graphical methods has lent impetus to a probabilistic analysis of causality. The biological sciences use probabilistic methods to look for evolutionary causes of the state of a current species and to look for genetic causal factors. This variegated situation raises at least two fundamental philosophical issues: about the relation between causality and probability, and about the interpretation of probability in causal analysis. In this book we bring philosophers and scientists together to discuss the relation between causality and probability, and the applications of these concepts within the sciences. Causality ; Probability ; Causal inference ; Causal relationships
8	STOCHASTIC NEURAL NETWORK AND CAUSAL INFERENCE Yaxin Fang (17069563) 10 January 2025 Estimating causal effects from observational data has been challenging due to high-dimensional complex datasets and confounding biases. In this thesis, we try to tackle these issues by leveraging deep learning techniques, including sparse deep learning and stochastic neural networks, that have been developed in recent literature. With the advancement of data science, the collection of increasingly complex datasets has become commonplace. In such datasets, the data dimension can be extremely high, and the underlying data generation process can be unknown and highly nonlinear. As a result, the task of making causal inference with high-dimensional complex data has become a fundamental problem in many disciplines, such as medicine, econometrics, and social science. However, the existing methods for causal inference are frequently developed under the assumption that the data dimension is low or that the underlying data generation process is linear or approximately linear. To address these challenges, chapter 3 proposes a novel causal inference approach for dealing with high-dimensional complex data. By using sparse deep learning techniques, the proposed approach can address both the high dimensionality and unknown data generation process in a coherent way. Furthermore, the proposed approach can also be used when missing values are present in the datasets. Extensive numerical studies indicate that the proposed approach outperforms existing ones. One of the major challenges in causal inference with observational data is handling missing confounders. Latent variable modeling is a valid framework to address this challenge, but current approaches within the framework often suffer from consistency issues in causal effect estimation and are hard to extend to more complex application scenarios. To bridge this gap, in chapter 4, we propose a new latent variable modeling approach. It utilizes a stochastic neural network, where the latent variables are imputed as the outputs of hidden neurons using an adaptive stochastic gradient HMC algorithm. Causal inference is then conducted based on the imputed latent variables. Under mild conditions, the new approach provides a theoretical guarantee for the consistency of causal effect estimation. The new approach also serves as a versatile tool for modeling various causal relationships, leveraging the flexibility of the stochastic neural network in natural process modeling. We show that the new approach matches state-of-the-art performance on benchmarks for causal effect estimation and demonstrate its adaptability to proxy variable and multiple-cause scenarios. Computational statistics causal inference stochastic neural network
9	Bayesian Mixture Modeling Approaches for Intermediate Variables and Causal Inference Schwartz, Scott Lee January 2010 This thesis examines causal inference related topics involving intermediate variables, and uses Bayesian methodologies to advance analysis capabilities in these areas. First, joint modeling of outcome variables with intermediate variables is considered in the context of birthweight and censored gestational age analyses. The proposed methodology provides improved inference capabilities for birthweight and gestational age, avoids post-treatment selection bias problems associated with conditional on gestational age analyses, and appropriately assesses the uncertainty associated with censored gestational age. Second, principal stratification methodology for settings where causal inference analysis requires appropriate adjustment of intermediate variables is extended to observational settings with binary treatments and binary intermediate variables. This is done by uncovering the structural pathways of unmeasured confounding affecting principal stratification analysis and directly incorporating them into a model-based sensitivity analysis methodology. Demonstration focuses on a study of the efficacy of influenza vaccination in elderly populations. Third, flexibility, interpretability, and capability of principal stratification analyses for continuous intermediate variables are improved by replacing the current fully parametric methodologies with semiparametric Bayesian alternatives. This presentation is one of the first uses of nonparametric techniques in causal inference analysis, and opens a connection between these two fields. Demonstration focuses on two studies, one involving a cholesterol reduction drug, and one examining the effect of physical activity on cardiovascular disease as it relates to body mass index. / Dissertation Statistics Bayesian statistics Causal inference Intermediate variables Principal stratification
10	Sensitivity Analysis of Untestable Assumptions in Causal Inference Lundin, Mathias January 2011 This thesis contributes to the research field of causal inference, which is concerned with the effect of a treatment on an outcome. Many such effects cannot be estimated through randomised experiments. For example, the effect of higher education on future income needs to be estimated using observational data. In the estimation, assumptions are made to make individuals that get higher education comparable with those not getting higher education, to make the effect estimable. Another assumption often made in causal inference (both in randomised and nonrandomised studies) is that the treatment received by one individual has no effect on the outcome of others. If this assumption is not met, the meaning of the causal effect of the treatment may be unclear. In the first paper the effect of college choice on income is investigated using Swedish register data, by comparing graduates from old and new Swedish universities. A semiparametric method of estimation is used, thereby relaxing functional assumptions for the data. One assumption often made in causal inference in observational studies is that individuals in different treatment groups are comparable, given that a set of pretreatment variables have been adjusted for in the analysis. This so-called unconfoundedness assumption is in principle not possible to test and, therefore, in the second paper we propose a Bayesian sensitivity analysis of the unconfoundedness assumption. This analysis is then performed on the results from the first paper. In the third paper of the thesis, we study profile likelihood as a tool for semiparametric estimation of a causal effect of a treatment. A semiparametric version of the Bayesian sensitivity analysis of the unconfoundedness assumption proposed in Paper II is also performed using profile likelihood. The last paper of the thesis is concerned with the estimation of direct and indirect causal effects of a treatment where interference between units is present, i.e., where the treatment of one individual affects the outcome of other individuals. We give unbiased estimators of these direct and indirect effects for situations where treatment probabilities vary between individuals. We also illustrate in a simulation study how direct and indirect causal effects can be estimated when treatment probabilities need to be estimated using background information on individuals. Observational studies semiparametric regression unconfoundedness Causal inference Statistics Statistik
