Global ETD Search

1	Predictor Selection in Linear Regression: L1 regularization of a subset of parameters and Comparison of L1 regularization and stepwise selection Hu, Qing 11 May 2007 (has links) Background: Feature selection, also known as variable selection, is a technique that selects a subset from a large collection of possible predictors to improve the prediction accuracy of a regression model. The first objective of this project is to investigate for which data structures the LASSO outperforms the forward stepwise method. The second objective is to develop a feature selection method, Feature Selection by L1 Regularization of Subset of Parameters (LRSP), which selects the model by combining prior knowledge about the inclusion of some covariates, if any, with the information collected from the data. Mathematically, LRSP minimizes the residual sum of squares subject to the sum of the absolute values of a subset of the coefficients being less than a constant. In this project, LRSP is compared with LASSO, Forward Selection, and Ordinary Least Squares to investigate their relative performance for different data structures. Results: Simulation results indicate that, for a moderate number of small effects, forward selection outperforms LASSO in both prediction accuracy and variable selection performance when the variance of the model error term is smaller, regardless of the correlations among the covariates; forward selection also performs better in variable selection when the variance of the error term is larger but the correlations among the covariates are smaller. LRSP was shown to be an efficient method for problems where prior knowledge about the inclusion of covariates is available, and it can also be applied to problems with nuisance parameters, such as linear discriminant analysis. L1 regularization Lasso Feature selection Covariate selection Regression analysis
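The LRSP idea described in entry 1, an L1 penalty applied to only a subset of the coefficients, can be illustrated with a small sketch. This is not the thesis's implementation: it assumes the penalized (Lagrangian) form of the problem rather than the constrained form stated in the abstract, it uses a generic proximal-gradient loop as the solver, and the names `lrsp_fit`, `penalized`, and `lam` are hypothetical.

```python
import numpy as np

def lrsp_fit(X, y, penalized, lam, n_iter=5000):
    """Minimize (1/2n)*||y - X b||^2 + lam * sum_{j in penalized} |b_j|.

    Hypothetical sketch: coefficients outside `penalized` are left
    unpenalized, mimicking covariates that prior knowledge says to include.
    """
    n, p = X.shape
    mask = np.zeros(p, dtype=bool)
    mask[np.asarray(penalized, dtype=int)] = True
    # Step size 1/L, where L = sigma_max(X)^2 / n bounds the gradient's
    # Lipschitz constant for the least-squares term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # Soft-threshold only the penalized coordinates.
        z[mask] = np.sign(z[mask]) * np.maximum(np.abs(z[mask]) - step * lam, 0.0)
        beta = z
    return beta

# Usage on simulated data (all values here are illustrative assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, 0.0, 0.0, 1.5, 0.0]) + 0.1 * rng.normal(size=200)
# Column 0 is "known" to belong in the model, so it is not penalized.
beta_hat = lrsp_fit(X, y, penalized=[1, 2, 3, 4], lam=0.05)
print(np.round(beta_hat, 3))
```

With `lam=0` the sketch reduces to ordinary least squares, and with `penalized` covering every column it reduces to the usual lasso objective, which mirrors the comparison set named in the abstract.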
2	Covariate selection and propensity score specification in causal inference Waernbaum, Ingeborg January 2008 (has links) This thesis makes contributions to the statistical research field of causal inference in observational studies. The results obtained are directly applicable in many scientific fields where effects of treatments are investigated and yet controlled experiments are difficult or impossible to implement. In the first paper we define a partially specified directed acyclic graph (DAG) describing the independence structure of the variables under study. Using the DAG we show that, given that unconfoundedness holds, we can use the observed data to select minimal sets of covariates to control for. General covariate selection algorithms are proposed to target the defined minimal subsets. The results of the first paper are generalized in Paper II to include the presence of unobserved covariates. Moreover, the identification assumptions from the first paper are relaxed. To implement the covariate selection without parametric assumptions, we propose in the third paper the use of a model-free variable selection method from the framework of sufficient dimension reduction. The performance of the proposed selection methods is investigated by simulation. Additionally, we study finite sample properties of treatment effect estimators based on the selected covariate sets. In Paper IV we investigate misspecifications of parametric models of a scalar summary of the covariates, the propensity score. Motivated by common model specification strategies, we describe misspecifications of parametric models for which unbiased estimators of the treatment effect are available. Consequences of the misspecification for the efficiency of treatment effect estimators are also studied. Covariate selection graphical models matching observational studies treatment effects unconfoundedness Statistics Statistik
4	Causal inference and case-control studies with applications related to childhood diabetes / Kausal inferens och fall-kontroll studier med applikationer inom barndiabetes Persson, Emma January 2014 (has links) This thesis contributes to the research area of causal inference, where estimation of the effect of a treatment on an outcome of interest is the main objective. Some aspects of the estimation of average causal effects in observational studies in general, and case-control studies in particular, are explored. An important part of estimating causal effects in an observational study is to control for covariates. The first paper of this thesis concerns the selection of minimal covariate sets sufficient for unconfoundedness of the treatment assignment. A data-driven implementation of two covariate selection algorithms is proposed and evaluated. The case-control design is a common sampling scheme in epidemiology and in the study of rare events. In the second paper we study estimators of the marginal causal odds ratio in matched and independent case-control designs. Estimators that, under a logistic regression model, utilize information about the known prevalence of being a case are examined and compared through simulations. The third paper investigates the particular situation where case-control sampled data are reused to estimate the effect of the case-defining event on an outcome of interest. The consequences of ignoring the design when estimating the average causal effect are discussed, and a design-weighted matching estimator is proposed. The performance of the estimator is evaluated in simulation experiments, both when matching on the covariates directly and when matching on the propensity score. The last paper studies the effect of type 1 diabetes mellitus (T1DM) on school achievements using data from the Swedish Childhood Diabetes Register, a population-based incidence register. We apply theoretical results from the second and third papers in the estimation of the average causal effect within the T1DM population. A matching estimator that accounts for the matched case-control design is used. covariate selection design-weighted estimation marginal effect matching register study treatment effect type 1 diabetes mellitus
5	Selection of Sufficient Adjustment Sets for Causal Inference : A Comparison of Algorithms and Evaluation Metrics for Structure Learning Widenfalk, Agnes January 2022 (has links) Causal graphs are essential tools to find sufficient adjustment sets in observational studies. Subject matter experts can sometimes specify these graphs, but often the dependence structure of the variables, and thus the graph, is unknown even to them. In such cases, structure learning algorithms can be used to learn the graph. Early structure learning algorithms were implemented for either exclusively discrete or continuous variables. Recently, methods have been developed for structure learning on mixed data, including both continuous and discrete variables. In this thesis, three structure learning algorithms for mixed data are evaluated through a simulation study. The evaluation is based on graph recovery metrics and the ability to find a sufficient adjustment set for the average treatment effect (ATE). Depending on the intended purpose of the learned graph, the different evaluation metrics should be given varying attention. It is also concluded that the pcalg+micd algorithm learns graphs such that it is possible to find a sufficient adjustment set for the ATE in more than 99% of the cases. Moreover, the learned graphs from pcalg+micd are the most accurate compared to the true graph using the largest sample size. Causal Inference Structure Learning PC-algorithm Covariate Selection Causal Graphs Probability Theory and Statistics Sannolikhetsteori och statistik
6	Covariate Model Building in Nonlinear Mixed Effects Models Ribbing, Jakob January 2007 (has links) Population pharmacokinetic-pharmacodynamic (PK-PD) models can be fitted using nonlinear mixed effects modelling (NONMEM). This is an efficient way of learning about drugs and diseases from data collected in clinical trials. Identifying covariates which explain differences between patients is important to discover patient subpopulations at risk of sub-therapeutic or toxic effects and for treatment individualization. Stepwise covariate modelling (SCM) is commonly used to this end. The aim of the current thesis work was to evaluate SCM and to develop alternative approaches. A further aim was to develop a mechanistic PK-PD model describing fasting plasma glucose, fasting insulin, insulin sensitivity and beta-cell mass. The lasso is a penalized estimation method that performs covariate selection simultaneously with shrinkage estimation. The lasso was implemented within NONMEM as an alternative to SCM and is discussed in comparison with that method. Further, various ways of incorporating information and propagating knowledge from previous studies into an analysis were investigated. In order to compare the different approaches, investigations were made under varying, replicated conditions. In the course of the investigations, more than one million NONMEM analyses were performed on simulated data. Due to selection bias, SCM performed poorly when analysing small datasets or rare subgroups. In these situations, the lasso method in NONMEM performed better, was faster, and additionally validated the covariate model. Alternatively, the performance of SCM can be improved by propagating knowledge or incorporating information from previously analysed studies and by population optimal design. A model was also developed on a physiological/mechanistic basis to fit data from three phase II/III studies on the investigational drug tesaglitazar. This model described fasting glucose and insulin levels well, despite heterogeneous patient groups ranging from non-diabetic insulin resistant subjects to patients with advanced diabetes. The model predictions of beta-cell mass and insulin sensitivity agreed well with values in the literature. Pharmacokinetics/Pharmacotherapy Pharmacokinetics Pharmacodynamics Modeling Covariate selection Stepwise selection Covariate analysis Methodology Model validation Model evaluation Type-2 diabetes Beta-cell function Meta analysis Cross-validation Pharmacometrics ED optimization Farmakokinetik/Farmakoterapi
