31.
Probabilistic causal processes. Katz, Jonathan Richard. January 1982.
Many theorists take causality to be the heart of scientific explanation. If we couple this view with the idea, again common enough, that causes are something like sufficient conditions for their effects, then we have raised a difficulty for any theory of scientific explanation that admits "final" (genuine) statistical explanations, for a statistical explanation must allow that the factors relevant to the event being explained need not determine that event, but only make it probable to some degree (other than zero or one). If we do not wish to give up the possibility of genuine statistical explanations, we must either (1) give up this central causal intuition about explanation, or (2) admit that causes act as something other than sufficient conditions, say in a probabilistic manner. The second option is, of course, little help unless we can clearly explicate the notion of a probabilistic cause, one of the tasks I have attempted to accomplish in this thesis.
The general structure of this effort is in three parts. In the first chapter I develop the basic concepts underlying the notion of a probabilistic cause, following closely the development of this idea by Wesley Salmon. On this view, scientific explanation consists in tracing causal influence via causes understood as processes, and in providing the probability relations among events that are the products of causal interactions. In the second chapter I develop the idea of causal processes further by comparing this notion with the more traditional analyses of causation, the regularity and counterfactual theories. There I show how the process ontology very naturally overcomes problems in both of these views. I also note the advantages this conception of causes holds over these received views with respect to the problem of determinism. In the third and final chapter I discuss, and attempt to overcome, difficulties that are specific to the notion of a probabilistic causal process. / Faculty of Arts / Department of Philosophy / Graduate
32.
A theory of physical probability. Johns, Richard.
It is now common to hold that causes do not always (and perhaps never) determine their
effects, and indeed theories of "probabilistic causation" abound. The basic idea of these
theories is that C causes E just in case C and E both occur, and the chance of E would have
been lower than it is had C not occurred. The problems with these accounts are that (i) the
notion of chance remains primitive, and (ii) this account of causation does not coincide with the
intuitive notion of causation as ontological support.
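In symbols, the probability-raising condition just described amounts to something like the following (a schematic rendering of that condition, not Johns's own formalism):

  \[
    C \text{ causes } E
    \quad\Longleftrightarrow\quad
    C \text{ and } E \text{ both occur, and } \operatorname{ch}(E) > \operatorname{ch}_{\neg C}(E),
  \]

where $\operatorname{ch}_{\neg C}(E)$ denotes the chance $E$ would have had, had $C$ not occurred.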
Turning things around, I offer an analysis of chance in terms of causation, called the
causal theory of chance. The chance of an event E is the degree to which it is determined by its
causes. Thus chance events have full causal pedigrees, just like determined events; they are not
"events from nowhere". I hold that, for stochastic as well as for deterministic processes, the
actual history of a system is caused by its dynamical properties (represented by the Lagrangian)
and the boundary condition. A system is stochastic if (a description of) the actual history is not
fully determined by maximal knowledge of these causes, i.e. it is not logically entailed by them.
If chance involves partial determination, and determination is logical entailment, then
there must be such a thing as partial entailment, or logical probability. To make the notion of
logical probability plausible, in the face of current opposition to it, I offer a new account of
logical probability which meets objections levelled at the previous accounts of Keynes and
Carnap.
The causal theory of chance, unlike its competitors, satisfies all of the following criteria:
(i) Chance is defined for single events.
(ii) Chance supervenes on the physical properties of the system in question.
(iii) Chance is a probability function, i.e. a normalised measure.
(iv) Knowledge of the chance of an event warrants a numerically equal degree of belief, i.e.
Miller's Principle (stated after this list) can be derived within the theory.
(v) Chance is empirically accessible, within any given range of error, by measuring relative
frequencies.
(vi) With an additional assumption, the theory entails Reichenbach's Common Cause Principle
(CCP).
(vii) The theory enables us to make sense of probabilities in quantum mechanics.
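Criterion (iv) refers to Miller's Principle. In its standard form (given here for orientation; the derivation itself is part of the theory), it says that an agent's credence should match the known chance:

  \[
    \operatorname{Cr}\bigl(E \mid \operatorname{ch}(E) = x\bigr) = x,
  \]

where Cr is the agent's degree of belief and ch the physical chance.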
The assumption used to prove the CCP is that the state of a system represents complete
information, so that the state of a composite system "factorises" into a logical conjunction of
states for the sub-systems. To make sense of quantum mechanics, particularly the EPR
experiment, we drop this assumption. In this case, the EPR criterion of reality is false. It states
that if an event E is predictable, and locally caused, then it is locally predictable. This fails
when maximal information about a pair of systems does not factorise, leading to a non-locality
of knowledge. / Faculty of Arts / Department of Philosophy / Graduate
33.
Revealing Sparse Signals in Functional Data
My dissertation presents a novel statistical method to estimate a sparse signal in functional data and to construct confidence bands for the signal. Existing methods for inference on the mean function in this framework include smoothing splines and kernel estimates. Our methodology involves thresholding a least squares estimator, and the threshold level depends on the sources of variability that exist in this type of data. The proposed estimation method and the confidence bands successfully adapt to the sparsity of the signal. We present supporting evidence through simulations and applications to real datasets. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Fall Semester, 2008. / October 24, 2008. / Sparse Signal, Functional Data Analysis / Includes bibliographical references. / Florentina Bunea, Professor Co-Directing Dissertation; Marten Wegkamp, Professor Co-Directing Dissertation; Joshua Gert, Outside Committee Member; Xufeng Niu, Committee Member; Myles Hollander, Committee Member.
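As a rough illustration of the thresholding idea (a minimal sketch only: the function below is illustrative, and its threshold is a generic universal-threshold-style cutoff rather than the dissertation's variance decomposition):

    import numpy as np

    def threshold_mean_estimate(Y):
        """Hard-threshold the pointwise least-squares (sample-mean) estimate of the
        mean function of n noisy curves Y (an n x m array observed on a common grid).
        The threshold used here is only a generic, universal-threshold-style cutoff
        based on the estimated standard error of the mean."""
        n, m = Y.shape
        mu_hat = Y.mean(axis=0)                  # least-squares estimate at each design point
        se = Y.std(axis=0, ddof=1) / np.sqrt(n)  # pointwise standard error of the mean
        lam = se * np.sqrt(2.0 * np.log(m))      # illustrative threshold level
        return np.where(np.abs(mu_hat) > lam, mu_hat, 0.0), mu_hat, lam

    # Toy example: a signal that is zero except on a short interval
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 200)
    mu = np.where((t > 0.4) & (t < 0.5), 1.0, 0.0)
    Y = mu + rng.normal(scale=0.5, size=(50, t.size))
    sparse_mu, mu_hat, lam = threshold_mean_estimate(Y)
    print(f"grid points kept as signal: {(sparse_mu != 0).sum()} of {t.size}")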
34.
Modeling Differential Item Functioning (DIF) Using Multilevel Logistic Regression Models: A Bayesian Perspective
A multilevel logistic regression approach provides an attractive and practical alternative for the study of Differential Item Functioning (DIF). It is not only useful for identifying items with DIF but also for explaining the presence of DIF. Kamata and Binici (2003) first attempted to identify group unit characteristic variables explaining the variation of DIF by using hierarchical generalized linear models. Their models were implemented by the HLM-5 software, which uses the penalized or predictive quasi-likelihood (PQL) method. They found that the variance estimates produced by HLM-5 for the level 3 parameters are substantially negatively biased. This study extends their work by using a Bayesian approach to obtain more accurate parameter estimates. Two different approaches to modeling the DIF will be presented. These are referred to as the relative and mixture distribution approach, respectively. The relative approach measures the DIF of a particular item relative to the mean overall DIF for all items in the test. The mixture distribution approach treats the DIF as independent values drawn from a distribution which is a mixture of a normal distribution and a discrete distribution concentrated at zero. A simulation study is presented to assess the adequacy of the proposed models. This work also describes and studies models which allow the DIF to vary at level 3 (from school to school). In an example using real data, it is shown how the models can be applied to the identification of items with DIF and the explanation of the source of the DIF. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester, 2005. / April 1, 2005. / Bayesian, Multilevel Logistic Regression, DIF / Includes bibliographical references. / Fred W. Huffer, Professor Co-Directing Thesis; Akihito Kamata, Professor Co-Directing Thesis; Richard Tate, Outside Committee Member; Xu-Feng Niu, Committee Member; Daniel McGee, Committee Member.
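A generic two-level form of the kind of model being described (a sketch of a Rasch-type DIF specification, not necessarily the exact parameterization used in the dissertation) is

  \[
    \operatorname{logit} P(y_{ij} = 1 \mid \theta_j) \;=\; \theta_j - b_i + \gamma_i G_j ,
  \]

where $y_{ij}$ is person $j$'s response to item $i$, $\theta_j$ the person ability, $b_i$ the item difficulty, $G_j$ a focal/reference group indicator, and $\gamma_i$ the DIF of item $i$. In these terms, the relative approach models each $\gamma_i$ relative to the mean DIF across items, the mixture distribution approach gives each $\gamma_i$ a prior mixing a normal component with a point mass at zero, and the level-3 extension lets the $\gamma_i$ vary from school to school.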
35.
Time Scales in Epidemiological Analysis
The Cox proportional hazards model is routinely used to model the time until an event of interest. Two time scales are used in practice: follow-up time and chronological age. The former is the most frequently used time scale in both clinical studies and longitudinal observational studies. However, there is no general consensus about which time scale is best. In recent years, papers have appeared arguing for using chronological age as the time scale, either with or without adjusting for entry age. It has also been asserted that if the cumulative baseline hazard is exponential, or if the age at entry is independent of the covariates, the two models are equivalent. Our studies do not satisfy these two conditions in general. We found that the true factor that makes the models perform significantly differently is the variability in the age at entry. If there is no variability in the entry age, the time scales do not matter and both models estimate exactly the same coefficients. As the variability increases, the models disagree with each other. We also computed the optimum time scale proposed by Oakes and utilized it in the Cox model. Both our empirical and simulation studies show that the follow-up time scale model using age at entry as a covariate is better than the chronological-age and Oakes time scale models. This finding is illustrated with two examples using data from the Diverse Populations Collaboration. Based on our findings, we recommend using follow-up time as the time scale for epidemiological analysis. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Fall Semester, 2009. / August 14, 2009. / Time Scale, Baseline Age, Time-to-event, Left Truncation, Profile Likelihood, Baseline Hazard / Includes bibliographical references. / Daniel L. McGee, Sr., Professor Co-Directing Dissertation; Eric Chicken, Professor Co-Directing Dissertation; Elwood Carlson, Outside Committee Member; Debajyoti Sinha, Committee Member.
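A minimal sketch of fitting the two time scales (hypothetical file and column names; assumes a recent version of the lifelines package, where CoxPHFitter accepts an entry_col argument for left truncation and exposes fitted coefficients via params_):

    import pandas as pd
    from lifelines import CoxPHFitter

    # Hypothetical cohort file with columns: age_entry, age_exit, event, sbp
    df = pd.read_csv("cohort.csv")
    df["followup"] = df["age_exit"] - df["age_entry"]

    # Time scale 1: follow-up time, with age at entry included as a covariate
    cph_followup = CoxPHFitter()
    cph_followup.fit(df[["followup", "event", "sbp", "age_entry"]],
                     duration_col="followup", event_col="event")

    # Time scale 2: chronological (attained) age, left-truncated at the entry age
    cph_age = CoxPHFitter()
    cph_age.fit(df[["age_entry", "age_exit", "event", "sbp"]],
                duration_col="age_exit", event_col="event", entry_col="age_entry")

    # With little variability in age_entry the two coefficients should agree closely
    print(cph_followup.params_["sbp"], cph_age.params_["sbp"])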
36.
New Semiparametric Methods for Recurrent Events Data
Recurrent events data arise in all areas of biomedical research. We present a model for recurrent events data with the same link for the intensity and mean functions. Simple interpretations of the covariate effects on both the intensity and mean functions lead to a better understanding of the covariate effects on the recurrent events process. We use partial likelihood and empirical Bayes methods for inference and provide theoretical justifications for, as well as relationships between, these methods. We also show the asymptotic properties of the empirical Bayes estimators. We illustrate the computational convenience and implementation of our methods with the analysis of a heart transplant study. We also propose an additive regression model and an associated empirical Bayes method for the risk of a new event given the history of the recurrent events. Both the cumulative mean and rate functions have closed-form expressions for our model. Our inference method for the semiparametric model is based on maximizing a finite-dimensional integrated likelihood obtained by integrating over the nonparametric cumulative baseline hazard function. Our method can accommodate time-varying covariates and is easier to implement computationally than iterative-algorithm-based full Bayes methods. The asymptotic properties of our estimates provide large-sample justification from a frequentist standpoint. We apply our method to a study of heart transplant patients to illustrate the computational convenience and other advantages of our method. / A Thesis submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester, 2011. / March 25, 2011. / Additive Intensity Model, Cumulative Mean Function, Intensity Function, Empirical Bayes Methods, Recurrent Events Data / Includes bibliographical references. / Debajyoti Sinha, Professor Directing Thesis; Isaac W. Eberstein, University Representative; Dan McGee, Committee Member; Xufeng Niu, Committee Member.
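For orientation, one multiplicative specification consistent with "the same link for the intensity and mean functions" (a generic Poisson-type sketch, not necessarily the dissertation's exact model) is

  \[
    \lambda_i(t \mid Z_i) = \lambda_0(t)\,\exp(\beta^{\top} Z_i), \qquad
    \mu_i(t) = \mathrm{E}[N_i(t) \mid Z_i] = \Lambda_0(t)\,\exp(\beta^{\top} Z_i), \quad
    \Lambda_0(t) = \int_0^t \lambda_0(s)\,ds,
  \]

so that the same coefficients $\beta$ carry the covariate effects on both the intensity and the mean of the counting process $N_i(t)$.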
37.
Quasi-3D Statistical Inversion of Oceanographic Tracer Data
We perform a quasi-3D Bayesian inversion of oceanographic tracer data from the South Atlantic Ocean. Initially we consider one active neutral density layer with an upper and lower boundary. The available hydrographic data are linked to model parameters (water velocities, diffusion coefficients) via a 3D advection-diffusion equation. A robust solution to the inverse problem can be attained by introducing prior information about the parameters and modeling the observation error. This approach estimates both horizontal and vertical flow as well as diffusion coefficients. We find a system of alternating zonal jets at the depths of the North Atlantic Deep Water, consistent with direct measurements of flow and with concentration maps. A uniqueness analysis of our model is performed in terms of the oxygen consumption rate. The vertical mixing coefficient bears some relation to the bottom topography even though topography is not incorporated into our model. We extend the method to a multi-layer model, using thermal wind relations weakly and in a local fashion (as opposed to integrating over the entire water column) to connect layers vertically. Results suggest that the estimated deep zonal jets extend vertically, with a clear depth-dependent structure. The vertical structure of the flow field is modified by the tracer fields relative to the structure set a priori by thermal wind. Our estimates are consistent with observed flow at the depths of the Antarctic Intermediate Water; at still shallower depths, above the layers considered here, the subtropical gyre is a significant feature of the horizontal flow. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Summer Semester, 2006. / May 12, 2006. / Inverse Problem, Bayesian Approach, Advection-Diffusion, Thermal Wind / Includes bibliographical references. / Kevin Speer, Professor Co-Directing Dissertation; Marten Wegkamp, Professor Co-Directing Dissertation; Louis St. Laurent, Outside Committee Member; Fred Huffer, Committee Member; Xufeng Niu, Committee Member.
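The forward model referred to above is a steady advection-diffusion balance; in generic form (the dissertation's discretized, layered version will differ in detail),

  \[
    \mathbf{u} \cdot \nabla C \;=\; \nabla \cdot \left( \mathbf{K}\, \nabla C \right) + q,
  \]

where $C$ is a tracer concentration, $\mathbf{u}$ the velocity field, $\mathbf{K}$ the (horizontal and vertical) diffusion coefficients, and $q$ a source/sink term such as the oxygen consumption rate.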
38.
Bayesian Dynamic Survival Models for Longitudinal Aging Data
In this study we examine Bayesian Dynamic Survival Models, which are time-varying coefficient models viewed from a Bayesian perspective, and their applications in the aging setting. The specific questions we are interested in are: Does the relative importance of characteristics measured at a particular age, such as blood pressure, smoking, and body weight, with respect to heart disease or death change as people age? If it does, how can we model the change? And how does the change affect the analysis results if fixed-effect models are applied? In the epidemiological and statistical literature, the relationship between a risk factor and the risk of an event is often described in terms of the numerical contribution of the risk factor to the total risk within a follow-up period, using methods such as contingency tables and logistic regression models. With the development of survival analysis, another method, the Proportional Hazards Model, has become more popular. This model describes the relationship between a covariate and risk within a follow-up period as a process, under the assumption that the hazard ratio of the covariate is fixed during the follow-up period. Neither the earlier methods nor the Proportional Hazards Model allows the effect of a covariate to change flexibly with time. In this study, we investigate some classic epidemiological relationships using methods that allow coefficients to change with time, and compare our results with those found in the literature. After describing previous work based on multiple logistic regression or discriminant function analysis, we summarize different methods for estimating time-varying coefficient survival models, developed specifically for situations in which the proportional hazards assumption is violated. We focus on the Bayesian Dynamic Survival Model because its flexibility and Bayesian structure fit our study goals. There are two estimation methods for Bayesian Dynamic Survival Models: the Linear Bayesian Estimation (LBE) method and the Markov Chain Monte Carlo (MCMC) sampling method. The LBE method is simpler, faster, and more flexible, but it requires the specification of some parameters that usually are unknown. The MCMC method gets around the difficulty of specifying parameters, but is much more computationally intensive. We use a simulation study to investigate the performance of these two methods and provide suggestions on how to use them effectively in applications. The Bayesian Dynamic Survival Model is applied to the Framingham Heart Study to investigate the time-varying effects of covariates such as gender, age, smoking, and SBP (Systolic Blood Pressure) with respect to death. We also examine the changing relationship between BMI (Body Mass Index) and all-cause mortality, and suggest that some of the heterogeneity observed in the results found in the literature is likely to be a consequence of using fixed-effect models to describe a time-varying relationship. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Summer Semester, 2007. / May 7, 2005. / Bayesian Analysis, Time-Varying Coefficient Model, Survival Analysis / Includes bibliographical references. / Daniel L. McGee, Professor Co-Directing Dissertation; Xufeng Niu, Professor Co-Directing Dissertation; Suzanne B. Johnson, Outside Committee Member; Fred W. Huffer, Committee Member.
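One standard dynamic specification of the kind described (a sketch of the general class, with the random-walk evolution shown as an illustrative assumed prior, not necessarily the exact model fitted here) lets the log-hazard coefficients evolve over a grid of time points:

  \[
    \lambda(t \mid x) = \lambda_0(t)\, \exp\{\beta(t)^{\top} x\}, \qquad
    \beta(t_{k+1}) = \beta(t_k) + \epsilon_k, \quad \epsilon_k \sim N(0, W_k),
  \]

so a fixed-effect (proportional hazards) analysis corresponds to forcing $\beta(t)$ to be constant, and LBE or MCMC is used to estimate the sequence $\beta(t_k)$.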
39.
Investigating the Categories for Cholesterol and Blood Pressure for Risk Assessment of Death Due to Coronary Heart Disease
Many characteristics used to predict death due to coronary heart disease are measured on a continuous scale. These characteristics, however, are often categorized for clinical use and to aid in treatment decisions. We derive a systematic approach for determining the best categorizations of systolic blood pressure and cholesterol level for identifying individuals at high risk of death due to coronary heart disease, and we compare these data-derived categories to those in common usage. Whatever categories are chosen, they should allow physicians to accurately estimate the probability of surviving coronary heart disease until some time t. The best categories will be those that provide the most accurate prediction of an individual's risk of dying by t. The approach used to determine these categories is a version of Classification and Regression Trees (CART) that can be applied to censored survival data. The major goals of this dissertation are to obtain data-derived categories for risk assessment, to compare these categories to the ones already recommended in the medical community, and to assess the performance of these categories in predicting survival probabilities. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Fall Semester, 2005. / September 9, 2005. / Kernel Density Estimate, Coronary Heart Disease, CART, Relative Risk Trees, Tree Comparison / Includes bibliographical references. / Daniel McGee, Sr., Professor Directing Dissertation; Myra Hurt, Outside Committee Member; Fred Huffer, Committee Member; Xufeng Niu, Committee Member.
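A minimal sketch of the evaluation step implied above, estimating the probability of surviving to time t within candidate categories (hypothetical file and column names; the clinical-style cutpoints below merely stand in for the cutpoints a relative-risk tree would derive from the data, and the lifelines package is assumed available):

    import pandas as pd
    from lifelines import KaplanMeierFitter

    # Hypothetical cohort file with columns: sbp, time (years of follow-up), chd_death (0/1)
    df = pd.read_csv("cohort.csv")

    # Candidate categorization of systolic blood pressure (fixed clinical-style cutpoints;
    # a censored-data tree method would instead search for data-derived cutpoints)
    bins = [0, 120, 140, 160, float("inf")]
    labels = ["<120", "120-139", "140-159", ">=160"]
    df["sbp_cat"] = pd.cut(df["sbp"], bins=bins, labels=labels, right=False)

    t_star = 10  # estimate P(no CHD death by 10 years) within each category
    for cat, grp in df.groupby("sbp_cat", observed=True):
        km = KaplanMeierFitter().fit(grp["time"], grp["chd_death"])
        print(cat, float(km.predict(t_star)))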
40.
Discrimination and Calibration of Prognostic Survival Models
Clinicians employ prognostic survival models for diseases such as coronary heart disease and cancer to inform patients about risks, treatments, and clinical decisions (Altman and Royston 2000). These prognostic models are not useful unless they are valid in the population to which they are applied. There are no generally accepted algorithms for assessing the validity of an external survival model in a new population. Researchers often invoke measures of predictive accuracy, the degree to which predicted outcomes match observed outcomes (Justice et al. 1999). One component of predictive accuracy is discrimination, the ability of the model to correctly rank the individuals in the sample by risk. A common measure of discrimination for prognostic survival models is the concordance index, also called the c-statistic. We utilize the concordance index to determine the discrimination of Framingham-based Cox and Log-logistic models of coronary heart disease (CHD) death in cohorts from the Diverse Populations Collaboration, a collection of studies that encompasses many ethnic, geographic, and socioeconomic groups. Pencina and D'Agostino presented a confidence interval for the concordance index when assessing the discrimination of an external prognostic model. We perform simulations to determine the robustness of their confidence interval when measuring discrimination during internal validation. The Pencina and D'Agostino confidence interval is not valid in the internal validation setting because their assumption of mutually independent observations is violated. We compare the Pencina and D'Agostino confidence interval to a bootstrap confidence interval that we propose, which is valid for internal validation. We specifically examine the performance of the interval when the same sample is used to both fit and determine the validity of a prognostic model. The framework for our simulations is a Weibull proportional hazards model of CHD death fit to the Framingham exam 4 data. We then focus on the second component of accuracy, calibration, which measures the agreement between the observed and predicted event rates for groups of patients (Altman and Royston 2000). In 2000, van Houwelingen introduced a method called validation by calibration to allow a clinician to assess the validity of a well-accepted published survival model on his/her own patient population and adjust the published model to fit that population. Van Houwelingen embeds the published model into a new model with only three parameters, which helps combat the overfitting that occurs when models with many covariates are fit on data sets with a small number of events. We explore validation by calibration as a tool to adjust models when an external model over- or underestimates risk. Van Houwelingen discusses the general method and then focuses on the proportional hazards model. There are situations where proportional hazards may not hold, so we extend the methodology to the Log-logistic accelerated failure time model. We perform validation by calibration of Framingham-based Cox and Log-logistic models of CHD death to cohorts from the Diverse Populations Collaboration. Lastly, we conduct simulations that investigate the power of the global Wald validation by calibration test. We study its power to reject an invalid proportional hazards or Log-logistic accelerated failure time model under various scale and/or shape misspecifications.
/ A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of
Doctor of Philosophy. / Spring Semester, 2009. / December 18, 2008. / Transportable, Generalizable, Validation, Framingham, Log-logistic, Reproducible / Includes bibliographical references. / Myles Hollander, Professor Co-Directing Thesis; Daniel McGee, Professor Co-Directing Thesis; Myra Hurt, Outside Committee Member; XuFeng Niu, Committee Member.
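A minimal sketch of the discrimination step and the kind of bootstrap interval proposed for internal validation (hypothetical array names; the concordance index comes from the lifelines package, and the percentile bootstrap below is a generic construction, not necessarily the dissertation's exact procedure):

    import numpy as np
    from lifelines.utils import concordance_index

    def bootstrap_c_index(time, event, risk_score, n_boot=2000, seed=0):
        """Concordance index of a risk score for censored survival data, with a
        percentile bootstrap confidence interval obtained by resampling subjects.
        lifelines' concordance_index treats higher scores as predicting longer
        survival, so the negative risk score is passed (higher risk = shorter survival)."""
        time, event, risk_score = map(np.asarray, (time, event, risk_score))
        rng = np.random.default_rng(seed)
        c_hat = concordance_index(time, -risk_score, event)
        boots = []
        for _ in range(n_boot):
            idx = rng.integers(0, len(time), len(time))
            boots.append(concordance_index(time[idx], -risk_score[idx], event[idx]))
        lo, hi = np.percentile(boots, [2.5, 97.5])
        return c_hat, (lo, hi)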