411. A case of scientific fraud?: a statistical approach. Bohossian, Nora. January 2006.
In 1986 Thereza Imanishi-Kari, then an assistant professor at the Massachusetts Institute of Technology, was at the peak of her career. She had just coauthored a paper in the prestigious journal Cell with David Baltimore, a Nobel laureate. Their research was exciting and their findings promising.

Margot O'Toole, Imanishi-Kari's postdoctoral fellow at the time, was unable to reproduce some of the experimental results published in the paper and could not resolve the discrepancy with her postdoctoral supervisor. O'Toole became convinced that the paper contained serious errors and, shortly afterwards, the National Institutes of Health began a formal investigation of the questions she had raised.

What may have begun as a simple clash of personalities between Imanishi-Kari and O'Toole, amplified by the involvement of a figure as prominent as Baltimore, went on to damage both of their careers, took ten years to resolve, cost millions of dollars of public money, polarized the scientific community, and went down in history as one of the most widely followed cases of alleged scientific fraud.

Based on statistical, forensic and other evidence, Imanishi-Kari was found guilty of scientific misconduct and barred from receiving public funding for ten years. This was not the end of the matter, however: Imanishi-Kari appealed the decision and was later exonerated.

In this thesis, we tell the statistical story by presenting the statistical arguments that were used against Imanishi-Kari together with the counterarguments to them.

412. Stationarity in a prevalent cohort study with follow-up. Addona, Vittorio. January 2005.
In a prevalent cohort study with follow-up, the incidence process is not directly observed, since only the onset times of prevalent cases can be ascertained. Several important consequences follow if one can establish stationarity of the incidence process: (1) the useful epidemiological relationship between prevalence, incidence, and mean duration holds; (2) efficiency is improved when estimating the underlying survivor function from a prevalent cohort study with follow-up; (3) the constancy of the incidence rate is established; and (4) the constant incidence rate can be estimated using data from a prevalent cohort study.

We propose a formal test for stationarity using data from a prevalent cohort study with follow-up, and establish new characterizations of stationarity and of useful types of departure from stationarity.

We also address a dual to the problem of establishing stationarity by comparing the backward and forward recurrence times. Assuming stationarity of the underlying incidence process, we use the backward and forward recurrence times to verify whether the underlying survival distribution is independent of the date of onset. In doing so, we characterize specific types of dependence of the underlying survival distribution on calendar time.

If the data are consistent with stationarity of the incidence rate, a natural next step is to estimate the (constant) incidence rate. We derive the nonparametric maximum likelihood estimator of the constant incidence rate, prove that the estimator is weakly consistent, and show how one may construct an asymptotic confidence interval for the incidence rate. One main advantage of our procedure is that it requires only the completion of a single prevalent cohort study with follow-up.

We apply our test for stationarity to data obtained as part of the Canadian Study of Health and Aging to verify that the incidence rate of dementia amongst the elderly in Canada has remained constant. Upon concluding that this constancy is plausible, we estimate the incidence rate.
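The key fact behind such comparisons is that, under a stationary incidence process, each prevalent case's backward recurrence time (onset to recruitment) and forward recurrence time (recruitment to failure) are exchangeable, so their difference is symmetric about zero. The toy sketch below illustrates that idea on simulated data with a paired signed-rank test; it assumes complete (uncensored) follow-up and is not the thesis's actual test, which also handles censoring.

```python
# Toy check of stationarity via backward/forward recurrence times.
# Assumption: uncensored follow-up; illustrative only.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(42)
n = 300

# Stationary setting: onsets uniform over a long window, survival times
# from a fixed distribution; keep only subjects alive at the
# cross-sectional recruitment date (prevalent, length-biased cases).
recruit = 100.0
onset = rng.uniform(0.0, recruit, size=20 * n)
survival = rng.gamma(shape=2.0, scale=5.0, size=onset.size)
alive = onset + survival > recruit
backward = (recruit - onset)[alive][:n]             # time lived with disease
forward = (onset + survival - recruit)[alive][:n]   # residual lifetime

# Symmetry of the paired difference about 0 is consistent with stationarity.
stat, pval = wilcoxon(backward - forward)
print(f"Wilcoxon signed-rank p-value: {pval:.3f}")  # large p: no evidence
                                                    # against stationarity
```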

413. Modeling outcome estimates in meta-analysis using fixed and mixed effects linear models. Mansour, Asmaâ. January 1998.
The main objective of this thesis is to present a quantitative method for modeling data collected from different studies on the same research topic. This quantitative method is called meta-analysis.

The first step of a meta-analysis is the literature search, conducted using computerized and manual search strategies to identify relevant studies. The second step is data abstraction from the relevant papers. In general, at least two independent raters systematically abstract the information, and an interrater reliability check is performed.

The next step is the quantitative analysis of the abstracted data. For this purpose, one may use either a fixed or a mixed effects linear model. Under the fixed effects model, only the variability due to sampling error is considered. In contrast, the mixed effects model includes an additional random effects variance component. Both the method of moments and the method of maximum likelihood can be used to estimate the parameters of the model.

Finally, the use of the above-mentioned models and methods of estimation is illustrated with a data set on the prognosis of depression in the elderly, made available by Dr. Martin Cole from the Department of Psychiatry at St. Mary's Hospital Center in Montreal.
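The abstract does not spell out its estimators, but the standard method-of-moments (DerSimonian–Laird) computation for the fixed- and random-effects pooled estimates looks like the following sketch; the study effects y and within-study variances v below are made-up illustrative numbers, not the depression-prognosis data used in the thesis.

```python
# DerSimonian-Laird method-of-moments random-effects meta-analysis,
# on hypothetical study-level data.
import numpy as np

y = np.array([0.30, 0.10, 0.45, 0.25, 0.60])   # study effect estimates
v = np.array([0.02, 0.05, 0.03, 0.04, 0.06])   # within-study variances

# Fixed effects: inverse-variance weighting, sampling error only.
w = 1.0 / v
theta_fixed = np.sum(w * y) / np.sum(w)

# Method of moments: Cochran's Q yields the between-study variance tau^2.
Q = np.sum(w * (y - theta_fixed) ** 2)
k = len(y)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

# Random (mixed) effects: weights include the extra variance component.
w_star = 1.0 / (v + tau2)
theta_random = np.sum(w_star * y) / np.sum(w_star)
se_random = np.sqrt(1.0 / np.sum(w_star))

print(f"fixed: {theta_fixed:.3f}, tau^2: {tau2:.4f}, "
      f"random: {theta_random:.3f} (SE {se_random:.3f})")
```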

414. A comparison of two approaches of adjusting for covariates in nested designs with binary outcomes. Lu, Jinyan. January 1998.
This thesis compares two computationally inexpensive methods of adjusting for case-mix in nested designs with a binary outcome. One approach, used in some published analyses, is based on 'ecological' variables, which summarize the distribution of characteristics of all patients of a given physician: typically, means are used for continuous variables and proportions for categorical ones. The final physician-level analysis then relies on a multiple linear regression model in which the rate of the relevant outcome among the patients of a given physician is the dependent variable, and the independent variables include both the ecological variables and the physician characteristics of interest.

The other, innovative approach proposed and investigated in this study employs the ratio of the observed to the expected number of outcomes among the patients of a given physician. The first step of the analysis estimates the expected probability of the outcome for an individual patient as a function of that patient's own characteristics. The physician-level analysis then regresses the ratio of observed to expected numbers of outcomes on the relevant physician characteristics. (Abstract shortened by UMI.)
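A hedged sketch of the observed/expected (O/E) two-stage approach on simulated data follows; the logistic first-stage model, the variable names, and the single physician-level covariate are illustrative assumptions, not the thesis's code or data.

```python
# Two-stage O/E case-mix adjustment, on simulated nested data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_phys, n_per = 40, 50
phys = np.repeat(np.arange(n_phys), n_per)          # physician id per patient
x = rng.normal(size=(n_phys * n_per, 2))            # patient characteristics
phys_skill = rng.normal(size=n_phys)                # physician-level covariate
logit = -1.0 + x @ np.array([0.8, -0.5]) + 0.3 * phys_skill[phys]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))   # binary outcome

# Stage 1: patient-level logistic model -> expected outcome probabilities,
# using patient characteristics only (the case-mix adjustment).
stage1 = LogisticRegression().fit(x, y)
p_hat = stage1.predict_proba(x)[:, 1]

# Stage 2: per-physician O/E ratio regressed on physician characteristics.
obs = np.bincount(phys, weights=y.astype(float))
exp = np.bincount(phys, weights=p_hat)
oe = obs / exp
X2 = np.column_stack([np.ones(n_phys), phys_skill])
beta, *_ = np.linalg.lstsq(X2, oe, rcond=None)
print(f"estimated physician effect on O/E ratio: {beta[1]:.3f}")
```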

415. Estimating survival from partially observed data. Zhang, Xun. January 2001.
Often, in cross-sectional follow-up studies, survival data are obtained from prevalent cases only. This sampling mechanism introduces length bias. An added difficulty is that in some cases the times of onset cannot be ascertained, or are recorded with great uncertainty. Such was the situation in the Canadian Study of Health and Aging (CSHA), a nationwide study of dementia conducted by Health Canada between 1991 and 1996. This thesis proposes methods for estimating the survival function nonparametrically when the data are length-biased and only partially observed. By using the "forward recurrence times" only, we show how one can overcome the difficulty caused by missing onset times, while by using the "backward recurrence times" only, one can avoid the cost and effort of follow-up. We illustrate our methods through an application to data derived from the CSHA.
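To see why length bias matters, consider the simplest case of fully observed length-biased durations (no censoring, onsets known): longer durations are oversampled in proportion to their length, and the Vardi-type nonparametric MLE corrects this by weighting each observation by 1/t. The sketch below illustrates only that baseline correction; the thesis's methods go further, handling censoring and missing onsets via recurrence times alone.

```python
# Inverse-length weighting for fully observed length-biased durations.
import numpy as np

rng = np.random.default_rng(1)
true_mean = 2.0 * 3.0                            # gamma(shape=2, scale=3)

# Length-biased sampling: accept each duration with probability
# proportional to its length, mimicking a cross-sectional survey.
t = rng.gamma(shape=2.0, scale=3.0, size=200_000)
keep = rng.uniform(0.0, t.max(), size=t.size) < t
t_lb = t[keep][:5_000]

naive_mean = t_lb.mean()                 # overestimates the true mean
w = 1.0 / t_lb                           # inverse-length weights
corrected_mean = t_lb.size / w.sum()     # harmonic-type corrected estimate

# Weighted NPMLE of the unbiased survivor function at a point t0.
t0 = 5.0
S_hat = w[t_lb > t0].sum() / w.sum()

print(f"true mean {true_mean:.2f}, naive {naive_mean:.2f}, "
      f"corrected {corrected_mean:.2f}, S({t0}) ~ {S_hat:.3f}")
```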

416. Statistical Inference on Stochastic Graphs. Hosseinkashi, Yasaman. 17 June 2011.
This thesis considers modelling and applications of random graph processes. A brief review of contemporary random graph models, and a general birth-death model with the relevant maximum likelihood inference procedure, is provided in chapter one. The main result of the thesis is the construction of an epidemic model by embedding a competing hazard model within a stochastic graph process (chapter 2). This model includes both individual characteristics and the population connectivity pattern in analyzing the propagation of infection. The dynamic outdegrees and indegrees estimated by the model provide insight into important epidemiological concepts such as the reproductive number. A dynamic reproductive number based on the disease graph process is developed and applied in several simulated and actual epidemic outbreaks. In addition, graph-based statistical measures are proposed to quantify the effect of individual characteristics on disease propagation. The epidemic model is applied to two real outbreaks: the 2001 foot-and-mouth epidemic in the United Kingdom (chapter 3) and the 1861 measles outbreak in Hagelloch, Germany (chapter 4). Both applications provide valuable insight into the behaviour of infectious disease propagation under different connectivity patterns and human interventions.
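A toy illustration of the graph-based view of a dynamic reproductive number (not the thesis's estimator): for the cases infected on each day, average their outdegree in the who-infected-whom digraph, i.e., the number of secondary cases each went on to cause. The edge list and infection days below are hypothetical.

```python
# Dynamic R_t as mean outdegree by infection day, on a toy outbreak.
from collections import defaultdict

# Who-infected-whom edges (infector, infectee) and infection day per case.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "e"), ("c", "f"), ("d", "g")]
day = {"a": 0, "b": 1, "c": 1, "d": 2, "e": 3, "f": 3, "g": 4}

outdeg = defaultdict(int)
for infector, _ in edges:
    outdeg[infector] += 1

by_day = defaultdict(list)
for case, d in day.items():
    by_day[d].append(outdeg[case])     # cases causing no infections count as 0

for d in sorted(by_day):
    rt = sum(by_day[d]) / len(by_day[d])
    print(f"day {d}: R_t = {rt:.2f} over {len(by_day[d])} case(s)")
```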

417. A study of Hougaard distributions, Hougaard processes and their applications. Fook Chong, Stéphanie M. C. January 1992.
This thesis describes an investigation of Hougaard distributions, Hougaard processes and their applications. The aim is to assemble and synthesize known results about the subject, to provide further insight into its theoretical foundations, to extend existing methods and develop some new ones, to discuss and illustrate applications, and finally to motivate other statisticians to make greater use of Hougaard distributions and Hougaard processes in their own investigations. Although the family of Hougaard distributions is relatively unknown, it includes the well-known inverse Gaussian, gamma and positive stable distributions as special cases.
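For reference, in one common parameterization (an assumption here; conventions differ across authors) the Hougaard family is defined through its Laplace transform, from which the special cases mentioned above can be read off: the positive stable laws arise for theta = 0, the inverse Gaussian for alpha = 1/2, and the gamma as the limit alpha -> 0.

```latex
% Hougaard family P(\alpha,\delta,\theta), one common parameterization:
L(s) = \mathbb{E}\, e^{-sX}
     = \exp\!\left\{ -\frac{\delta}{\alpha}
        \left[ (\theta+s)^{\alpha} - \theta^{\alpha} \right] \right\},
\qquad 0 < \alpha < 1,\ \ \delta > 0,\ \ \theta \ge 0.
```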

418. Measuring dependence using information gain when data are length-biased and right-censored. Bergeron, Pierre-Jérôme. January 2003.
In epidemiologic studies, prevalent cases of a disease are often identified through a cross-sectional study. These cases are then followed for a fixed period of time, at the end of which each subject will either have failed or have been censored. When interest lies in estimating the survival distribution, from onset, of subjects with the disease, one must take into account that the survival times of cases ascertained in this fashion are left-truncated. When it is possible to assume that there has been no epidemic of the disease over the past period of time covering the onset times of the subjects, one may assume that the disease has stationary incidence. Under such a stationarity assumption, the survival times of the recruited subjects are called "length-biased". Measures of dependence have been treated extensively in the statistical literature, but in the context of survival analysis with length-biased data the literature is rather sparse. The purpose of this thesis is to establish a measure of dependence for length-biased, right-censored observations.
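On plain, fully observed data, "information gain" as a dependence measure is just the mutual information between two variables, as in the sketch below; the thesis's contribution is adapting such a measure to length-biased, right-censored survival data, which this toy version does not attempt.

```python
# Mutual information between two discretized variables (toy version,
# ignoring length bias and censoring).
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=5_000)
y = 0.6 * x + 0.8 * rng.normal(size=5_000)       # y depends on x

counts, _, _ = np.histogram2d(x, y, bins=12)
pxy = counts / counts.sum()
px = pxy.sum(axis=1, keepdims=True)
py = pxy.sum(axis=0, keepdims=True)

mask = pxy > 0                                    # avoid log(0)
mi = np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))
print(f"estimated mutual information: {mi:.3f} nats")  # ~0 iff independent
```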

419. Generalized profiling method and the applications to adaptive penalized smoothing, generalized semiparametric additive models and estimating differential equations. Cao, Jiguo. January 2006.
Many statistical models involve three distinct groups of variables: local or nuisance parameters, global or structural parameters, and complexity parameters. In this thesis, we introduce the generalized profiling method for estimating such models, which treats one group of parameters as an explicit or implicit function of the other parameters. The dimensionality of the parameter space is thereby reduced, and the optimization surface becomes smoother. The Newton-Raphson algorithm is applied to estimate the three groups of parameters in three levels of optimization, with the gradients and Hessian matrices written out analytically, by the Implicit Function Theorem if necessary, and with different criteria allowed at each level of optimization. Moreover, variances of the global parameters are estimated by the Delta method and include the variation coming from the complexity parameters. We propose three applications of the generalized profiling method.

First, penalized smoothing is extended by allowing a functional smoothing parameter that adapts to the geometry of the underlying curve, which we call adaptive penalized smoothing. In the first level of optimization, the smoothing coefficients are local parameters, estimated by minimizing the sum of squared errors conditional on the functional smoothing parameter. In the second level, the functional smoothing parameter is a complexity parameter, estimated by minimizing generalized cross-validation (GCV), with the smoothing coefficients treated as explicit functions of the functional smoothing parameter. Adaptive penalized smoothing is shown to yield better estimates of fitted functions and their derivatives.

Next, generalized semiparametric additive models are estimated by three levels of optimization, allowing response variables with any kind of distribution. In the first level, the nonparametric functional parameters are nuisance parameters, estimated by maximizing the regularized likelihood function conditional on the linear coefficients and the smoothing parameter. In the second level, the linear coefficients are structural parameters, estimated by maximizing the likelihood function with the nonparametric functional parameters treated as implicit functions of the linear coefficients and the smoothing parameter. In the third level, the smoothing parameter is a complexity parameter, estimated by minimizing the approximated GCV with the linear coefficients treated as implicit functions of the smoothing parameter. This method is applied to estimate a generalized semiparametric additive model for the effect of air pollution on public health.

Finally, parameters in differential equations (DEs) are estimated from noisy data with the generalized profiling method. In the first level of optimization, fitting functions approximating the DE solutions are estimated by penalized smoothing, with the penalty term defined by the DEs and the DE parameters held fixed. In the second level, the DE parameters are estimated by minimizing a weighted sum of squared errors, with the smoothing coefficients treated as an implicit function of the DE parameters. The effects of the smoothing parameter on the DE parameter estimates are explored, and criteria for smoothing-parameter selection are discussed. The method is applied to fit the predator-prey dynamic model to biological data, to estimate DE parameters in the HIV dynamic model from clinical trials, and to explore dynamic models for the thermal decomposition of alpha-pinene.
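A stripped-down, two-level sketch of profiling for DE parameter estimation follows (the thesis adds a third level for the smoothing parameter). The model dx/dt = -theta*x, the polynomial basis, and the fixed lambda are all illustrative assumptions, not the thesis's setup. The inner level solves for the basis coefficients as an implicit function of theta; the outer level minimizes the data misfit over theta.

```python
# Two-level generalized profiling for a single DE parameter.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
theta_true = 1.5
t = np.linspace(0.0, 2.0, 60)
y = np.exp(-theta_true * t) + 0.02 * rng.normal(size=t.size)

deg, lam = 8, 1e3
tq = np.linspace(0.0, 2.0, 200)                      # quadrature grid
Phi = np.vander(t, deg + 1, increasing=True)         # basis at data points
Phiq = np.vander(tq, deg + 1, increasing=True)       # basis at quadrature pts
dPhiq = np.hstack([np.zeros((tq.size, 1)),
                   Phiq[:, :-1] * np.arange(1, deg + 1)])  # derivative basis

def inner_coefs(theta):
    # Inner level: min_c ||y - Phi c||^2 + lam ||dPhiq c + theta Phiq c||^2,
    # i.e. penalized smoothing with the DE residual as the penalty.
    A = np.vstack([Phi, np.sqrt(lam) * (dPhiq + theta * Phiq)])
    b = np.concatenate([y, np.zeros(tq.size)])
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return c

def outer_objective(theta):
    c = inner_coefs(theta)           # c is an implicit function of theta
    return np.sum((y - Phi @ c) ** 2)

res = minimize_scalar(outer_objective, bounds=(0.1, 5.0), method="bounded")
print(f"estimated theta: {res.x:.3f} (true {theta_true})")
```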

420. Mixture Modeling, Sparse Covariance Estimation and Parallel Computing in Bayesian Analysis. Cron, Andrew. January 2012.
Mixture modeling of continuous data is an extremely effective and popular method for density estimation and clustering. However, as the size of the data grows, both in dimension and in number of observations, many modeling and computational problems arise. In the Bayesian setting, computational methods for posterior inference become intractable as the number of observations and/or possible clusters gets large. Furthermore, relabeling in sampling methods is increasingly difficult to address as the data get large. This thesis addresses computational and methodological solutions to these problems by utilizing modern computational hardware and new methodology. Novel approaches for parsimonious covariance modeling and information sharing across multiple data sets are then built upon these computational improvements.

Chapter 1 introduces the fundamental modeling approaches in mixture modeling, including Dirichlet processes and posterior inference using Gibbs sampling. Chapter 2 describes the utilization of graphics processing units for massive gains in computational performance in both mixture models and general Bayesian modeling. Chapter 3 introduces a new relabeling approach in mixture modeling that can be scaled far beyond current methodology to massive data and high-dimensional settings. Chapter 4 generalizes chapters 2 and 3 to the hierarchical Dirichlet process setting to "borrow strength" from multiple studies in classification problems in flow cytometry. Chapter 5 develops a novel approach for sparse covariance estimation using sparse, full-rank, orthogonal matrix estimation. These new methods are applied to a mixture modeling with measurement error setting for classification. Finally, Chapter 6 summarizes the work given in this thesis and outlines exciting areas for future research.
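The per-observation workload that GPU approaches like chapter 2's parallelize is the evaluation of posterior cluster-membership probabilities for every (observation, component) pair. The sketch below shows that step for a Gaussian mixture with NumPy broadcasting; on a GPU, each (i, k) cell maps naturally to a thread. The mixture parameters are arbitrary illustrative values, not estimates from any data set in the thesis.

```python
# Posterior cluster-membership (responsibility) computation for a
# Gaussian mixture: the embarrassingly parallel core of Gibbs sampling.
import numpy as np

rng = np.random.default_rng(11)
n, d, K = 10_000, 2, 3
X = rng.normal(size=(n, d))

pi = np.full(K, 1.0 / K)                     # mixture weights
mu = rng.normal(size=(K, d))                 # component means
cov = np.stack([np.eye(d)] * K)              # component covariances

log_resp = np.empty((n, K))
for k in range(K):                           # small K: loop; huge n: vectorized
    diff = X - mu[k]
    prec = np.linalg.inv(cov[k])
    _, logdet = np.linalg.slogdet(cov[k])
    maha = np.einsum("ij,jk,ik->i", diff, prec, diff)
    log_resp[:, k] = np.log(pi[k]) - 0.5 * (d * np.log(2 * np.pi)
                                            + logdet + maha)

log_resp -= log_resp.max(axis=1, keepdims=True)     # stabilize the softmax
resp = np.exp(log_resp)
resp /= resp.sum(axis=1, keepdims=True)             # rows sum to 1
print(resp[:3].round(3))
```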