381. The Craig-Sakamoto Theorem / Dumais, Mylène Fanny. January 2000.
This thesis reviews the works that most influenced the progress and the development of the Craig-Sakamoto Theorem. This important theorem characterizes the independence of two quadratic forms in normal variables. We begin with a detailed and possibly complete outline of the history of this theorem, as well as several (correct) proofs published over the years. Furthermore, some misleading (or incorrect) proofs are reviewed and their lacunae explained. We conclude with a comprehensive bibliography on the Craig-Sakamoto Theorem; some associated references are also included.
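For reference, the theorem is usually stated as follows (a standard textbook formulation, not quoted from the thesis):

```latex
% Craig--Sakamoto theorem, central case, as commonly stated:
% let $X \sim N_n(0, \Sigma)$ with $\Sigma$ positive definite, and let
% $A$ and $B$ be symmetric $n \times n$ matrices; then
\[
  X^{\top} A X \ \text{and} \ X^{\top} B X \ \text{are independent}
  \iff A \Sigma B = 0 .
\]
% The purely matrix-algebraic form reads: for real symmetric $A$, $B$,
% $\det(I - sA - tB) = \det(I - sA)\,\det(I - tB)$ for all real $s, t$
% if and only if $AB = 0$.
```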

382. Group comparisons in the presence of length-biased data / Addona, Vittorio. January 2001.
The effects of length-bias and left-truncation in survival data have been well studied in the statistical literature. To a lesser extent, the phenomena of length-bias and left-truncation have also been investigated when group comparisons are of interest. This literature examines various biases that may occur under different scenarios, and also, on occasion, proposes procedures for the estimation of covariate effects when using prevalent data. In this thesis, we review the literature concerned with the analysis of length-biased and left-truncated data, paying particular attention to the issue of group comparisons. Some shortcomings of the methods developed in the literature are pointed out. We also assess the effects of failure to recognize the presence of length-bias when performing group comparisons in natural history of disease studies. To our knowledge, this issue has not yet been addressed in the literature.
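For context, the length-biased sampling mechanism at the heart of these issues is usually written as follows (a standard formulation, not taken from the thesis):

```latex
% Length-biased density: under prevalent (cross-sectional) sampling,
% longer durations are over-represented in proportion to their length.
% If the incident survival time $T$ has density $f$ with finite mean
% $\mu = E[T]$, the sampled durations follow
\[
  f_{LB}(t) \;=\; \frac{t\, f(t)}{\mu}, \qquad t > 0 .
\]
```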

383. The EM algorithm: an overview with applications to medical data / Coupal, Louis. January 1992.
Owing to their complex design and use of live subjects as experimental units, missing or incomplete data are commonplace in medical experiments. The great increase in difficulty of maximum likelihood based analysis of incomplete data, compared to a similar complete data analysis, encourages many medical researchers to ignore cases with missing data in favour of performing a "complete cases" analysis.

The expectation-maximization (EM) algorithm is often an easily implemented algorithm that provides estimates of parameters in models with missing data. The EM algorithm unifies the theory of maximum likelihood estimation in the context of "missing" data; the general problem of missing data also includes structurally unobservable quantities such as parameters, hyperparameters and latent variables. The nature of its defining steps, the expectation or E-step and the maximization or M-step, gives the user an intuitive understanding of the maximization process.

In this thesis, the EM algorithm is first illustrated through an example borrowed from the field of genetics. The theory of the EM algorithm is formally developed and the special case of exponential families is considered. Issues concerning convergence and inference are discussed. Many examples taken from the medical literature serve to highlight the method's broad spectrum of application to both missing data and unobservable parameter problems.
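As an illustration of the two defining steps, here is a minimal sketch of EM for the classic genetic-linkage multinomial example often used to introduce the algorithm (possibly the genetics example referred to above; the counts and cell probabilities are the textbook ones, not necessarily those used in the thesis):

```python
# EM for the classic genetic-linkage example: observed multinomial counts
# y = (y1, y2, y3, y4) with cell probabilities
# (1/2 + p/4, (1-p)/4, (1-p)/4, p/4).  The first cell is treated as the sum
# of two latent cells with probabilities 1/2 and p/4, which makes the
# complete-data MLE trivial.
def em_linkage(y, p=0.5, tol=1e-10, max_iter=500):
    y1, y2, y3, y4 = y
    for _ in range(max_iter):
        # E-step: expected count of the latent "p/4" part of the first cell
        x12 = y1 * (p / 4) / (0.5 + p / 4)
        # M-step: complete-data MLE of p
        p_new = (x12 + y4) / (x12 + y2 + y3 + y4)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p

print(em_linkage((125, 18, 20, 34)))  # converges to about 0.6268
```

Each iteration imputes the expected latent count (E-step) and then maximizes the complete-data likelihood in closed form (M-step), which is what makes EM attractive whenever the complete-data problem is simple.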

384. Some integral equations connected with statistical distribution theory / Munsaka, Melvin Slaighter. January 1990.
This thesis examines certain integral equations that are used to obtain statistical distributions, particularly in the multivariate case. In all, seven integral equations are considered. Two of them, known as the type A and type B integral equations, were introduced by Wilks (1932); the other five, known as types C to G, were introduced by Mathai (1984). Some statistical distributions associated with the solutions to these integral equations are also discussed.

385. Correlation adjusted penalization in regression analysis / Tan, Qi Er. 25 September 2012.
This PhD thesis introduces two new types of correlation-adjusted penalization methods to address the issue of multicollinearity in regression analysis. The main purpose is to achieve simultaneous shrinkage of parameter estimators and variable selection for multiple linear regression and logistic regression when the predictor variables are highly correlated. The motivation is that when multicollinearity is severe, the variances of the parameter estimators become very large; the new correlation-adjusted penalization methods shrink the parameter estimators and their variances to alleviate the problem.

An important recent trend in dealing with multicollinearity is to apply penalization methods for simultaneous shrinkage and variable selection. Popular penalization methods in the literature include ridge, bridge, LASSO, SCAD and OSCAR. Few papers have used correlation-based penalization methods, and the correlation-based methods in the literature fail when some correlations equal 1 or -1, that is, when at least two predictor variables are perfectly correlated. The two new types of correlation-adjusted penalization methods introduced in this thesis are intuitive and innovative, and they work whether or not the predictor variables are perfectly correlated. We investigate important theoretical properties of these new methods, including bias, mean squared error, data augmentation and asymptotic properties, and plan to apply them to real data sets in the near future.
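As a point of comparison, a plain ridge penalty (not the correlation-adjusted penalties proposed in the thesis) already illustrates how shrinkage stabilizes estimates under near-perfect collinearity; a minimal numpy sketch:

```python
# Baseline illustration of penalized shrinkage under multicollinearity
# (ordinary ridge regression, not the correlation-adjusted penalties
# proposed in the thesis).  Two nearly collinear predictors make the OLS
# coefficients unstable; the ridge penalty shrinks them and stabilizes
# their variance.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)      # almost perfectly correlated
X = np.column_stack([x1, x2])
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

def ridge(X, y, lam):
    p = X.shape[1]
    # closed-form ridge estimator: (X'X + lam I)^{-1} X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

print("OLS  :", ridge(X, y, 0.0))    # unstable: X'X is nearly singular
print("ridge:", ridge(X, y, 10.0))   # shrunken, stable estimates
```

The LASSO, SCAD and OSCAR penalties mentioned above require iterative solvers rather than a closed form, but the shrinkage idea is the same.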

386. Détection dans les images IRM d'un signal par la méthode des ondelettes [Detection of a signal in MRI images by the wavelet method] / Awissi, Madon. January 2000.
Magnetic resonance imaging (MRI) is the most recent technique developed to obtain brain images. These images can be used to determine which brain regions are active during a given task; once these regions are found, it is possible to construct a functional map of the brain. Many studies have been done, and are still underway, to determine which brain regions respond to different stimuli. The recent wavelet method brings new perspectives to this type of study. It can be used to decompose an MR image into a set of coefficients that are then studied statistically. This statistical analysis allows us to remove the noise contained in the MR images. As a result, we obtain an image on which we can identify the zone corresponding to the activated brain region.
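A minimal sketch of the general idea, wavelet decomposition followed by thresholding of the detail coefficients, is given below; it assumes the PyWavelets package and a textbook universal threshold rather than the specific statistical analysis of the thesis:

```python
# Minimal sketch of wavelet denoising of a 2-D image (e.g. an fMRI slice):
# decompose into wavelet coefficients, soft-threshold the detail
# coefficients to suppress noise, and reconstruct.  PyWavelets is assumed;
# the universal threshold is a textbook choice, not the thesis's procedure.
import numpy as np
import pywt

def wavelet_denoise(image, wavelet="db2", level=3):
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # estimate the noise level from the finest-scale diagonal details
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(image.size))      # universal threshold
    denoised = [coeffs[0]]                                 # keep approximation
    for detail in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(d, thresh, mode="soft")
                              for d in detail))
    return pywt.waverec2(denoised, wavelet)

noisy = np.random.default_rng(1).normal(size=(64, 64))     # stand-in image
clean = wavelet_denoise(noisy)
```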

387. Statistical analysis of a numerical simulation of two-dimensional turbulence / Vasiliev, Boris. January 2000.
An important feature of the time evolution of vorticity fields in two-dimensional turbulence is the emergence of isolated and coherent structures, or vortices. A vortex is a region enclosed by a closed contour (or, equivalently, a disjoint component) of the vorticity field at a sufficiently high value.

This thesis proposes two estimators of the population of vortices in a vorticity field. They are constructed by discretizing the analytical expression, derived by Swerling [13], for the number of disjoint components in a contour of a random field. One is based on a Riemann-sum approximation, while the other employs Monte Carlo integration. The performance of the two estimators is evaluated, and the Monte Carlo based estimator is found to be superior. It is used to estimate the population of vortices from a single simulation of the vorticity equation governing two-dimensional turbulence. Finally, the method is applied to analyze the time evolution of the population of vortices.
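For orientation, the quantity being estimated can be illustrated by a naive connected-component count above a vorticity threshold (a baseline only, not the Swerling-based Riemann-sum or Monte Carlo estimators developed in the thesis):

```python
# Naive count of "vortices" as disjoint regions where |vorticity| exceeds a
# threshold, via connected-component labelling.  This illustrates the
# quantity of interest, not the estimators proposed in the thesis.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(2)
vorticity = ndimage.gaussian_filter(rng.normal(size=(128, 128)), sigma=4)

threshold = 2.0 * vorticity.std()
mask = np.abs(vorticity) > threshold
labels, n_vortices = ndimage.label(mask)
print("disjoint high-vorticity regions:", n_vortices)
```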

388. Estimation and testing in quantitative linear models with autocorrelated errors / Alpargu, Gülhan. January 2001.
The efficiency of estimation procedures and the validity of testing procedures in simple and multiple quantitative linear models with autocorrelated errors have been studied in this thesis. The importance of the nature of the explanatory variable(s), fixed and trended versus purely random or following a first-order autoregressive [AR(1)] process, has been emphasized in Monte Carlo studies. The estimation procedures were compared on the basis of different measures of efficiency, relative to OLS or GLS, depending on the context. The estimation procedures studied include the Ordinary Least Squares (OLS), Generalized Least Squares (GLS), estimated GLS, Maximum Likelihood (ML), Restricted Maximum Likelihood (REML), First Differences (FD) and original First-Difference Ratios (FDR). The derived testing procedures were compared on the basis of a condition of strict validity as well as a criterion taking the variability of empirical significance levels into account.

In a preliminary step, the conflicting statements made in the literature concerning estimation in quantitative linear models with autocorrelated errors were sorted out. Unlike the efficiency of estimation procedures, the validity of testing procedures had been studied less extensively before. One of the main results of this thesis is that the more efficient of two estimators does not necessarily provide a more valid testing procedure for the parameter of interest. First, FD and FDR are highly inefficient relative to OLS, but they generally provide a valid test for the combinations of sample size and error autocorrelation parameter considered, whatever the nature of the explanatory variable(s) may be. Second, almost all the testing procedures, including the classical t-test and some modified t-tests of the slope, satisfy the criterion of validity in simple linear regression when the explanatory variable is purely random and the errors follow an AR(1) process. An explanation in terms of effective sample size is given. Third, ML and REML are equally efficient for large sample sizes, and at the same time REML provides a test of the slope that is more valid than the ML testing procedures. These features are illustrated in an application to environmental data.
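A small Monte Carlo sketch in the spirit of these comparisons is given below; it contrasts OLS with GLS computed from the true AR(1) covariance (the estimated-GLS, ML and REML procedures studied in the thesis would estimate the autocorrelation instead of assuming it known):

```python
# Simple linear regression with a fixed, trended regressor and AR(1) errors:
# compare the sampling variance of the OLS slope with that of the GLS slope
# computed from the true AR(1) correlation matrix.
import numpy as np

rng = np.random.default_rng(3)
n, rho, beta, n_rep = 50, 0.8, 1.0, 2000
t = np.arange(n, dtype=float)                  # fixed, trended regressor
X = np.column_stack([np.ones(n), t])
Sigma = rho ** np.abs(np.subtract.outer(t, t)) # AR(1) correlation matrix
Sigma_inv = np.linalg.inv(Sigma)

ols, gls = [], []
for _ in range(n_rep):
    e = np.zeros(n)
    e[0] = rng.normal(scale=1 / np.sqrt(1 - rho**2))   # stationary start
    for i in range(1, n):
        e[i] = rho * e[i - 1] + rng.normal()
    y = beta * t + e
    ols.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    gls.append(np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y)[1])

print("var(OLS slope) / var(GLS slope):", np.var(ols) / np.var(gls))
```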

389. Validation of tree-structured prediction for censored survival data / Negassa, Abdissa. January 1996.
Objectives. (i) To develop a computationally efficient tree-growing algorithm for censored survival data; (ii) to assess the performance of two validation schemes; and (iii) to evaluate the performance of computationally inexpensive model selection criteria in relation to cross-validation.

Background. In the tree-growing literature, a number of computationally inexpensive model selection criteria have been suggested; however, none of them has been systematically investigated for its performance. RECursive Partition and AMalgamation (RECPAM) is one of the existing tree-growing algorithms that provide such built-in model selection criteria. Application of RECPAM's different model selection criteria leads to a wide range of models (40). Since RECPAM is an exploratory data analysis tool, it is desirable to reduce its computational cost and establish the general properties of its model selection criteria so that clear guidelines can be suggested.

Methods. A computationally efficient tree-growing algorithm for prognostic classification and subgroup analysis is developed by employing the Cox score statistic and the Mantel-Haenszel estimator of the relative hazard. Two validation schemes, one restricting validation to pruning and parameter estimation and one validating the whole process of tree-growing, are implemented and evaluated in simulation. Three model selection criteria, the elbow approach, the minimum Akaike Information Criterion (AIC) and the one-standard-error (1-SE) rule, are compared to cross-validation under a broad range of scenarios using simulation. Examples of medical data analyses are presented. The elementary splitting step of such tree-growing is sketched after this abstract.

Conclusions. A gain in computational efficiency is achieved while obtaining the same result as the original RECPAM approach. The restricted validation scheme is computationally less expensive; however, it is biased. In the case of subgroup analysis, to adjust properly for influential prognostic factors, we suggest constructing a prognostic classification on such factors and using the resulting classification as strata in conducting the subgroup analysis. None of the model selection criteria studied exhibits consistently superior performance over the range of scenarios considered here. We therefore propose a two-stage model selection strategy in which cross-validation is employed at the first stage and, if this stage provides evidence of structure in the data set, the elbow rule is applied at the second stage.
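The elementary step of such tree-growing, choosing the cut-point of a covariate that maximizes a two-sample survival statistic, can be sketched as follows; the log-rank statistic (via the lifelines package, assumed available) stands in for the Cox score statistic of the thesis:

```python
# Minimal sketch of the basic splitting step in survival tree-growing:
# pick the covariate cut-point that maximizes a two-sample log-rank
# statistic between the resulting child nodes.
import numpy as np
from lifelines.statistics import logrank_test

def best_split(x, time, event):
    best = (None, -np.inf)
    for cut in np.unique(x)[:-1]:                 # candidate cut-points
        left = x <= cut
        res = logrank_test(time[left], time[~left],
                           event_observed_A=event[left],
                           event_observed_B=event[~left])
        if res.test_statistic > best[1]:
            best = (cut, res.test_statistic)
    return best

rng = np.random.default_rng(4)
x = rng.uniform(size=200)
time = rng.exponential(scale=np.where(x > 0.5, 2.0, 1.0))  # x affects survival
event = rng.uniform(size=200) < 0.8        # ~20% flagged as censored (toy data)
print(best_split(x, time, event))
```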

390. Statistical analysis of DNA profiles / McClelland, Robyn L. (Robyn Leagh). January 1994.
DNA profiles have become an extremely important tool in forensic investigations, and a match between a suspect and a crime-scene specimen is highly incriminating. Presentation of this evidence in court, however, requires a statistical interpretation, one which reflects the uncertainty in the results due to measurement imprecision and sampling variability. No consensus has been reached on how to quantify this uncertainty, and the literature to date lacks an objective review of the possible methods.

This thesis provides a survey of the approaches to statistical analysis of DNA profile data currently in use, as well as of proposed methods that seem promising. Frequentist and Bayesian approaches are compared, and the assumptions required by each method are carefully examined.
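The basic frequentist calculation behind such evidence is the product-rule random match probability under Hardy-Weinberg and linkage equilibrium; a minimal sketch with made-up allele frequencies:

```python
# Product-rule random match probability under Hardy-Weinberg and linkage
# equilibrium: genotype frequency is p^2 for a homozygote and 2pq for a
# heterozygote, and independent loci multiply.  The allele frequencies
# below are made up for illustration only.
profile = [(0.10, 0.10), (0.05, 0.20), (0.12, 0.12), (0.08, 0.15)]

match_prob = 1.0
for p, q in profile:
    match_prob *= p * q if p == q else 2 * p * q

print(f"random match probability: {match_prob:.3e}")
```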