511. Methods of calibration for the empirical likelihood ratio / Jiang, Li, January 2006 (has links)
This thesis provides several new calibration methods for the empirical log-likelihood ratio. The commonly used chi-square calibration is based on the limiting distribution of this ratio, but it persistently suffers from the undercoverage problem. The finite-sample distribution of the empirical log-likelihood ratio is recognized to have a mixture structure, with a continuous component on [0, +∞) and a probability mass at +∞. Consequently, new calibration methods are developed to take advantage of this mixture structure; we propose new calibration methods based on mixture distributions, such as the mixture chi-square and the mixture Fisher's F distribution. The E distribution introduced in Tsao (2004a) has a natural mixture structure, and the calibration method based on this distribution is considered in detail. We also discuss methods of estimating the E distributions.
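The mixture structure is easy to see numerically: for a univariate mean, the empirical log-likelihood ratio is finite only when the hypothesized mean lies inside the convex hull of the data, and equals +∞ otherwise. A minimal sketch (the Newton solver and its safeguards are illustrative, not the thesis's implementation):

```python
import numpy as np

def el_log_ratio(x, mu, tol=1e-10, max_iter=200):
    """Empirical log-likelihood ratio statistic -2 log R(mu) for a
    univariate mean, via Newton's method on the Lagrange multiplier.
    Returns +inf when mu lies outside the convex hull of the data --
    the probability mass at infinity in the mixture structure."""
    z = np.asarray(x, dtype=float) - mu
    if z.min() >= 0 or z.max() <= 0:         # mu outside the convex hull
        return np.inf
    t = 0.0
    for _ in range(max_iter):
        denom = 1.0 + t * z
        f = np.sum(z / denom)                # estimating equation in t
        fp = -np.sum((z / denom) ** 2)       # its derivative (negative)
        t_new = t - f / fp
        while np.any(1.0 + t_new * z <= 0):  # keep all EL weights positive
            t_new = (t + t_new) / 2.0
        if abs(t_new - t) < tol:
            t = t_new
            break
        t = t_new
    return 2.0 * np.sum(np.log(1.0 + t * z))

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, size=50)
r0 = el_log_ratio(x, x.mean())         # ~0 at the sample mean
r1 = el_log_ratio(x, x.mean() + 0.5)   # grows as mu moves away
r_inf = el_log_ratio(x, x.max() + 1.0) # outside the hull: +inf
```

The chi-square calibration compares statistics like `r1` to a χ²(1) quantile; the mixture calibrations above also account for the point mass that `r_inf` exhibits.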
512. Survival analysis for breast cancer / Liu, Yongcai, 21 September 2010 (has links)
This research carries out a survival analysis for patients with breast cancer. The influence of clinical and pathologic features, as well as molecular markers, on survival time is investigated. Special
attention focuses on whether the molecular markers can provide additional information in helping predict clinical outcome and guide therapies for breast cancer patients. Three outcomes, breast cancer specific survival (BCSS), local relapse survival (LRS) and distant relapse survival (DRS), are
examined using two datasets: a large dataset with missing marker values (n=1575) and a small, complete dataset of patient records without any missing values (n=910). Results show
that some molecular markers, such as YB1, could be integrated into clinical practice alongside ER, PR, and HER2. Further clinical research is needed to establish the importance of CK56.
The 10-year survival probability at the mean of all the covariates (clinical variables and markers) is 77% for BCSS, 91% for LRS, and 72% for DRS. Because a large proportion of values are missing from the dataset, a sophisticated multiple imputation method is needed to estimate the missing values so that an unbiased and more reliable analysis can be achieved. In this study, three multiple imputation (MI) methods, data augmentation
(DA), multivariate imputations by chained equations (MICE) and AREG, are employed and compared.
Results show that AREG is the preferred MI approach. The reliability of the MI results is demonstrated using various techniques. This work will hopefully shed light on the choice of appropriate MI
methods for other similar research situations.
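As a rough illustration of the chained-equations idea behind MICE (not the DA or AREG procedures used in the thesis), each incomplete variable can be regressed on the others in turn until the imputed values stabilize:

```python
import numpy as np

def chained_impute(X, n_iter=10):
    """Chained-equations imputation sketch: each column with missing
    entries is regressed (least squares) on the other columns, and its
    missing cells are replaced by the fitted values. Real MICE draws
    from the predictive distribution and repeats the whole process to
    produce multiple completed datasets."""
    X = np.asarray(X, dtype=float).copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):              # initialize with column means
        X[miss[:, j], j] = col_means[j]
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            A = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            X[miss[:, j], j] = A[miss[:, j]] @ beta
    return X

rng = np.random.default_rng(1)
Z = rng.normal(size=(200, 3))
Z[:, 2] = Z[:, 0] + 0.5 * Z[:, 1] + 0.1 * rng.normal(size=200)
Z_obs = Z.copy()
hole = rng.random(200) < 0.2                 # ~20% missing in column 2
Z_obs[hole, 2] = np.nan
Z_imp = chained_impute(Z_obs)
```

Because column 2 is nearly a linear function of the others here, the regression-based imputations land close to the true values, which is exactly the situation where MI beats mean imputation.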
513. Modelling of maximal and submaximal oxygen uptake in men and women / Johnson, Patrick J., January 2002 (has links)
No description available.
514. New results in dimension reduction and model selection / Smith, Andrew Korb, 26 March 2008 (has links)
Dimension reduction is a vital tool in many areas of applied statistics in which the dimensionality of the predictors can be large. In such cases, many statistical methods will fail or yield unsatisfactory results. However, many data sets of high dimensionality actually contain a much simpler, low-dimensional structure. Classical methods such as principal components analysis are able to detect linear structures very effectively, but fail in the presence of nonlinear structures. In the first part of this thesis, we investigate the asymptotic behavior of two nonlinear dimensionality reduction algorithms, LTSA and HLLE. In particular, we show that both algorithms, under suitable conditions, asymptotically recover the true generating coordinates up to an isometry. We also discuss the relative merits of the two algorithms, and the effects of the underlying probability distributions of the coordinates on their performance.
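PCA's blindness to nonlinear structure is easy to demonstrate: data on a circle have a simple one-dimensional parametrization, yet no one-dimensional linear subspace captures it. A small numpy sketch (illustrative; LTSA and HLLE themselves are not implemented here):

```python
import numpy as np

# Points on the unit circle: intrinsically one-dimensional, parametrized
# by the angle t, but curved through both ambient coordinates.
rng = np.random.default_rng(2)
t = rng.uniform(0.0, 2.0 * np.pi, 500)
X = np.column_stack([np.cos(t), np.sin(t)])

Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
top_share = s[0] ** 2 / np.sum(s ** 2)
# The best 1-D linear projection explains only about half the variance;
# manifold methods such as LTSA instead aim to recover t up to isometry.
```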
Model selection is a fundamental problem in nearly all areas of applied statistics. In particular, a balance must be achieved between good in-sample performance and out-of-sample prediction. It is typically very easy to achieve good fit in the sample data, but empirically we often find that such models will generalize poorly. In the second part of the thesis, we propose a new procedure for the model selection problem which generalizes traditional methods. Our algorithm allows the combination of existing model selection criteria via a ranking procedure, leading to the creation of new criteria which are able to combine measures of in-sample fit and out-of-sample prediction performance into a single value. We then propose an algorithm which provably finds the optimal combination with a specified probability. We demonstrate through simulations that these new combined criteria can be substantially more powerful than any individual criterion.
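The combination-by-ranking idea can be sketched as follows. This is a toy version using a plain mean of AIC and BIC ranks, not the thesis's provably optimal combination procedure:

```python
import numpy as np

def combine_by_rank(scores):
    """Combine model-selection criteria via ranks. `scores` has shape
    (n_criteria, n_models) with lower values better; each criterion
    ranks the models (0 = best) and the combined score is the mean rank."""
    ranks = np.argsort(np.argsort(scores, axis=1), axis=1)
    return ranks.mean(axis=0)

# Toy example: choose a polynomial degree by combining AIC and BIC ranks.
rng = np.random.default_rng(3)
x = np.linspace(-1.0, 1.0, 80)
y = 1.0 + 2.0 * x - 1.5 * x ** 2 + 0.2 * rng.normal(size=x.size)

n = x.size
aic, bic = [], []
for deg in range(6):
    resid = y - np.polyval(np.polyfit(x, y, deg), x)
    ll = -0.5 * n * (np.log(2.0 * np.pi * np.mean(resid ** 2)) + 1.0)
    k = deg + 2                        # coefficients plus the noise variance
    aic.append(-2.0 * ll + 2.0 * k)
    bic.append(-2.0 * ll + np.log(n) * k)

combined = combine_by_rank(np.array([aic, bic]))
best_degree = int(np.argmin(combined))  # the true generating degree is 2
```

Ranking makes criteria on different scales directly comparable, which is what allows in-sample and out-of-sample measures to be merged into a single value.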
515. Extensions of principal components analysis / Brubaker, S. Charles, 29 June 2009 (has links)
Principal Components Analysis is a standard tool in data analysis, widely used in data-rich fields such as computer vision, data mining, bioinformatics, and econometrics. For a set of vectors in n dimensions and a natural number k less than n, the method returns a subspace of dimension k whose average squared distance to that set is as small as possible. Besides saving computation by reducing the dimension, projecting to this subspace can often reveal structure that was hidden in high dimension.
This thesis considers several novel extensions of PCA, which provably reveal hidden structure where standard PCA fails to do so. First, we consider Robust PCA, which prevents a few points, possibly corrupted by an adversary, from having a large effect on the analysis. When applied to learning noisy logconcave mixture models, the algorithm requires only slightly more separation between component means than is required for the noiseless case. Second, we consider Isotropic PCA, which can go beyond the first two moments in identifying ``interesting'' directions in data. The method leads to the first affine-invariant algorithm that can provably learn mixtures of Gaussians in high dimensions, improving significantly on known results. Third, we define the ``Subgraph Parity Tensor'' of order r of a graph and reduce the problem of finding planted cliques in random graphs to the problem of finding the top principal component of this tensor.
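The subspace that standard PCA returns is the span of the top right singular vectors of the centered data matrix. A brief sketch of the recovery property described above, for points near a low-dimensional plane:

```python
import numpy as np

def pca_subspace(X, k):
    """Orthonormal basis (rows) of the k-dimensional subspace minimizing
    the average squared distance to the rows of X, computed from the SVD
    of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

rng = np.random.default_rng(4)
# 300 points near a random 2-D plane in 5 dimensions
plane = np.linalg.qr(rng.normal(size=(5, 2)))[0].T   # 2 x 5, orthonormal rows
X = rng.normal(size=(300, 2)) @ plane + 0.01 * rng.normal(size=(300, 5))

V = pca_subspace(X, 2)
Xc = X - X.mean(axis=0)
mean_sq_resid = np.mean((Xc - Xc @ V.T @ V) ** 2)    # distance to the subspace
```

Robust PCA and Isotropic PCA modify this basic computation (by down-weighting outliers and by reweighting after isotropic rescaling, respectively); the sketch shows only the vanilla version they extend.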
516. Copulas for credit derivative pricing and other applications / Crane, Glenis Jayne, January 2009 (has links)
Copulas are multivariate probability distributions, as well as functions which link marginal distributions to their joint distribution. These functions have been used extensively in finance and more recently in other disciplines, for example hydrology and genetics. This study has two components: (a) the development of copula-based mathematical tools for use in all industries, and (b) the application of distorted copulas in structured finance. In the first part of this study, copula-based conditional expectation formulae are described and applied to small data sets from medicine and hydrology. In the second part of this study we develop a method of improving the estimation of default risk in the context of collateralized debt obligations. Credit risk is a particularly important application of copulas, and given the current global financial crisis, there is great motivation to improve the way these functions are applied. We compose distortion functions with copula functions in order to obtain greater flexibility and accuracy in existing pricing algorithms. We also describe an n-dimensional dynamic copula, which takes into account temporal and spatial changes. / Thesis (Ph.D.) - University of Adelaide, School of Mathematical Sciences, 2009
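A Gaussian-copula sketch of the linking idea (illustrative only; the thesis works with distorted copulas): correlated normals are pushed through the normal CDF to get dependent uniforms, then through arbitrary inverse marginal CDFs. The loss/delay pairing below is a hypothetical example, not data from the study:

```python
import numpy as np
from scipy import stats

def gaussian_copula_sample(n, rho, marginals, seed=0):
    """Draw n samples whose dependence comes from a bivariate Gaussian
    copula with correlation rho, and whose marginals are the given
    scipy frozen distributions."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    u = stats.norm.cdf(z)        # correlated uniforms: the copula itself
    return np.column_stack([m.ppf(u[:, i]) for i, m in enumerate(marginals)])

# Hypothetical pairing: exponential losses coupled with lognormal delays
sample = gaussian_copula_sample(
    5000, rho=0.7,
    marginals=[stats.expon(scale=2.0), stats.lognorm(s=0.5)],
)
```

The marginals can be swapped freely without touching the dependence structure, which is the separation that makes copulas attractive for pricing.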
517. Bayesian analysis for Cox's proportional hazard model with error effect and applications to accelerated life testing data / Rodríguez, Iván, January 2007 (has links)
Thesis (M.S.)--University of Texas at El Paso, 2007. / Title from title screen. Vita. CD-ROM. Includes bibliographical references. Also available online.
518. Data compression for inference tasks in wireless sensor networks / Chen, Mo., January 2006 (has links)
Thesis (Ph. D.)--State University of New York at Binghamton, Department of Electrical Engineering, Thomas J. Watson School of Engineering and Applied Science, 2006. / Includes bibliographical references.
519. Asymptotic methods for tests of homogeneity for finite mixture models / Stewart, Michael, January 2002 (has links)
Thesis (Ph. D.)--University of Sydney, 2002. / Title from title screen (viewed Apr. 28, 2008). Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy to the School of Mathematics and Statistics, Faculty of Science. Includes bibliography. Also available in print form.
520. Bayesian and maximum likelihood methods for some two-segment generalized linear models / Miyamoto, Kazutoshi; Seaman, John Weldon, January 2008 (has links)
Thesis (Ph.D.)--Baylor University, 2008. / Includes bibliographical references (p. 84-86).