31

Validation measures for prognostic models for independent and correlated binary and survival outcomes

Rahman, M. S. January 2012 (has links)
Prognostic models are developed to guide the clinical management of patients or to assess the performance of health institutions. It is essential that the performance of these models is evaluated using appropriate validation measures. Despite the proposal of several validation measures for survival outcomes, it is still unclear which measures should be generally used in practice. In this thesis, a simulation study was performed to investigate a range of validation measures for survival outcomes in order to make practical recommendations regarding their use. Measures were evaluated with respect to their robustness to censoring and their sensitivity to the omission of important predictors. Based on the simulation results, among the discrimination measures, Gonen and Heller's K statistic can be recommended for validating a survival risk model developed using the Cox proportional hazards model, since it is both robust to censoring and reasonably sensitive to predictor omission. Royston and Sauerbrei's D statistic can be recommended provided that the distribution of the prognostic index is approximately normal. Harrell's C-index was affected by censoring and cannot be recommended for use with data with more than 30% censoring. The calibration slope can be recommended as a measure of calibration since it is not affected by censoring. The measures of predictive accuracy and explained variation (Graf et al.'s integrated Brier score and its R-squared version, and Schemper and Henderson's V) cannot be recommended due to their poor performance in the presence of censored data. In multicentre studies patients are typically clustered within centres and are likely to be correlated. Typically, random effects logistic and frailty models are fitted to clustered binary and survival outcomes, respectively. However, limited work has been done to assess the predictive ability of these models.
This research extended existing validation measures for independent data, such as the C-index, D statistic, calibration slope, Brier score, and the K statistic, for use with random effects/frailty models. Two approaches are proposed: the 'overall' and the 'pooled cluster-specific'. The 'overall' approach incorporates comparisons of subjects both within- and between-clusters. The 'pooled cluster-specific' measures are obtained by pooling the cluster-specific estimates based on comparisons of subjects within each cluster; the pooling is achieved using a random effects summary statistics method. Each approach can produce three different values for the validation measures, depending on the type of prediction: conditional predictions using the estimates of the random effects, conditional predictions setting the random effects to zero, and marginal predictions obtained by integrating out the random effects. Their performance was investigated using simulation studies. The 'overall' measures based on the conditional predictions including the random effects performed reasonably well in a range of scenarios and are recommended for validating models when using subjects from the same clusters as the development data. The measures based on the marginal predictions and on the conditional predictions that set the random effects to zero were biased when the intra-cluster correlation was moderate to high, and can be used for subjects in new clusters when the intra-cluster correlation coefficient is less than 0.05. The 'pooled cluster-specific' measures performed well when the clusters had a reasonable number of events. Generally, both the 'overall' and 'pooled' measures are recommended for use in practice. In choosing a validation measure, the following characteristics of the validation data should be investigated: the level of censoring (for survival outcomes), the distribution of the prognostic index, whether the clusters are the same as or different from those in the development data, the level of clustering, and the cluster size.
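As background to the abstract's point about Harrell's C-index and censoring, a minimal sketch of the measure for right-censored data might look as follows (all names and the toy data are our own illustration, not material from the thesis; note that a pair is usable only when the earlier of the two times is an observed event, which is why heavy censoring degrades the measure):

```python
import numpy as np

def harrell_c_index(time, event, risk):
    """Harrell's C-index for right-censored data: the fraction of usable
    pairs in which the subject with the shorter survival time also has
    the higher risk score. A pair is usable only if the earlier of the
    two times is an observed event (event == 1), not a censoring time."""
    time, event, risk = map(np.asarray, (time, event, risk))
    concordant, usable = 0.0, 0
    n = len(time)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = (i, j) if time[i] < time[j] else (j, i)  # a = earlier time
            if time[a] == time[b] or not event[a]:
                continue  # tied times or censored earlier time: not usable
            usable += 1
            if risk[a] > risk[b]:
                concordant += 1.0
            elif risk[a] == risk[b]:
                concordant += 0.5
    return concordant / usable if usable else float("nan")

# toy data: risk ordering perfectly reverses survival time, no censoring
print(harrell_c_index([2.0, 4.0, 6.0, 8.0], [1, 1, 1, 1],
                      [3.9, 2.7, 1.5, 0.3]))  # 1.0
```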
32

Methods for irregularly sampled continuous time processes

Li, Z. January 2014 (has links)
This thesis will consider methods associated with irregularly spaced sampling of a real-valued continuous time stationary process. The problems of Monte Carlo simulation and of parametric estimation under irregularly spaced sampling times will be discussed. For the simulation problem, the focus will be on the spectral simulation method. A novel algorithm is proposed for the determination of the spectral simulation scheme, which is optimal in the sense of achieving the required accuracy with minimal computational cost. For the estimation problem, we will adopt the framework of stochastic sampling times, in which the irregularity of the sampling times is modelled through a renewal point process over the real line. By constructing a second-order discrete time stationary process from the samples, a parametric estimation method based on the well-known Whittle log-likelihood function will be proposed. Asymptotic consistency of the resulting estimator will be proved by borrowing existing results from the literature on renewal theory. Moreover, the performance of the proposed estimation procedure will be investigated further. It will be shown that by calculating the spectral density of the sampled discrete time process through a Discrete Fourier Transform (DFT) approximation, the Whittle log-likelihood function can indeed be evaluated relatively efficiently. This estimation method, however, induces information loss, which will be shown to be related to the unique properties of the renewal kernel function. Although an accurate analysis of the renewal kernel function is not easy, it is still possible to provide some insight into the determining factors of the information loss through asymptotic calculations.
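As background, the Whittle log-likelihood for a regularly sampled discrete-time process compares the periodogram with a model spectral density at the Fourier frequencies; the thesis extends this idea to renewal-sampled processes. A minimal sketch of the regular-sampling case, with all names our own and an AR(1) model used purely as an illustration:

```python
import numpy as np

def whittle_loglik(x, spec_fn):
    """Approximate Whittle log-likelihood: compare the periodogram I(w_k)
    at the Fourier frequencies with a model spectral density f(w_k)."""
    n = len(x)
    dft = np.fft.rfft(x - np.mean(x))
    freqs = 2 * np.pi * np.arange(len(dft)) / n
    I = np.abs(dft) ** 2 / (2 * np.pi * n)   # periodogram
    f = spec_fn(freqs[1:])                   # drop the zero-frequency term
    return -np.sum(np.log(f) + I[1:] / f)

def ar1_spectrum(phi, sigma2):
    """Spectral density of a discrete-time AR(1) process."""
    return lambda w: sigma2 / (2 * np.pi * np.abs(1 - phi * np.exp(-1j * w)) ** 2)

# grid-search the Whittle likelihood on a simulated AR(1), phi = 0.6
rng = np.random.default_rng(0)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()
grid = np.linspace(0.05, 0.95, 19)
phi_hat = grid[np.argmax([whittle_loglik(x, ar1_spectrum(p, 1.0)) for p in grid])]
print(round(phi_hat, 2))
```

The grid search recovers an estimate close to the true phi = 0.6; in practice one would use a proper optimiser rather than a grid.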
33

A pattern-clustering method for longitudinal data : heroin users receiving methadone

Lin, C. January 2014 (has links)
Methadone is used as a substitute for heroin, and there may be certain groups of users according to methadone dosage. In this work we analyse data for 314 participants of a methadone study over 180 days. The data, called category-ordered data throughout this study, consist of seven categories: six categories on an ordinal scale representing dosages, and one category for missing dosages. We develop a clustering method involving the so-called p-dissimilarity, a modification of Prediction Strength (PS), a null model test, and two ordering algorithms. (1) The p-dissimilarity is used to measure dissimilarity between the 180-day time series of the participants. It accommodates categorical and ordinal scales by using a parameter p as a switch between data being treated as categorical and as ordinal, and it measures dissimilarity between observed and missing dosages by using a parameter β. It could also be applied in a wider range of applications, such as survey studies in which questions use Likert-scale choices together with a 'don't know' category. (2) The PS determines the number of clusters by measuring the stability of clusters, and the Average Silhouette Width (ASW) measures coherence. We propose rules to modify PS so that it can be fully applied to hierarchical clustering methods. Next, instead of preselecting a clustering method, we let the data decide which clustering method to use, based on cluster stability and cluster coherence; the partitioning around medoids (PAM) method is then selected. (3) We propose the null model test to determine the number of clusters (k). Many methods for determining the number of clusters compute a value for each k ≥ 2 based on cluster compactness and separation, and suggest using the k with the highest value.
Viewing this question from a different perspective, for a fixed k and a selected clustering method, the null model test uses a null model and a parametric bootstrap to explore the distribution of a statistic under the null assumption; a hypothesis test for each k can then be performed. For our data, we construct a Markov null model without cluster structure, in which the distributions of the categories are the same as those of the real data. We apply the null model test to investigate whether the clusters found according to PAM and ASW/PS can be explained by random variation. (4) We use heatplots to evaluate the quality of clustering. A heatplot is a graph that represents data by colour; it consists of horizontal lines representing the data for objects. However, the interpretability of a heatplot strongly depends on the location of the objects along the vertical axis. We propose two algorithms to locate objects on a heatplot: the first, using multidimensional scaling (MDS), is for general use; the second, using a projection vector, is for the PAM method. The resulting heatplot can then be used for information visualisation: it displays clustering structures, relationships between objects and clusters in terms of their dissimilarities, locations of medoids, and the density of clusters. Despite the fact that no significant clustering structure is observed, the sequences of categories for the clusters are clinically useful: they indicate detoxification. Our data show that participants with low heroin addiction attempted to reduce or quit the use of methadone in the third month. As for participants with high addiction, a few attempted to reduce the use of methadone in the fifth month, and most required more time to finish the detoxification process. We also find that the heroin onset age might have an influence on the patterns of detoxification.
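The exact definition of the p-dissimilarity is given in the thesis; a hypothetical sketch of a dissimilarity with the behaviour described above, with a parameter p switching between categorical and ordinal treatment and a parameter β pricing comparisons against the missing category, might look like:

```python
import numpy as np

MISSING = -1  # code for the 'missing dosage' category

def p_dissimilarity(x, y, p=1.0, beta=0.5, n_levels=6):
    """Hypothetical sketch (not the thesis' exact definition) of a
    dissimilarity between two category-ordered time series.
    p = 1 weights the ordinal gap between dosage levels linearly;
    p close to 0 turns every mismatch into a 0/1 categorical disagreement.
    beta is the cost of comparing an observed level with a missing one."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    d = 0.0
    for a, b in zip(x, y):
        if a == MISSING and b == MISSING:
            continue                            # two missing values: no penalty
        elif a == MISSING or b == MISSING:
            d += beta                           # observed vs missing
        elif a != b:
            gap = abs(a - b) / (n_levels - 1)   # normalised ordinal gap
            d += gap ** p                       # p -> 0 gives gap**p -> 1
    return d / len(x)

s1 = [0, 1, 2, 3, MISSING, 5]
s2 = [0, 1, 2, 4, 4, 5]
print(p_dissimilarity(s1, s2, p=1.0))    # ordinal treatment
print(p_dissimilarity(s1, s2, p=1e-9))   # nearly categorical treatment
```

The resulting dissimilarity matrix could then be fed to any dissimilarity-based clusterer such as PAM.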
34

Phases at complex temperature : spiral correlation functions and regions of Fisher zeros for Ising models

Beichert, Felicitas January 2013 (has links)
In this thesis we analyse the Fisher zeros for various Ising models. We show that there is long-range spiral order on the contours of zeros for classical one-dimensional Ising chains and that there is an imaginary “latent heat” associated with crossing those contours. Then we can see how areas of Fisher zeros fill in as we turn the Ising chain into Ising ladders of different widths, which seems to contradict the standard analysis presented in the literature and can be attributed to a different approach to the thermodynamic limit. Finally, we report results on frustrated two-dimensional classical Ising lattices, in particular the triangular and the kagomé lattice, and the quantum Ising problem.
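For the one-dimensional case discussed above, the Fisher zeros of a periodic Ising chain can be located explicitly from the transfer-matrix eigenvalues: the partition function vanishes where tanh(K)^N = -1, so the zeros lie on the unit circle in the tanh(K) plane. A quick numerical check (our own sketch, not code from the thesis):

```python
import numpy as np

def ising_chain_Z(K, N):
    """Partition function of a periodic 1D Ising chain with N spins and
    reduced coupling K = J/(k_B T), via the transfer-matrix eigenvalues
    lambda_+ = 2 cosh K and lambda_- = 2 sinh K."""
    lam_plus, lam_minus = 2 * np.cosh(K), 2 * np.sinh(K)
    return lam_plus ** N + lam_minus ** N

# Z = (2 cosh K)^N * (1 + tanh(K)^N) vanishes when tanh(K)^N = -1,
# i.e. tanh(K) = exp(i*pi*(2k+1)/N): zeros on the unit circle in tanh(K).
N = 8
t_zeros = np.exp(1j * np.pi * (2 * np.arange(N) + 1) / N)
K_zeros = np.arctanh(t_zeros)  # map back to complex coupling/temperature
print(np.allclose([ising_chain_Z(K, N) for K in K_zeros], 0, atol=1e-9))
```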
35

Occupancy modelling : study design and models for data collected along transects

Guillera-Arroita, Gurutzeta January 2012 (has links)
Occupancy, defined as the proportion of sites occupied by a species, is a state variable of interest in ecology and conservation. When modelling species occupancy it is crucial to account for the detection process, as most species can remain undetected at sites where they are present. This is usually achieved by carrying out separate repeat visits to each sampling site, but other methods are sometimes used, such as surveying spatial sub-units within each sampling site, or even collecting detection data continuously along a transect during a single visit. This thesis deals with two aspects of occupancy modelling: (i) we explore issues related to the design of occupancy studies, including the trade-off in survey effort allocation between sites and repeat visits, sample size determination and the impact of sampling with replacement in studies based on spatial replication; and (ii) we develop and evaluate new models to estimate occupancy from species detection data collected along transects, motivated by the analysis of a data set from a Sumatran tiger Panthera tigris sumatrae survey which followed this type of sampling protocol. The models we propose, which describe the detection process as a continuous point process, can account for clustering and/or abundance-induced heterogeneity in the detection process, and represent a step forward with respect to current modelling approaches, which involve data discretisation and two-stage ad hoc procedures.
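The models in the thesis are richer (they handle clustering and abundance-induced heterogeneity), but the core idea of treating detections along a transect as a continuous point process can be sketched with a simple occupancy model in which detections at an occupied site follow a homogeneous Poisson process; all names and parameter values below are our own illustration:

```python
import numpy as np

def occupancy_loglik(psi, lam, counts, L):
    """Log-likelihood for a simple continuous-detection occupancy model:
    each site is occupied with probability psi; at an occupied site,
    detections along a transect of length L form a Poisson process with
    rate lam, so a count of zero can mean 'absent' or 'present but missed'."""
    ll = 0.0
    for n in np.asarray(counts):
        if n == 0:
            # absent, or present but undetected over the whole transect
            ll += np.log((1 - psi) + psi * np.exp(-lam * L))
        else:
            # present, with n detections (Poisson log-pmf, log(n!) via sum)
            ll += (np.log(psi) + n * np.log(lam * L) - lam * L
                   - np.sum(np.log(np.arange(1, n + 1))))
    return ll

# grid-search MLE of psi on simulated data (detection rate assumed known)
rng = np.random.default_rng(1)
L, psi_true, lam_true = 10.0, 0.6, 0.3
occ = rng.random(500) < psi_true
counts = np.where(occ, rng.poisson(lam_true * L, 500), 0)
psis = np.linspace(0.05, 0.95, 19)
psi_hat = psis[np.argmax([occupancy_loglik(p, lam_true, counts, L) for p in psis])]
print(round(psi_hat, 2))
```

With lam*L = 3 the probability of missing an occupied site is about exp(-3), so the estimate of psi is close to the true 0.6; a naive estimate that treated zero counts as true absences would be biased low.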
36

Algorithms for linear model estimation on massively parallel systems

Kontoghiorghes, Erricos John January 1993 (has links)
No description available.
37

The initial conditions problem in dynamic panel data models with random effects

Kazemi, Iraj January 2005 (has links)
No description available.
38

Statistical methods for clusters of extreme values

Ferro, Christopher A. T. January 2003 (has links)
No description available.
39

Estimating meta-analysis parameters in non-standard data

Nik Idris, Nik Ruzni January 2006 (has links)
No description available.
40

The statistics of dynamic networks

Clegg, Richard G. January 2004 (has links)
No description available.
