This dissertation focuses on two topics in multivariate statistics. The first part develops an inference procedure and a fast computational tool for the modal clustering method proposed by Li et al. (2007). Modal clustering, which is based on a kernel density estimate, groups observations according to the mode with which they are associated, so that the final number of clusters equals the number of modes, otherwise known as the modality, of the distribution of the data. This method provides a flexible tool for clustering data of low to moderate dimension with arbitrary distributional shapes. We extend the method of Li and colleagues by proposing a procedure that determines the number of clusters in the data. A test statistic and its asymptotic distribution are derived to assess the significance of each mode within the data. The inference procedure is tested on both simulated and real data sets. In addition, an R package, Modalclust, is developed that implements the modal clustering procedure using parallel processing, which dramatically increases computing speed over the previously available method. This package is available on the Comprehensive R Archive Network (CRAN).
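The mode-association idea described above can be illustrated with a minimal Python sketch. It assumes a Gaussian kernel with a single fixed bandwidth and uses a fixed-point (mean-shift-style) update to climb from each observation to a local maximum of the kernel density estimate. The function names `climb_to_mode` and `modal_clusters`, the parameters `bandwidth` and `merge_tol`, and the merging rule are all hypothetical choices made for illustration; this is not the Modalclust package interface, and it does not include the significance test for modes.

```python
import math


def climb_to_mode(start, data, bandwidth, tol=1e-8, max_iter=1000):
    """Hill-climb from `start` to a local maximum of a Gaussian KDE."""
    x = list(start)
    for _ in range(max_iter):
        # Gaussian kernel weight of every observation relative to x.
        weights = [
            math.exp(-0.5 * sum((p - q) ** 2 for p, q in zip(point, x))
                     / bandwidth ** 2)
            for point in data
        ]
        total = sum(weights)
        # Fixed-point update: move to the kernel-weighted mean of the data.
        x_new = [
            sum(w * point[d] for w, point in zip(weights, data)) / total
            for d in range(len(x))
        ]
        if math.dist(x, x_new) < tol:
            return x_new
        x = x_new
    return x


def modal_clusters(data, bandwidth, merge_tol=1e-3):
    """Label each observation by the mode its hill-climb converges to."""
    modes, labels = [], []
    for point in data:
        mode = climb_to_mode(point, data, bandwidth)
        for k, known in enumerate(modes):
            # Assumption: climbs ending within merge_tol share one mode.
            if math.dist(mode, known) < merge_tol:
                labels.append(k)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return labels, modes
```

Calling `modal_clusters(data, bandwidth=0.5)` on a list of coordinate tuples returns one label per observation and the list of distinct modes found, so the number of clusters is `len(modes)`. The result depends strongly on the bandwidth, which is why a formal assessment of each mode's significance, as developed in the dissertation, is needed.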
The second part of this dissertation develops methods for the statistical monitoring of clinical trials with multiple co-primary endpoints, where success is defined as meeting both endpoints simultaneously. In practice, a group sequential design is used to stop trials early for promising efficacy, and conditional power is used for futility stopping rules. In this dissertation we show that the stopping boundaries for a group sequential design with multiple co-primary endpoints should be the same as those for studies with a single endpoint. Lan and Wittes (1988) proposed the B-value tool to calculate the conditional power of single-endpoint trials, and we extend this tool to calculate the conditional power for studies with multiple co-primary endpoints. We consider two-arm studies with co-primary normal and binary endpoints and provide several examples of implementation with simulated trials. A fixed-weight sample size re-estimation approach based on conditional power is introduced. Finally, we discuss the possibility of blinded interim analyses for multiple endpoints using the modality inference method introduced in the first part.
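For orientation, the single-endpoint conditional power calculation that serves as the starting point can be sketched in Python as follows. With interim statistic Z(t) at information fraction t, the B-value is B(t) = Z(t)·sqrt(t), and given B(t) the final value B(1) is normal with mean B(t) + θ(1 − t) and variance 1 − t, where θ is the drift. The function name `conditional_power`, its parameters, and the use of a fixed one-sided critical value at the final analysis (with no group sequential boundary adjustment) are assumptions made for illustration. The sketch covers the single-endpoint case only and does not reproduce the co-primary extension developed in the dissertation.

```python
from statistics import NormalDist


def conditional_power(z_interim, info_fraction, drift=None, alpha=0.025):
    """Conditional power for one endpoint using the B-value formulation.

    z_interim     -- interim test statistic Z(t)
    info_fraction -- information fraction t, with 0 < t < 1
    drift         -- assumed drift theta; None means "current trend"
    alpha         -- one-sided level of the final test (assumed unadjusted)
    """
    std_normal = NormalDist()
    t = info_fraction
    b = z_interim * t ** 0.5          # B-value: B(t) = Z(t) * sqrt(t)
    if drift is None:
        drift = b / t                 # current-trend estimate of theta
    z_crit = std_normal.inv_cdf(1 - alpha)
    # B(1) | B(t) ~ N(b + drift * (1 - t), 1 - t)
    shortfall = (z_crit - b - drift * (1 - t)) / (1 - t) ** 0.5
    return 1 - std_normal.cdf(shortfall)
```

For example, `conditional_power(1.0, 0.5)` gives the conditional power under the current trend halfway through the trial, while passing `drift=0.0` evaluates it under the null hypothesis. A futility rule would stop the trial when this value falls below a prespecified threshold.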
Identifier | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/14285 |
Date | 22 January 2016 |
Creators | Cheng, Yansong |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |