Global ETD Search

1	Classification models for disease diagnosis and outcome analysis Wu, Tsung-Lin 12 July 2011 In this dissertation we study feature selection and classification problems and apply our methods to real-world medical and biological data sets for disease diagnosis. Classification is an important problem in disease diagnosis, where it is used to distinguish patients from the normal population. DAMIP (discriminant analysis -- mixed integer program) has been shown to be a good classification model: it handles multigroup problems directly, enforces misclassification limits, and provides a reserved-judgment region. However, DAMIP is NP-hard and presents computational challenges. Feature selection is important in classification to improve prediction performance, prevent over-fitting, or facilitate data understanding. However, this combinatorial problem becomes intractable when the number of features is large. In this dissertation, we propose a modified particle swarm optimization (PSO), a heuristic method, to solve the feature selection problem, and we study its parameter selection in our applications. We derive theories and exact algorithms to solve the two-group DAMIP in polynomial time. We also propose a heuristic algorithm to solve the multigroup DAMIP. Computational studies on simulated data and data from the UCI Machine Learning Repository show that the proposed algorithm performs very well. The polynomial solution time of the heuristic method allows us to solve DAMIP repeatedly within the feature selection procedure. We apply the PSO/DAMIP classification framework to several real-life medical and biological prediction problems.
(1) Alzheimer's disease: We use data from several neuropsychological tests to discriminate among subjects with Alzheimer's disease, subjects with mild cognitive impairment, and control groups.
(2) Cardiovascular disease: We use traditional risk factors and novel oxidative stress biomarkers to predict subjects who are at high or low risk of cardiovascular disease, where risk is measured by the thickness of the carotid intima-media and/or the flow-mediated vasodilation.
(3) Sulfur amino acid (SAA) intake: We use 1H NMR spectral data of human plasma to classify plasma samples obtained with low SAA intake or high SAA intake. This shows that our method is useful for metabolomics studies.
(4) CpG islands for lung cancer: We identify a large number of sequence patterns (on the order of millions), search for candidate patterns in DNA sequences in CpG islands, and look for patterns that can discriminate methylation-prone and methylation-resistant (or, in addition, methylation-sporadic) sequences, which relate to early lung cancer prediction.
Biomedical informatics applications Medical sciences Statistical methods Discriminant analysis Data mining Bioinformatics Medical informatics
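The idea of wrapping a classifier inside a PSO-driven feature search can be illustrated with a minimal sketch. It is not the dissertation's modified PSO or DAMIP: it assumes the standard binary PSO update (sigmoid of the velocity used as the probability of selecting a feature), uses a leave-one-out nearest-centroid classifier as a cheap stand-in for DAMIP, and all function names, parameter values, and the toy data are hypothetical.

```python
import math
import random


def loo_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for DAMIP)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # no features selected: treat as worst case
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def dist(label):
            # squared distance from sample i to the class centroid
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


def binary_pso_select(X, y, n_particles=10, n_iters=20,
                      w=0.7, c1=1.5, c2=1.5, penalty=0.01, v_max=4.0, seed=0):
    """Standard binary PSO over 0/1 feature masks (assumed variant, not the thesis's)."""
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(mask):
        # accuracy minus a small penalty per selected feature
        return loo_centroid_accuracy(X, y, mask) - penalty * sum(mask)

    pos = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(n_particles)]
    vel = [[0.0] * n_feat for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_feat):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))  # clamp velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
            if f > gbest_fit:
                gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Toy data: feature 0 separates the two classes, features 1-3 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [[10.0 * label + data_rng.random()] + [data_rng.random() for _ in range(3)]
     for label in y]
mask, score = binary_pso_select(X, y)
print(mask, score)
```

Because the fitness function is called once per particle per iteration, the inner classifier has to be cheap, which is the role the abstract assigns to the polynomial-time DAMIP heuristic.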
2	Joint models for longitudinal and survival data Yang, Lili 11 July 2014 Indiana University-Purdue University Indianapolis (IUPUI) / Epidemiologic and clinical studies routinely collect longitudinal measures of multiple outcomes. These longitudinal outcomes can be used to establish the temporal order of relevant biological processes and their association with the onset of clinical symptoms. In the first part of this thesis, we propose bivariate change point models for two longitudinal outcomes, with a focus on estimating the correlation between the two change points. We adopt a Bayesian approach for parameter estimation and inference. In the second part, we consider the situation in which a time-to-event outcome is collected along with multiple longitudinal biomarkers measured until the occurrence of the event or censoring. Joint models for longitudinal and time-to-event data can be used to estimate the association between the characteristics of the longitudinal measures over time and survival time. We develop a maximum-likelihood method to jointly model multiple longitudinal biomarkers and a time-to-event outcome. In addition, we focus on predicting conditional survival probabilities and evaluating the predictive accuracy of multiple longitudinal biomarkers in the joint modeling framework. We assess the performance of the proposed methods in simulation studies and apply the new methods to data sets from two cohort studies. / National Institutes of Health (NIH) Grants R01 AG019181, R24 MH080827, P30 AG10133, R01 AG09956.
joint models longitudinal data survival data bivariate change point models prediction Bayesian method EM algorithm Biologically-inspired computing Probability measures Expectation-maximization algorithms Failure time data analysis Numerical analysis -- Data processing Clinical trials -- Statistical methods
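For orientation, joint models of multiple longitudinal biomarkers and a time-to-event outcome are often written in a shared-random-effects form. The block below is a generic textbook-style formulation under that assumption, not necessarily the specification used in the thesis; every symbol in it is introduced here for illustration.

```latex
% Longitudinal submodel for biomarker k = 1, ..., K of subject i
% (assumed linear mixed model; m_{ik}(t) is the error-free trajectory)
y_{ik}(t) = m_{ik}(t) + \varepsilon_{ik}(t), \qquad
m_{ik}(t) = x_{ik}(t)^{\top} \beta_k + z_{ik}(t)^{\top} b_{ik}, \qquad
\varepsilon_{ik}(t) \sim N(0, \sigma_k^2)

% Survival submodel: proportional hazards linked to the current
% trajectory values through association parameters \alpha_k
h_i(t) = h_0(t) \exp\!\Big( \gamma^{\top} w_i + \sum_{k=1}^{K} \alpha_k \, m_{ik}(t) \Big)

% Conditional survival probability given survival to time t and the
% biomarker history \mathcal{Y}_i(t) observed up to t
\pi_i(u \mid t) = \Pr\big( T_i \ge u \mid T_i > t, \; \mathcal{Y}_i(t) \big), \qquad u > t
```

Under this formulation, each $\alpha_k$ quantifies how strongly the $k$-th biomarker trajectory is associated with the hazard, and $\pi_i(u \mid t)$ is the kind of conditional survival probability the abstract refers to.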
