
Bayesian variable selection in clustering via Dirichlet process mixture models

The increased collection of high-dimensional data in various fields has raised strong
interest in clustering algorithms and variable selection procedures. In this dissertation,
I propose a model-based method that addresses the two problems simultaneously.
I use Dirichlet process mixture models to define the cluster structure and
introduce into the model a latent binary vector that identifies discriminating variables. I
update the variable selection index using a Metropolis algorithm and obtain inference
on the cluster structure via a split-merge Markov chain Monte Carlo technique. I
evaluate the method on simulated data and illustrate an application with a DNA
microarray study. I also show that the methodology can be adapted to the problem
of clustering functional high-dimensional data. There I employ wavelet thresholding
methods in order to reduce the dimension of the data and to remove noise from the
observed curves. I then apply variable selection and sample clustering methods in the
wavelet domain. Thus my methodology is wavelet-based and aims at clustering the
curves while identifying wavelet coefficients describing discriminating local features.
I exemplify the method on high-dimensional and high-frequency tidal volume traces
measured under an induced panic attack model in normal humans.
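
The Metropolis update of the latent binary vector mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the cluster labels `z` are held fixed (the abstract updates them with a split-merge sampler), each variable is modelled independently as normal with known variance `s2` and a N(0, `t2`) prior on the mean, the prior inclusion probability is `w`, and the proposal flips a single randomly chosen coordinate. All function and parameter names are hypothetical.

```python
import numpy as np

def log_marginal(x, s2=1.0, t2=10.0):
    """Log marginal density of x_i ~ N(mu, s2) with mu ~ N(0, t2) integrated out."""
    m = x.size
    if m == 0:
        return 0.0
    quad = np.sum(x ** 2) - t2 * np.sum(x) ** 2 / (s2 + m * t2)
    return (-0.5 * m * np.log(2 * np.pi * s2)
            - 0.5 * np.log1p(m * t2 / s2)
            - 0.5 * quad / s2)

def variable_log_score(xj, z, selected):
    """Selected variables get one mean per cluster; the others share a single mean."""
    if selected:
        return sum(log_marginal(xj[z == k]) for k in np.unique(z))
    return log_marginal(xj)

def metropolis_gamma(X, z, n_iter=2000, w=0.1, seed=0):
    """Return the fraction of iterations in which each variable was selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    gamma = np.zeros(p, dtype=bool)   # latent binary selection vector (assumed start: none selected)
    counts = np.zeros(p)
    prior_log_odds = np.log(w) - np.log1p(-w)
    for _ in range(n_iter):
        j = rng.integers(p)           # symmetric proposal: flip one coordinate
        new = not gamma[j]
        log_ratio = (variable_log_score(X[:, j], z, new)
                     - variable_log_score(X[:, j], z, gamma[j]))
        log_ratio += prior_log_odds if new else -prior_log_odds
        if np.log(rng.uniform()) < log_ratio:   # Metropolis acceptance step
            gamma[j] = new
        counts += gamma
    return counts / n_iter
```

Calling `metropolis_gamma(X, z)` on an `n x p` data matrix `X` with integer cluster labels `z` returns an estimate of each variable's posterior inclusion frequency under these simplified assumptions. In a full sampler this step would alternate with updates of the cluster allocation.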

Identifier: oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/5888
Date: 17 September 2007
Creators: Kim, Sinae
Contributors: Vannucci, Marina
Publisher: Texas A&M University
Source Sets: Texas A and M University
Language: en_US
Detected Language: English
Type: Book, Thesis, Electronic Dissertation, text
Format: 2747270 bytes, electronic, application/pdf, born digital
