Clustering Rare Event Features to Increase Statistical Power

Rare genetic variation has been put forward as a major contributor to the development of disease; however, it is inherently difficult to associate rare variants with disease, because the low number of observations greatly reduces statistical power. Binning groups several variants and merges them into a single feature, sacrificing resolution to increase statistical power. Binning strategies are applicable to rare variant analysis in any field, though their effectiveness depends on the method used to group variants. This thesis presents a flexible workflow for rare variant analysis, composed of five sequential steps: identifying rare variants, annotating those variants, clustering the variants, collapsing the clusters, and performing statistical analysis. There are no restrictions on which clustering algorithms are applied, so a review of the core clustering paradigms is provided as an introduction for readers unfamiliar with the field. Also presented is RVCLUST, an R package that facilitates all stages of the described workflow and provides a collection of interfaces to common clustering algorithms and statistical tests. The utility of RVCLUST is demonstrated in a genetic analysis of rare variants in gene regulatory regions and their effect on gene expression. The results of this analysis suggest that informed clustering is an effective alternative to existing strategies, discovering the same associations while avoiding the statistical complications introduced by other binning methods.
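
The identify, cluster, and collapse steps of the workflow can be illustrated with a minimal sketch. It is written in Python for brevity and is not the RVCLUST API: every function name, variant name, threshold, and genotype value below is a hypothetical chosen for illustration. It assumes genotypes are coded as 0, 1, or 2 copies of the alternate allele, that the alternate allele is the minor allele, and that variants are grouped by a simple positional gap rule, which is only one of many clustering choices the workflow allows. Annotation and statistical testing are omitted.

```python
from typing import Dict, List


def alt_allele_frequency(genotypes: List[int]) -> float:
    """Alternate-allele frequency for genotypes coded 0/1/2 per sample."""
    return sum(genotypes) / (2 * len(genotypes))


def identify_rare(variants: Dict[str, List[int]], max_freq: float) -> List[str]:
    """Keep variants observed at least once with frequency at or below max_freq."""
    return [
        name
        for name, genotypes in variants.items()
        if 0 < alt_allele_frequency(genotypes) <= max_freq
    ]


def cluster_by_gap(
    positions: Dict[str, int], names: List[str], max_gap: int
) -> List[List[str]]:
    """Group position-sorted variants, starting a new cluster whenever the
    distance to the previous variant exceeds max_gap."""
    ordered = sorted(names, key=lambda name: positions[name])
    clusters: List[List[str]] = []
    for name in ordered:
        if clusters and positions[name] - positions[clusters[-1][-1]] <= max_gap:
            clusters[-1].append(name)
        else:
            clusters.append([name])
    return clusters


def collapse(variants: Dict[str, List[int]], cluster: List[str]) -> List[int]:
    """Merge a cluster into one feature: 1 if the sample carries any
    alternate allele in the cluster, otherwise 0."""
    n_samples = len(variants[cluster[0]])
    return [
        int(any(variants[name][i] > 0 for name in cluster))
        for i in range(n_samples)
    ]


# Hypothetical toy data: 8 samples, 5 variants.
variants = {
    "var_a": [0, 1, 0, 0, 0, 0, 0, 0],
    "var_b": [0, 0, 1, 0, 0, 0, 0, 0],
    "var_c": [1, 1, 0, 1, 2, 1, 0, 1],  # common, filtered out
    "var_d": [0, 0, 0, 0, 0, 0, 1, 0],
    "var_e": [0, 0, 0, 0, 0, 0, 0, 0],  # never observed, filtered out
}
positions = {"var_a": 100, "var_b": 150, "var_c": 160, "var_d": 900, "var_e": 950}

rare = identify_rare(variants, max_freq=0.1)
clusters = cluster_by_gap(positions, rare, max_gap=100)
features = [collapse(variants, cluster) for cluster in clusters]

for cluster, feature in zip(clusters, features):
    print(cluster, feature)
```

In this toy data each rare variant has a single carrier, while the collapsed feature for the first cluster has two. That pooling of observations is the trade of resolution for statistical power that binning makes.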

Identifier: oai:union.ndltd.org:VANDERBILT/oai:VANDERBILTETD:etd-04082013-143931
Date: 12 April 2013
Creators: Sivley, Robert Michael
Contributors: Dr. Tricia A. Thornton-Wells, Dr. William S. Bush, Dr. Douglas H. Fisher
Publisher: VANDERBILT
Source Sets: Vanderbilt University Theses
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.library.vanderbilt.edu/available/etd-04082013-143931/
Rights: unrestricted. I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to Vanderbilt University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.