The purpose of this thesis is to build a classification algorithm using data from a Genome Wide Association (GWA) study. Briefly, a GWA study is a case-control study that uses genotypes derived from DNA microarrays for thousands of people. Each microarray measures the genotypes of hundreds of thousands of Single Nucleotide Polymorphisms (SNPs) for one person at a time. In this thesis, we first describe the processes necessary to prepare the data for analysis. Next, we introduce the Naive Bayes classification algorithm and a modification in which the effect of each SNP on the disease of interest is weighted by a Bayesian posterior probability of association. This thesis then uses the data from three coronary artery disease GWA studies, one as a training set and two as test sets, to build and test the classifier. Finally, this thesis discusses the relevance of the results and the generalizability of this method to future studies.
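The weighted Naive Bayes idea summarized above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the 0/1/2 genotype coding, the Laplace smoothing constant `alpha`, and the choice to apply each weight as a multiplier on the SNP's log-likelihood ratio are all assumptions made for illustration. The `weights` list stands in for per-SNP posterior probabilities of association, which would have to be estimated separately.

```python
import math

def train(genotypes, labels, n_snps, alpha=1.0):
    """Estimate per-class genotype frequencies for each SNP.

    genotypes: list of rows, each a list of 0/1/2 minor-allele counts (assumed coding).
    labels: list of 0 (control) or 1 (case).
    alpha: Laplace smoothing pseudo-count (an assumption, not from the thesis).
    """
    counts = {c: [[alpha] * 3 for _ in range(n_snps)] for c in (0, 1)}
    n = {0: 0, 1: 0}
    for row, y in zip(genotypes, labels):
        n[y] += 1
        for j, g in enumerate(row):
            counts[y][j][g] += 1
    freqs = {
        c: [[v / sum(snp) for v in snp] for snp in counts[c]]
        for c in (0, 1)
    }
    prior = n[1] / (n[0] + n[1])  # fraction of cases in the training set
    return freqs, prior

def predict_case_probability(row, freqs, prior, weights):
    """Return P(case | genotypes) under a weighted Naive Bayes model.

    Each SNP's log-likelihood ratio is scaled by weights[j], a hypothetical
    stand-in for that SNP's posterior probability of association.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for j, g in enumerate(row):
        log_odds += weights[j] * math.log(freqs[1][j][g] / freqs[0][j][g])
    return 1.0 / (1.0 + math.exp(-log_odds))

# Toy data: SNP 0 separates cases from controls, SNP 1 carries no signal.
toy_genotypes = [[2, 0], [2, 1], [0, 0], [0, 1]]
toy_labels = [1, 1, 0, 0]
toy_freqs, toy_prior = train(toy_genotypes, toy_labels, n_snps=2)
p_weighted = predict_case_probability([2, 0], toy_freqs, toy_prior, [1.0, 1.0])
p_ignored = predict_case_probability([2, 0], toy_freqs, toy_prior, [0.0, 0.0])
```

With all weights set to 1 this reduces to ordinary Naive Bayes; with a weight of 0 a SNP contributes nothing, so the prediction falls back to the class prior.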
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/28583 |
Date | January 2010 |
Creators | Davies, Robert William |
Publisher | University of Ottawa (Canada) |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 175 p. |