
Identification of shared extended haplotypes in both population-based studies of complex disease and family-based studies of Mendelian disorders

Recent founder mutations may play important roles in complex diseases and Mendelian disorders. Detecting haplotypes that are shared identical by descent (IBD) could facilitate the discovery of these mutations. Several programs address this problem, including threshold-based methods that use genetic distance and probabilistic model-based methods, but they are usually limited to detecting pair-wise shared haplotypes and do not provide a comparison between cases and controls.

In this study, a novel algorithm and an accompanying software package (HaploShare) were developed to detect extended haplotypes that are shared by multiple individuals and to allow comparisons between cases and controls. A catalog of haplotypes is first generated from healthy controls from the same population and used for phasing genotypes in cases. By accounting for all possible haplotype pairs that could explain the genotypes of each individual in a given haplotype block, and for possible transitions between blocks, the effect of phase uncertainty on detection power is minimized. In cases, haplotypes shared by pairs of individuals are identified and then used to detect sharing of these haplotypes across different pairs. For each extended haplotype, a likelihood ratio comparing sharing due to IBD with sharing by chance is estimated. Controls are processed in the same way through many rounds of simulation to obtain an empirical null distribution of the largest likelihood ratios of shared haplotypes, which gives a statistical assessment of whether shared haplotypes detected in cases may be associated with an underlying disease.
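Two of the steps described above can be illustrated with a minimal Python sketch. It is not HaploShare's implementation: the function names, the 0/1/2 genotype coding, the toy catalog, and the add-one correction in the empirical p-value are all assumptions made for illustration. The first function enumerates every pair of catalog haplotypes that could explain an unphased genotype within one block; the second compares an observed likelihood ratio with the largest likelihood ratios obtained from simulation rounds on controls.

```python
from itertools import combinations_with_replacement


def compatible_pairs(genotype, catalog):
    """List all unordered haplotype pairs from the catalog that explain a genotype.

    genotype: tuple of alternate-allele counts (0, 1 or 2), one per SNP in the block.
    catalog:  list of haplotypes, each a tuple of 0/1 alleles of the same length.
    (Hypothetical data layout, assumed for this sketch.)
    """
    pairs = []
    for h1, h2 in combinations_with_replacement(catalog, 2):
        # A pair explains the genotype if its alleles sum to the count at every SNP.
        if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
            pairs.append((h1, h2))
    return pairs


def empirical_p_value(observed_lr, null_max_lrs):
    """Share of null rounds whose largest likelihood ratio reaches the observed one.

    The add-one correction keeps the estimate above zero; it is an assumption
    of this sketch, not something stated in the abstract.
    """
    exceed = sum(1 for lr in null_max_lrs if lr >= observed_lr)
    return (exceed + 1) / (len(null_max_lrs) + 1)


# Toy block of three SNPs with four catalog haplotypes (made-up values).
catalog = [(0, 0, 1), (1, 0, 0), (1, 1, 0), (0, 1, 1)]
genotype = (1, 1, 1)  # heterozygous at all three SNPs, so the phase is ambiguous
pairs = compatible_pairs(genotype, catalog)

# Made-up largest likelihood ratios from four simulation rounds on controls.
p_value = empirical_p_value(5.0, [1.0, 2.0, 6.0, 3.0])
```

In this toy block two different haplotype pairs explain the same genotype, which is the kind of phase uncertainty that the method carries forward rather than resolving to a single phasing.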

A series of tests was performed to investigate the performance of HaploShare. Simulations of shared haplotypes demonstrated that, in most simulation scenarios, HaploShare has better power than four other commonly used programs for detecting both pair-wise shared haplotypes and haplotypes shared by multiple individuals. The false positive rate (FPR) and the false discovery rate (FDR) were also evaluated by statistical calculation. Both values were extremely low (FPR = 6.28 × 10^-6, FDR = 0.006), indicating that very few randomly shared haplotypes are wrongly reported as IBD by HaploShare.
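For readers unfamiliar with the two error measures, the standard definitions can be written as a short Python sketch. The abstract does not say how the thesis counted true and false calls, so the function names and the counts below are illustrative assumptions only and are not the study's data.

```python
def false_positive_rate(false_pos, true_neg):
    """FPR: fraction of truly non-IBD sharings that are wrongly reported as IBD."""
    total_negatives = false_pos + true_neg
    return false_pos / total_negatives if total_negatives else 0.0


def false_discovery_rate(false_pos, true_pos):
    """FDR: fraction of reported IBD sharings that are actually false."""
    total_reported = false_pos + true_pos
    return false_pos / total_reported if total_reported else 0.0


# Made-up counts, chosen only to show the arithmetic.
fpr = false_positive_rate(false_pos=3, true_neg=997)    # 3 / 1000
fdr = false_discovery_rate(false_pos=3, true_pos=497)   # 3 / 500
```

The two rates share a numerator but use different denominators, so the FPR can be several orders of magnitude smaller than the FDR when true negatives vastly outnumber reported haplotypes.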

HaploShare was also tested on real data, both population data and a family linkage analysis. HaploShare reported that 14 out of 173 Hirschsprung's disease cases carried a common haplotype 250 kb in length, which was consistent with previous findings from direct genotyping and a candidate approach. The other test case was an affected family with 8 cases and 9 unaffected individuals. Traditional methods can correctly identify the disease-linked region when all the data and the entire pedigree are provided. HaploShare was able to locate the shared region even when very few cases were available, which is beyond the detection power of traditional methods.

The results from empirical simulations and real case applications indicate that HaploShare can make effective use of population genotype information to improve the power to detect shared haplotypes. The method may extend findings in human genetics for both complex and single-gene diseases. / published_or_final_version / Psychiatry / Doctoral / Doctor of Philosophy

Identifier: oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/205837
Date: January 2013
Creators: Ying, Dingge, 应鼎阁
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Source Sets: Hong Kong University Theses
Language: English
Detected Language: English
Type: PG_Thesis
Rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works; Creative Commons: Attribution 3.0 Hong Kong License
Relation: HKU Theses Online (HKUTO)
