
Rare variant analysis on UK Biobank

Genome-wide association studies (GWAS) link common variants to phenotypes and
have uncovered thousands of disease-associated variants. However, research on the
contribution of rare variants remains limited. The UK Biobank (UKB) contains
detailed medical records and genetic information for nearly 500,000 individuals and
offers a valuable opportunity for genetic association studies of rare variants. Here we
focused on the role of rare protein-coding variants in UKB phenotypes. We selected
three diseases for analysis: breast cancer, hypothyroidism, and type II diabetes. We
defined criteria for qualifying variants and pruned the control group to reduce
interfering signals from similar phenotypes. We recovered the best-known biomarkers
for these diseases, such as the BRCA1 and BRCA2 genes for breast cancer, the TG and
TSHR genes for hypothyroidism, and GCK for type II diabetes. This result supports
the validity of the model and clarifies the contribution of rare variants to these
diseases. We also applied a gene-set-based collapsing method, which aggregates
information across genes to strengthen the signal from rare variants, and built a
diagnostic model that relies only on genetic information. The model improved AUC by
more than 20% for type II diabetes and breast cancer and reached more than 90%
accuracy for hypothyroidism.
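
The collapsing idea mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state which statistical test or qualifying-variant criteria were used. This sketch assumes a one-sided Fisher's exact test on carrier counts, and all function names, sample IDs, and variant IDs are hypothetical toy values.

```python
from math import comb


def collapse_carriers(genotypes, qualifying):
    """Return the sample IDs that carry at least one qualifying variant.

    genotypes: dict mapping sample ID -> set of variant IDs carried.
    qualifying: set of variant IDs passing the (assumed) qualifying criteria.
    """
    return {s for s, variants in genotypes.items() if variants & qualifying}


def fisher_one_sided(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """One-sided Fisher's exact p-value for carrier enrichment among cases."""
    carriers = case_carriers + ctrl_carriers
    total = case_total + ctrl_total
    upper = min(carriers, case_total)
    # Sum hypergeometric probabilities of observing at least
    # `case_carriers` carriers among the cases.
    tail = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, case_total)


def collapsing_test(genotypes, cases, controls, qualifying):
    """Collapse qualifying variants into carrier status, then test enrichment."""
    carriers = collapse_carriers(genotypes, qualifying)
    return fisher_one_sided(
        len(carriers & cases), len(cases),
        len(carriers & controls), len(controls),
    )


# Toy data (hypothetical): 4 cases, 4 controls, 3 qualifying variants in one gene.
genotypes = {
    "case1": {"v1"}, "case2": {"v2"}, "case3": {"v1", "v3"}, "case4": set(),
    "ctrl1": {"v3"}, "ctrl2": set(), "ctrl3": {"v9"}, "ctrl4": set(),
}
cases = {"case1", "case2", "case3", "case4"}
controls = {"ctrl1", "ctrl2", "ctrl3", "ctrl4"}
gene_qualifying = {"v1", "v2", "v3"}

p_value = collapsing_test(genotypes, cases, controls, gene_qualifying)
```

Under the same assumptions, a gene-set version would pass the union of the qualifying-variant sets of every gene in the set as `qualifying`, so that a sample counts as a carrier if it carries a qualifying variant in any member gene.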

Identifier: oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/676336
Date: 17 April 2022
Creators: Liu, Yang
Contributors: Hoehndorf, Robert; Biological and Environmental Science and Engineering (BESE) Division; Hauser, Charlotte; Gojobori, Takashi
Source Sets: King Abdullah University of Science and Technology
Language: English
Detected Language: English
Type: Thesis
