In recent years, Genome-Wide Association Studies (GWAS) have found many variants associated with complex diseases. However, the biological and molecular links between these variants and phenotypes are still mostly unknown. Moreover, even though sample sizes keep increasing, the associated variants do not explain all the heritability estimated for many traits.
Many hypotheses have been proposed to explain this missing heritability, ranging from variant-variant interactions and the effects of rare and ultra-rare coding variants to technical biases related to sequencing or to the statistical analysis of sex chromosomes. In this thesis, we mainly explore the variant-variant interaction hypothesis and, briefly, the rare coding variant hypothesis, while also considering possible molecular effects such as allele-specific expression and the effects of variants on protein interfaces. Some parts of the thesis are also devoted to implementing efficient computational tools to explore these effects and to perform scalable genotyping of germline single nucleotide polymorphisms (SNPs) in very large datasets.
The main part of the thesis concerns the development of a new resource to identify putative variant-variant interactions. In particular, we integrated ChIP-seq data from ENCODE, transcription factor binding motifs from several resources, and genotype and transcript-level data from GTEx and TCGA. This new dataset allows us to formalize new models, to formulate hypotheses, and to find putative novel associations and interactions between (mainly non-coding) germline variants and phenotypes, such as cancer-specific phenotypes. Specifically, we focused on characterizing GWAS risk variants for breast cancer and Alzheimer’s Disease, looking for putative interactions between variants.
Recently, the study of rare variants has become feasible thanks to biobanks that have made available genotypes and clinical data for thousands of patients. We characterize and explore the possible effects of rare inherited coding polymorphisms on protein interfaces in the UK Biobank, to understand whether changes in protein structure can be one of the causes of complex diseases.
Another part of the thesis explores variants as a molecular cause of allele-specific expression. In particular, we describe UTR variants that can alter the post-transcriptional regulation of mRNA, leading one allele to be expressed more than the other. Finally, we show that these variants can have prognostic significance in breast cancer.
This thesis introduces results and computational tools that can be useful to a broad community of researchers studying the effects of human polymorphisms.
Identifier | oai:union.ndltd.org:unitn.it/oai:iris.unitn.it:11572/354827 |
Date | 18 October 2022 |
Creators | Valentini, Samuel |
Contributors | Valentini, Samuel, Romanel, Alessandro |
Publisher | Università degli studi di Trento, place:TRENTO |
Source Sets | Università di Trento |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/doctoralThesis |
Rights | info:eu-repo/semantics/openAccess |
Relation | firstpage:1, lastpage:167, numberofpages:167 |