1 |
Mining Structural and Functional Patterns in Pathogenic and Benign Genetic Variants through Non-negative Matrix Factorization. Peña-Guerra, Karla A 08 1900
The main challenge in studying genetics has evolved from identifying variations and their impact on traits to comprehending the molecular mechanisms through which genetic variations affect human biology, including disease susceptibility. Although large-scale genome-wide association studies (GWAS) have identified a vast number of variants associated with human traits, the underlying mechanisms of a significant portion of them remain poorly understood [1]. Addressing this uncertainty requires precise and scalable approaches for discovering how genetic variation influences phenotypes at the molecular level. In this study, we developed a pipeline to automate the annotation of structural variant feature effects. We applied this pipeline to a dataset of 33,942 variants from the ClinVar and gnomAD databases, which included both pathogenic and benign associations. To bridge the gap between genetic variation data and molecular phenotypes, we applied Non-negative Matrix Factorization (NMF) to this large-scale dataset. The algorithm revealed 6 distinct clusters of variants with similar feature profiles. Two of these clusters consisted predominantly of benign variants (70% and 85% of their respective clusters), while one showed an almost equal distribution of pathogenic and benign variants. The remaining three clusters were predominantly composed of pathogenic variants, which made up 68%, 83%, and 77% of the respective clusters. These findings provide insights into the underlying mechanisms contributing to pathogenicity. Further analysis of this dataset and the exploration of disease-related genes can enhance the accuracy of genetic diagnosis and therapeutic development through the direct inference of variants that are likely to affect the functioning of essential genes.
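The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the author's pipeline: it assumes a non-negative variants-by-features matrix, uses the standard Lee-Seung multiplicative updates for NMF, and assigns each variant to the component with the largest weight. The toy data, the function names `nmf` and `cluster_composition`, and the 12-feature layout are all hypothetical; only the choice of 6 components comes from the abstract.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (variants x features) as W @ H
    using Lee-Seung multiplicative updates for the Frobenius loss."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # variant-to-component weights
    H = rng.random((rank, m)) + eps   # component feature profiles
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_composition(W, is_pathogenic):
    """Assign each variant to its dominant component and report the
    fraction of pathogenic variants in each resulting cluster."""
    labels = W.argmax(axis=1)
    fractions = {}
    for k in np.unique(labels):
        fractions[int(k)] = float(is_pathogenic[labels == k].mean())
    return labels, fractions

# Hypothetical toy data: 200 variants, 12 non-negative annotation features.
rng = np.random.default_rng(42)
V = rng.random((200, 12))
is_pathogenic = rng.random(200) < 0.5   # made-up labels, for illustration only

W, H = nmf(V, rank=6)                   # 6 components, as in the abstract
labels, fractions = cluster_composition(W, is_pathogenic)
for k, frac in sorted(fractions.items()):
    print(f"cluster {k}: {frac:.0%} pathogenic")
```

Assigning clusters by the largest entry in each row of `W` is one common convention; the abstract does not say how cluster membership was derived, so this choice is an assumption.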
|
2 |
Zpracování dat z vysokokapacitního DNA sekvenování pro studium variability genomu a transkriptomu. / Study of genome and transcriptome variability employing data processing from massive parallel DNA sequencing. Vojta, Petr January 2018
Massively parallel sequencing (MPS) data analysis tasks are often computationally demanding, and their execution would take too long on standard computing machines. There is therefore a need to parallelize these tasks and to execute them on sufficiently powerful computing machines. In the first chapter we describe MOLDIMED, a newly created platform for resequencing analysis of MPS data, and a novel annotation tool that is ready to deploy on HPC infrastructure. The second chapter describes MPS approaches in Diamond-Blackfan anaemia (DBA), which is predominantly caused by mutations in genes encoding ribosomal proteins (RP); however, its etiology remains unexplained in approximately 25% of patients. We performed panel sequencing of all ribosomal genes in DBA patients without previously known molecular pathology. A novel heterozygous RPS7 mutation encoding RPS7 p.V134F was found in one female patient and subsequently confirmed in two asymptomatic family members, in whom mild anaemia was detected on further examination. Subsequently, we performed whole transcriptome analysis in all family members and the patient with the RPS7 mutation, in comparison with a healthy control group and with DBA patients with a known mutation in RPS19. We observed dysregulation mainly in signal pathways of translation,...
|
3 |
Medical relevance and functional consequences of protein truncating variants. Rivas Cruz, Manuel A. January 2015
Genome-wide association studies have greatly improved our understanding of the contribution of common variants to the genetic architecture of complex traits. However, two major limitations have been highlighted. First, common variant associations typically do not identify the causal variant and/or the gene on which it exerts its effect to influence a trait. Second, common variant associations usually consist of variants with small effects. As a consequence, it is more challenging to harness their translational impact. Association studies of rare variants and complex traits may be able to help address these limitations. Empirical population genetic data show that deleterious variants are rare. More specifically, there is a very strong depletion of common protein truncating variants (PTVs, commonly referred to as loss-of-function variants) in the genome, a group of variants that have been shown to have large effects on gene function and are enriched for severe disease-causing mutations, but in other instances may actually be protective against disease. This thesis is divided into three parts dedicated to the study of protein truncating variants, their medical relevance, and their functional consequences. First, I present statistical, bioinformatic, and computational methods developed for the study of protein truncating variants, their association with complex traits, and their functional consequences. Second, I present applications of these methods to a number of case-control and quantitative trait studies, discovering new variants and genes associated with breast and ovarian cancer, type 1 diabetes, lipids, and metabolic traits measured with NMR spectroscopy. Third, I present work on improving the annotation of protein truncating variants by studying their functional consequences. Taken together, these results highlight the utility of interrogating protein truncating variants in medical and functional genomic studies.
|