Global ETD Search

1	Predicting gene–phenotype associations in humans and other species from orthologous and paralogous phenotypes
Woods, John Oates, III. 21 February 2014.
Phenotypes and diseases may be related to seemingly dissimilar phenotypes in other species through the orthology of the underlying genes. Such "orthologous phenotypes," or "phenologs," are examples of deep homology, and one member of the orthology relationship may be used to predict candidate genes for its counterpart. (There is evidence of "paralogous phenotypes" as well, but validation is non-trivial.) In Chapter 2, I demonstrate the utility of including plant phenotypes in our database and provide as an example the prediction of mammalian neural crest defects from an Arabidopsis thaliana phenotype, "negative gravitropism defective." In Chapter 3, I describe the incorporation of additional phenotypes into our database (including chicken, zebrafish, E. coli, and new C. elegans datasets). I present a method, developed in coordination with Martin Singh-Blom, for ranking predicted candidate genes by way of a k-nearest-neighbors naïve Bayes classifier that draws phenolog information from a variety of species. Chapter 4 presents a computational method and application for identifying shared and overlapping pathways that contribute to phenotypes. I describe a method for rapidly querying a database of phenotype–gene associations for Boolean combinations of phenotypes, which yields improved predictions. This method offers insight into the divergence of orthologous pathways in evolution. I demonstrate connections between breast cancer and zebrafish methylmercury response (through oxidative stress and apoptosis); between human myopathy and plant red light response genes, minus those involved in water deprivation response (via autophagy); and between holoprosencephaly and an array of zebrafish phenotypes.
In Appendix A, I present the SciRuby Project, which I co-founded in order to bring scientific libraries to the Ruby programming language. I describe the motivation behind SciRuby and my role in its creation. Finally, in Appendix B, I discuss the first beta release of NMatrix, a dense and sparse matrix library for the Ruby language, which I developed in part to facilitate and validate rapid phenolog searches. In this work, I describe the concept of phenologs as well as the development of the necessary computational tools for discovering phenotype orthology relationships, for predicting associated genes, and for statistically validating the discovered relationships and predicted associations.
Keywords: Deep homology; Phenologs; Phenotype orthology; Phenotype paralogy; Homology; Gene–phenotype associations; Ruby; SciRuby; NMatrix; k nearest neighbors
2	Prioritizing Causative Genomic Variants by Integrating Molecular and Functional Annotations from Multiple Biomedical Ontologies
Althagafi, Azza Th. 20 July 2023.
Whole-exome and whole-genome sequencing are widely used to diagnose individual patients. However, despite their success, these approaches leave many patients undiagnosed. This could be because more disease genes and variants remain to be discovered, or because disease phenotypes are novel and arise from a combination of variants in multiple known disease-related genes. Recent rapid increases in available genomic, biomedical, and phenotypic data enable computational analyses that reduce the search space for disease-causing genes or variants and facilitate the prediction of causal variants. Artificial intelligence, data mining, machine learning, and deep learning have therefore become essential tools for identifying biological interactions, including protein-protein interactions, gene-disease associations, and variant-disease associations. Predicting these biological associations is a critical step in diagnosing patients with rare or complex diseases. In recent years, computational methods have emerged that improve gene-disease prioritization by incorporating phenotype information. These methods compare a patient's phenotype against a database of gene-phenotype associations to identify the closest match. However, incomplete knowledge of the phenotypes linked with specific genes in humans and model organisms limits the effectiveness of the prediction. Information about gene product functions and the anatomical locations of gene expression is available for many genes and can be associated with phenotypes through ontologies and machine-learning models. Incorporating this information can enhance gene-disease prioritization methods and identify potential disease-causing genes more accurately.
This dissertation aims to address key limitations in gene-disease prediction and variant prioritization by developing computational methods that systematically relate human phenotypes arising from the loss or change of gene function to gene functions and to the anatomical and cellular locations of gene activity. To achieve this objective, this work focuses on crucial problems in the causative variant prioritization pipeline and presents novel computational methods that significantly improve prediction performance by leveraging large background knowledge data and integrating multiple techniques. Specifically, the approaches use graph-based machine-learning techniques to leverage biomedical ontologies and linked biological data as background knowledge graphs. The methods employ representation learning with knowledge graphs and introduce generic models that address computational problems in gene-disease associations and variant prioritization. I demonstrate that my approach is capable of compensating for incomplete information in public databases and of integrating efficiently with other biomedical data for similar prediction tasks. Moreover, my methods outperform other relevant approaches that rely on manually crafted features and laborious pre-processing. I systematically evaluate my methods and illustrate their potential applications for data analytics in biomedicine. Finally, I demonstrate how my prediction tools can be used in the clinic to assist geneticists in decision-making. In summary, this dissertation contributes to the development of more effective methods for predicting disease-causing variants and to advancing precision medicine.
Keywords: Whole-Exome Sequencing; Whole-Genome Sequencing; Disease Genes; Disease Variants; Disease Phenotypes; Causal Variants Prediction; Causal Genes Prediction; Artificial Intelligence; Data Mining; Machine Learning; Deep Learning; Data Analytics; Biological Interactions; Protein-Protein Interactions; Gene-Disease Predictions; Variant-Disease Associations; Rare Diseases; Complex Diseases; Gene-Phenotype Associations; Ontology; Gene Product Functions; Anatomical Locations; Gene Prioritization; Variant Prioritization; Loss of Gene Function; Background Knowledge Data; Biological Knowledge Graph; Graph-Based Machine Learning; Biomedical Ontologies; Linked Biological Data; Representation Learning; Embeddings; Data Integration; Precision Medicine; Decision-Making; Biomedicine
