1 |
Machine annotation of genome and transcriptome data
Liu, Zhe, January 2015
One of the key research topics of post-genome study is the annotation of genes with regard to specific functions and biological processes. This helps us to understand the precise role that a gene or a group of genes plays. In this thesis, I developed techniques to automatically annotate genes at the level of single genes and of groups of genes. It is shown that these techniques improve our understanding of biological systems and diseases, and will aid drug discovery. In the first project, I attempted to achieve precise annotation for single genes. In the second and third projects, I annotated groups of genes using pathway knowledge, examining the problem from the supervised and unsupervised learning perspectives, respectively. The main contributions of the work are as follows. In the gene annotation project, I built an automated scheme to reconcile the term differences arising from different automated annotation services. The method leaves less than 20% of the annotations for manual work. The generalization performance across other species is of a similar standard, again leaving less than 20% of the annotations for manual inspection. In addition, in the E. coli genome annotation task, less than 10% of the results assign functions that differ from those in EcoCyc. Overall, this method can significantly reduce the human effort (6 months' work by several biologists for a bacterial genome) needed to resolve inconsistent gene annotations. Next, starting from the current limitations of pathway analysis, I presented a novel approach to pathway discovery. Enrichment analysis is the most popular approach for mapping gene expression profiles from genes to biological pathways. It is a powerful tool for identifying pathways enriched in differentially expressed genes; however, it is unable to discover active/inhibitive pathways. In this study, I attempted to resolve this issue by integrative classification of KEGG and transcription factor (TF) gene sets.
I assumed that pathways with good classification performance should be considered the active/inhibitive pathways. Based on this hypothesis, I built a generic approach that incorporates two types of biological data for active pathway discovery. The experimental results show that integrating transcription factor data boosts classification performance. In addition, this method identified relevant biological pathways, such as cancer, inflammation and metabolic pathways, that are highly associated with tumour genesis and development but are ignored by Gene Set Enrichment Analysis. Furthermore, this method achieves classification performance comparable to the best reported results. Lastly, I performed a subtyping analysis of Rheumatoid Arthritis patients based on gene expression profiling. I revalidated the two clusters of patients in two independent cohorts. The experimental results indicate that the subgroup structure does not correspond to drug response status. In addition, I developed a pathway subtyping approach, which yielded the same number of clusters as the gene-level clustering. The pathway clustering results show that one group of patients has high proliferation and a low inflammation response, while the other group shows the reverse trend. This suggests that designing drugs with a better trade-off between anti-inflammation and anti-proliferation for a specific subgroup of patients may achieve better clinical outcomes.
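The idea of ranking pathways by how well their genes classify samples can be illustrated with a minimal sketch. This is not the thesis's implementation: the nearest-centroid classifier, the leave-one-out evaluation, the toy expression values and the gene-set names (`pathway_A`, `pathway_B`) are all assumptions chosen only to keep the example self-contained.

```python
from statistics import mean

def loo_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier
    that only sees the expression values of `genes`."""
    correct = 0
    for i, held_out in enumerate(samples):
        # Class centroids are computed without the held-out sample.
        centroids = {}
        for label in sorted(set(labels)):
            rows = [s for j, s in enumerate(samples)
                    if j != i and labels[j] == label]
            centroids[label] = [mean(r[g] for r in rows) for g in genes]
        point = [held_out[g] for g in genes]
        # Predict the class whose centroid is closest (squared Euclidean).
        predicted = min(
            centroids,
            key=lambda lab: sum((p - c) ** 2
                                for p, c in zip(point, centroids[lab])),
        )
        correct += predicted == labels[i]
    return correct / len(samples)

def rank_gene_sets(samples, labels, gene_sets):
    """Score every gene set by classification accuracy, best first."""
    scores = {name: loo_accuracy(samples, labels, genes)
              for name, genes in gene_sets.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (hypothetical): g1/g2 separate the classes, g3/g4 do not.
samples = [
    {"g1": 5.0, "g2": 4.8, "g3": 1.0, "g4": 2.0},
    {"g1": 5.2, "g2": 5.1, "g3": 2.0, "g4": 1.0},
    {"g1": 4.9, "g2": 5.0, "g3": 1.5, "g4": 1.5},
    {"g1": 1.0, "g2": 1.2, "g3": 1.0, "g4": 2.0},
    {"g1": 1.1, "g2": 0.9, "g3": 2.0, "g4": 1.0},
    {"g1": 0.8, "g2": 1.0, "g3": 1.5, "g4": 1.5},
]
labels = ["tumour"] * 3 + ["normal"] * 3
gene_sets = {"pathway_A": ["g1", "g2"], "pathway_B": ["g3", "g4"]}

ranking = rank_gene_sets(samples, labels, gene_sets)
for name, accuracy in ranking:
    print(f"{name}: {accuracy:.2f}")
```

Under the hypothesis stated above, a gene set that ranks near the top would be a candidate active/inhibitive pathway; a real analysis would need a proper classifier, held-out cohorts and KEGG/TF gene sets rather than these toy values.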
|
2 |
Syntéza železo-sirných center v Monocercomonoides exilis / Iron-Sulfur cluster assembly in Monocercomonoides exilis
Vacek, Vojtěch, January 2020
In the search for the mitochondrion of oxymonads, DNA of Monocercomonoides exilis, an oxymonad isolated from the gut of a chinchilla, was isolated and its genome was sequenced. Sequencing resulted in a fairly complete genome, which was extensively searched for genes encoding mitochondrion-related proteins, but no reliable candidate for such a gene was identified. Even genes for the ISC pathway, which is responsible for Fe-S cluster assembly and considered to be the only essential function of reduced mitochondrion-like organelles (MROs), were absent. Instead, we detected a SUF pathway that functionally replaced the ISC pathway. Closer examination of the SUF pathway based on heterologous localisation revealed that this pathway is localised in the cytosol. In silico analysis showed that SUF genes are highly conserved at the level of secondary and tertiary structure, and that most catalytic residues and motifs are present in their sequences. The functionality of these proteins was further indirectly confirmed by complementation experiments in Escherichia coli, in which SUF proteins of M. exilis were able to at least partially restore Fe-S cluster assembly in strains deficient in the SUF and ISC pathways. We also showed, using the bacterial adenylate cyclase two-hybrid system, that SufB and SufC can form...
|