1 |
Detection of orthologs via genetic mapping augmentation
2013 January 1900
Researchers studying a target species that lacks complete sequence data can infer some knowledge of that species from one or more related species that have complete data. To do so, they compare the available sequence data between the two species to find orthologs. However, without complete data sets, one cannot be certain that the detected orthologs are valid.
Using ortholog detection systems in concert with species’ mapping data, researchers can find regions of shared synteny, allowing for more certainty of the detected orthologs as well as allowing inference of some genetic information based on these regions of shared synteny. A pipeline software solution, Detection of Orthologs via Genetic Mapping Augmentation (DOGMA), was developed for this purpose.
DOGMA’s functionality was tested using a target species, Phaseolus vulgaris, which had only partial sequence data available, and a closely related species, Glycine max, which has a fully sequenced genome. On sequence similarity alone, which is the standard technique for detecting orthologs, 205 potential orthologs were detected. DOGMA then filtered these results using mapping data from each species and determined that 121 of the 205 were quite likely true orthologs, referred to as putative orthologs. The remaining 84 were categorized as reduced orthologs, either because insufficient information was present or because they fell clearly outside a noted region of shared synteny. This provides evidence that DOGMA can reduce false positives relative to traditional techniques, such as applications based on Reciprocal Best BLAST Hits. If we interpret the output of the Ortholuge program as the correct answer, DOGMA achieves 95% sensitivity. However, because DOGMA uses mapping data, it is possible that some of the reduced orthologs classified by DOGMA are actually Ortholuge’s false positives. To support this idea, we show DOGMA’s ability to detect false positives in the results of Ortholuge by artificially creating a paralog and removing the real ortholog. DOGMA classifies these data correctly, whereas Ortholuge does not.
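The filtering idea described above can be illustrated with a minimal Python sketch. It is not DOGMA's actual algorithm, which the abstract does not specify. It assumes each candidate ortholog pairs a mapped marker in the target species (linkage group and centimorgan position) with a gene in the reference genome (chromosome and base-pair position), and it treats a candidate as supported when at least one other candidate lies nearby on both maps. The `Hit` record, the window sizes, the `min_support` threshold, and all marker and gene names are hypothetical.

```python
from collections import namedtuple

# Hypothetical record: one candidate ortholog from a sequence-similarity search,
# annotated with map positions in both species.
Hit = namedtuple("Hit", "target_marker target_lg target_cm ref_gene ref_chr ref_pos")


def classify_hits(hits, window_cm=10.0, window_bp=2_000_000, min_support=1):
    """Split candidates into (putative, reduced) by shared-synteny support.

    A candidate is 'putative' when at least `min_support` other candidates are
    close to it on the target linkage map AND on the reference chromosome.
    The thresholds are illustrative assumptions, not values from the thesis.
    """
    putative, reduced = [], []
    for h in hits:
        support = 0
        for other in hits:
            if other is h:
                continue
            near_on_target = (
                other.target_lg == h.target_lg
                and abs(other.target_cm - h.target_cm) <= window_cm
            )
            near_on_ref = (
                other.ref_chr == h.ref_chr
                and abs(other.ref_pos - h.ref_pos) <= window_bp
            )
            if near_on_target and near_on_ref:
                support += 1
        (putative if support >= min_support else reduced).append(h)
    return putative, reduced


# Made-up example data: three collinear candidates and one outlier.
hits = [
    Hit("m1", "LG1", 5.0, "g1", "Chr1", 1_000_000),
    Hit("m2", "LG1", 8.0, "g2", "Chr1", 1_500_000),
    Hit("m3", "LG1", 12.0, "g3", "Chr1", 2_200_000),
    Hit("m4", "LG1", 9.0, "g4", "Chr7", 40_000_000),
]

putative, reduced = classify_hits(hits)
print([h.target_marker for h in putative])
print([h.target_marker for h in reduced])
```

In this toy data, `m4` sits among the other markers on the target map but its best hit lies on a different reference chromosome, so it has no synteny support and is placed in the reduced set. A sequence-similarity search alone would not separate it from the others.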
|
2 |
An investigation of human protein interactions using the comparative method
Ur-Rehman, Saif, January 2012
The rate at which DNA sequence data is produced is increasing sharply as next generation sequencing technologies become more widespread. As such, there is a need for rapid computational techniques to functionally annotate data as it is generated. One computational method for the functional annotation of protein-coding genes is the detection of interaction partners. If the putative partner has a functional annotation, then this annotation can be extended to the initial protein via the established principle of “guilt by association”. This work presents a method for rapid detection of functional interaction partners for proteins through the use of the comparative method. Functional links, which can be either direct or indirect protein interactions, are sought between proteins through analysis of their patterns of presence and absence amongst a set of 54 eukaryotic organisms. These patterns are analysed in the context of a phylogenetic tree. The method is a heuristic combination of two techniques. The first is an established, accurate methodology that compares models of evolution whose parameters are estimated using maximum likelihood. The second is a novel technique that reconstructs ancestral states using Dollo parsimony and analyses these reconstructions using logistic regression. The methodology achieves specificity comparable to that of gene coexpression as a means of predicting functional linkage between proteins. The application of this method permitted a genome-wide analysis of the human genome, which would otherwise have demanded a potentially prohibitive amount of computational resource. Proteins within the human genome were clustered into orthologous groups. Ten of these proteins, which were ubiquitous across all 54 eukaryotes, were used to reconstruct a phylogeny. An application of the heuristic predicted 1,142 functional protein interactions in human cells, of which 1,131 were not present in current protein-protein interaction databases.
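To make the Dollo parsimony step concrete, the following Python sketch counts the losses implied by a presence/absence pattern on a fixed tree. It illustrates the general principle only (a trait is gained once, at the most recent common ancestor of the species that carry it, and may then be lost repeatedly) and is not the thesis's implementation. The nested-tuple tree encoding, the five-species tree, and the function names are assumptions made for this example.

```python
def leaves(node):
    """Return the set of leaf names under a node (leaf = str, internal = tuple)."""
    if isinstance(node, str):
        return {node}
    out = set()
    for child in node:
        out |= leaves(child)
    return out


def dollo_losses(tree, present):
    """Count losses under Dollo parsimony for one presence/absence profile.

    The single gain is placed at the most recent common ancestor (MRCA) of the
    species in `present`. Every node on a path from the MRCA to a present leaf
    is 'present'; each branch leaving such a path into a subtree with no
    present leaf counts as one loss.
    """
    present = set(present) & leaves(tree)
    if not present:
        return 0

    # Descend from the root while only one child subtree holds present leaves;
    # the node where we stop is the MRCA of the present species.
    node = tree
    while not isinstance(node, str):
        holders = [c for c in node if leaves(c) & present]
        if len(holders) != 1:
            break
        node = holders[0]

    def count(n):
        # n is a 'present' node; count loss events in its subtree.
        if isinstance(n, str):
            return 0
        total = 0
        for child in n:
            if leaves(child) & present:
                total += count(child)  # child is still present: keep descending
            else:
                total += 1  # the whole child subtree lost the gene once
        return total

    return count(node)


# Hypothetical five-species tree, not the 54-eukaryote phylogeny from the thesis.
tree = ((("human", "mouse"), "chicken"), ("fly", "worm"))

print(dollo_losses(tree, {"human", "mouse", "fly"}))
print(dollo_losses(tree, {"human", "mouse"}))
```

For the first profile the gain falls at the root, and the gene is lost once on the chicken branch and once on the worm branch. For the second profile the gain falls at the human/mouse ancestor and no losses are needed. Reconstructions of this kind, computed for pairs of proteins, are the sort of input the abstract describes passing to logistic regression.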
|