The immune system has the ability to discriminate self from non-self proteins and also make appropriate immune responses to pathogens. A fundamental problem is to understand the genomic differences and similarities among the sets of self peptides and non-self peptides. The sequencing of human, mouse and numerous pathogen genomes and cataloging of their respective proteomes allows host self and non-self peptides to be identified. T-cells make this determination at the peptide level based on peptides displayed by MHC molecules.<p>In this project, peptides of specific lengths (k-mers) are generated from each protein in the proteomes of various model organisms. The set of unique k-mers for each species is stored in a library and defines its "immunological self". Using the libraries, organisms can be compared to determine the levels of peptide overlap. The observed levels of overlap can also be compared with levels which can be expected "at random" and statistical conclusions drawn.<p>A problem with this procedure is that sequence information in public protein databases (Swiss-PROT, UniProt, PIR) often contains ambiguities. Three strategies for dealing with such ambiguities have been explored in earlier work and the strategy of removing ambiguous k-mers is used here.<p>Peptide fragments (k-mers) which elicit immune responses are often localized within the sequences of proteins from pathogens. These regions are known as "immunodominants" (i.e., hot spots) and are important in immunological work. After investigating the peptide universes and their overlaps, the question of whether known regions of immunological significance (e.g., epitope) come from regions of low host-similarity is explored. The known regions of epitopes are compared with the regions of low host-similarity (i.e., non-overlaps) between HIV-1 and human proteomes at the 7-mer level. Results show that the correlation between these two regions is not statistically significant. In addition, pairs involving human and human viruses are explored. For these pairs, one graph for each k-mer level is generated showing the actual numbers of matches between organisms versus the expected numbers. From graphs for 5-mer and 6-mer level, we can see that the number of overlapping occurrences increases as the size of the viral proteome increases.<p>A detailed investigation of the overlaps/non-overlaps between viral proteome and human proteome reveals that the distribution of the locations of these overlaps/non-overlaps may have "structure" (e.g. locality clustering). Thus, another question that is explored is whether the locality clustering is statistically significant. A chi-square analysis is used to analyze the locality clustering. Results show that the locality clusterings for HIV-1, HIV-2 and Influenza A virus at the 5-mer, 6-mer and 7-mer levels are statistically significant. Also, for self-similarity of human protein Desmoglein 3 to the remaining human proteome, it shows that the locality clustering is not statistically significant at the 5-mer level while it is at the 6-mer and 7-mer levels.
Identifer | oai:union.ndltd.org:USASK/oai:usask.ca:etd-02152007-125646 |
Date | 15 February 2007 |
Creators | Li, Ying |
Contributors | McQuillan, Ian, Kusalik, Anthony J. (Tony), Bickis, Mikelis G., Angel, Joseph F. |
Publisher | University of Saskatchewan |
Source Sets | University of Saskatchewan Library |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://library.usask.ca/theses/available/etd-02152007-125646/ |
Rights | unrestricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to University of Saskatchewan or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. |
Page generated in 0.0019 seconds