Exploring the diversity of unmapped reads from human deep sequencing

DNA and RNA sequencing are now performed as standard parts of many scientific experiments. While the majority of the reads produced in these experiments map to the genome of the organism of interest, a significant fraction do not. These reads have often been viewed as uninteresting and discarded, sometimes explained as errors created in the sequencing process. However, there is a real possibility that these reads contain genomic sequences that belong to, but are not currently part of, the genome of the organism investigated, as well as information about other organisms that live and thrive in the sample material. It is therefore of great interest to investigate whether these reads contain any usable information. In this project, the unmapped reads from SOLiD sequencing of blood and saliva from a twin pair were assembled. The assembled contigs were then compared to different BLAST databases to investigate whether similar genomic regions are reported in other species. We conclude that a large fraction of the contigs found in this assembly have homology to bacterial genes, while other contigs share similarity to genomic regions found in apes and other species closely related to humans. All in all, the results show that there is more to the unmapped reads than just sequencing errors.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-194782
Date: January 2012
Creators: Zarif Saffari, Amin
Publisher: Uppsala universitet, Institutionen för biologisk grundutbildning
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess