The rapid replication and high mutation rates of viruses like HIV lead to the formation of a community of highly similar genomes, referred to as a viral quasispecies, in an infected individual. Next-generation sequencing technologies enable researchers to sequence a complete quasispecies community with reduced expense and effort compared to traditional sequencing methods. However, typical sequence assembly software is designed to reconstruct a single genome from sequencing reads rather than a community of highly similar genomes.
We describe and implement a de novo assembly method for reconstructing variants from a quasispecies community using de Bruijn graphs and a novel heuristic path-construction method designed to identify corresponding variations at long distances across the genome. We predict the relative abundance of reconstructed variants using an approach inspired by Markov chains.
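As a rough illustration of the two ideas named above, the following sketch builds a de Bruijn graph from reads and scores a candidate variant path by multiplying edge transition probabilities, in the spirit of a Markov chain. It is not the thesis's heuristic path-construction method or its abundance model. The function names `build_de_bruijn` and `path_probability`, the choice of k, and the use of raw k-mer counts as edge weights are all assumptions made for this example.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph as {prefix: {suffix: count}}.

    Nodes are (k-1)-mers; each k-mer in a read adds one to the weight of
    the edge from its (k-1)-mer prefix to its (k-1)-mer suffix.
    (Hypothetical helper, not taken from the thesis.)
    """
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def path_probability(graph, path):
    """Score a path of (k-1)-mer nodes as a product of transition probabilities.

    Each transition probability is the edge count divided by the total
    count of edges leaving the source node (an assumed, simplified
    stand-in for a Markov-chain-inspired abundance estimate).
    """
    prob = 1.0
    for src, dst in zip(path, path[1:]):
        out_edges = graph.get(src, {})
        total = sum(out_edges.values())
        if total == 0 or dst not in out_edges:
            return 0.0  # the path uses an edge that no read supports
        prob *= out_edges[dst] / total
    return prob
```

For example, with three copies of the read `ACGTA`, one copy of `ACGCA`, and k = 3, the graph branches at node `CG`, and the two paths through the branch receive scores proportional to their read support (3:1). A real quasispecies assembler would also need to handle sequencing errors, reverse complements, and variation separated by more than a read length, which this sketch ignores.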
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:MWU.1993/9591 |
Date | 23 October 2012 |
Creators | Bristow, Franklin |
Contributors | Van Domselaar, Gary (Computer Science), Domaratzki, Michael (Computer Science), Cameron, Helen (Computer Science), Ball, Blake (Medical Microbiology) |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Detected Language | English |