Return to search

Algorithms for Viral Population Analysis

The genetic structure of an intra-host viral population has an effect on many clinically important phenotypic traits such as escape from vaccine induced immunity, virulence, and response to antiviral therapies. Next-generation sequencing provides read-coverage sufficient for genomic reconstruction of a heterogeneous, yet highly similar, viral population; and more specifically, for the detection of rare variants. Admittedly, while depth is less of an issue for modern sequencers, the short length of generated reads complicates viral population assembly. This task is worsened by the presence of both random and systematic sequencing errors in huge amounts of data. In this dissertation I present completed work for reconstructing a viral population given next-generation sequencing data. Several algorithms are described for solving this problem under the error-free amplicon (or sliding-window) model. In order for these methods to handle actual real-world data, an error-correction method is proposed. A formal derivation of its likelihood model along with optimization steps for an EM algorithm are presented. Although these methods perform well, they cannot take into account paired-end sequencing data. In order to address this, a new method is detailed that works under the error-free paired-end case along with maximum a-posteriori estimation of the model parameters.

Identiferoai:union.ndltd.org:GEORGIA/oai:scholarworks.gsu.edu:cs_diss-1086
Date12 August 2014
CreatorsMancuso, Nicholas
PublisherScholarWorks @ Georgia State University
Source SetsGeorgia State University
Detected LanguageEnglish
Typetext
Formatapplication/pdf
SourceComputer Science Dissertations

Page generated in 0.0017 seconds