Tracking seed dispersal using traditional, direct measurement approaches is difficult and generally underestimates dispersal distances. Variation in chloroplast DNA (cpDNA) haplotypes offers a way to trace past seed dispersal and to make inferences about factors contributing to present patterns of dispersal. Although cpDNA generally has low levels of intraspecific variation, this limitation can be overcome by assaying the whole chloroplast genome. Whole-genome sequencing is more expensive, but resources can be conserved by pooling samples. Unfortunately, haplotype associations among SNPs are lost in pooled samples, and treating SNP frequencies as independent estimates of variation yields biased estimates of genetic distance. I have developed an application, CallHap, that uses a least-squares algorithm to evaluate the fit between observed and predicted SNP frequencies from pooled samples based on network topology, thus enabling pooled chloroplast sequencing in large-scale studies of chloroplast genomic variation. This method was tested using artificially constructed test networks and pools, and pooled samples of Lasthenia californica (California goldfields) from Whetstone Prairie, in southern Oregon, USA. In test networks, CallHap reliably recovered network topologies and haplotype frequencies. Overall, the CallHap pipeline allows for the efficient use of resources for estimation of genetic distance in studies using non-recombining, whole-genome haplotypes, such as studies of intraspecific variation in chloroplast, mitochondrial, bacterial, or viral DNA.
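The least-squares idea in the abstract can be illustrated with a toy example. The sketch below is not CallHap's code or interface: every function name is hypothetical, the set of candidate haplotypes is assumed to be known in advance (the abstract describes a search informed by network topology), and a coarse grid search stands in for a proper constrained solver. It shows only how haplotype frequencies predict pooled SNP frequencies, and how a residual sum of squares scores that prediction against the observed values.

```python
from itertools import product


def predicted_snp_freqs(haplotypes, hap_freqs):
    """Predict pooled SNP frequencies from haplotype frequencies.

    haplotypes: list of tuples of 0/1 alleles, one entry per SNP.
    hap_freqs:  frequency of each haplotype in the pool (sums to 1).
    """
    n_snps = len(haplotypes[0])
    return [
        sum(freq * hap[i] for hap, freq in zip(haplotypes, hap_freqs))
        for i in range(n_snps)
    ]


def residual_sum_of_squares(observed, predicted):
    """Least-squares criterion: sum of squared observed-minus-predicted gaps."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def grid_search_fit(haplotypes, observed, steps=10):
    """Try every frequency vector on a grid of spacing 1/steps; keep the best.

    Assumption for illustration only: an exhaustive grid is feasible here
    because the toy example has just three haplotypes.
    """
    n_haps = len(haplotypes)
    best_freqs, best_rss = None, float("inf")
    for counts in product(range(steps + 1), repeat=n_haps - 1):
        last = steps - sum(counts)
        if last < 0:
            continue  # frequencies would sum to more than 1
        freqs = [c / steps for c in counts] + [last / steps]
        rss = residual_sum_of_squares(
            observed, predicted_snp_freqs(haplotypes, freqs)
        )
        if rss < best_rss:
            best_freqs, best_rss = freqs, rss
    return best_freqs, best_rss


# Hypothetical three-haplotype chain: each haplotype adds one derived allele.
haplotypes = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
# Hypothetical pooled SNP frequencies (made-up values, not thesis data).
observed = [0.5, 0.2, 0.0]

freqs, rss = grid_search_fit(haplotypes, observed)
print("estimated haplotype frequencies:", freqs)
print("residual sum of squares:", rss)
```

In this toy case the three SNP frequencies determine the three haplotype frequencies uniquely. With more haplotypes than informative SNPs the fit would be underdetermined, which is one reason additional structure, such as the network topology mentioned in the abstract, is useful.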
Identifier | oai:union.ndltd.org:pdx.edu/oai:pdxscholar.library.pdx.edu:open_access_etds-5016 |
Date | 02 June 2017 |
Creators | Kohrn, Brendan F. |
Publisher | PDXScholar |
Source Sets | Portland State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Dissertations and Theses |