Genomics has become an important tool in agriculture. Many modern crop breeding approaches, such as genomic selection and genome editing, require detailed information on the genomic composition of a crop species. However, the assembly of high-quality genome sequences is prone to technical artifacts that arise from inaccuracies in the sequencing technology and assembly algorithms. This is particularly true for the genomes of cereal crops, which are often very large, repeat-rich, and polyploid. Until recently, the highly continuous assembly of such cereal crop genomes from short-read data was mainly possible with proprietary assembly tools. In this work, we combined data generated with several short-read sequencing protocols and genomics technologies, including paired-end and mate-pair reads with multiple insert sizes, 10X linked reads, Hi-C contacts, and optical maps, to assemble a chromosome-level reference genome of Digitaria exilis (fonio millet) with open-source tools. Fonio millet is a semi-domesticated cereal orphan crop native to West Africa that has a high potential for desert agriculture. We implemented the TRITEX pipeline, a recently developed open-source pipeline for the assembly of large Triticeae genomes. We modified the pipeline to incorporate 10X and Hi-C reads into the assembly process independently. We then compared the TRITEX assembly to the fonio reference genome, which had previously been assembled from the same input data but using proprietary algorithms. We found that the two assemblies were highly similar in content, with high concordance in local order (Pearson correlation coefficient of 0.91 for alignments). However, we detected many small putative discrepancies between the two assemblies. While the TRITEX pipeline was able to produce a highly continuous genome assembly, further work is needed to characterize the putative discrepancies in more detail.
Identifier | oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/660202 |
Date | 11 1900 |
Creators | Gapa, Liubov |
Contributors | Krattinger, Simon G., Biological and Environmental Sciences and Engineering (BESE) Division, Aranda, Manuel, Zuccolo, Andrea |
Source Sets | King Abdullah University of Science and Technology |
Language | English |
Detected Language | English |
Type | Thesis |
Rights | 2020-11-24, At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2020-11-24. |