
Computational Tools for Improved Detection, Identification, and Classification of Plant Pathogens Using Genomics and Metagenomics

Plant pathogens are one of the biggest threats to plant health and food security worldwide. To effectively contain plant disease outbreaks, classification and precise identification of pathogens are crucial for determining treatment and preventive measures. Conventional methods of detection such as PCR may not be sufficient when the pathogen in question is unknown. Advances in sequencing technology have made it possible to sequence entire genomes and metagenomes in real time and at a relatively low cost, opening an opportunity for the development of alternative methods for detection of novel and unknown plant pathogens. Within this dissertation, an integrated approach is used to reclassify a high-impact group of plant pathogens. Additionally, the application of metagenomics and nanopore sequencing using the Oxford Nanopore Technologies (ONT) MinION for the detection and precise identification of fungal and bacterial plant pathogens is demonstrated.
To improve the classification of the strains belonging to the Ralstonia solanacearum species complex (RSSC), we performed a meta-analysis using a comparative genomics and a reverse ecology approach to accurately portray and refine the understanding of the diversity and evolution of the RSSC. The groups identified by these approaches were circumscribed and made publicly available through the LINbase web server so future isolates can be properly classified.
To develop a culture-free method for detecting plant pathogens, we used metagenomes of various plants and long-read nanopore sequencing to precisely identify plant pathogens to the strain level and performed phylogenetic analysis with SNP resolution. In the first paper, we used tomato plants to demonstrate the power of this approach to detect bacterial plant pathogens. We compared bioinformatics tools for detection at the strain level using reads and assemblies. In the second paper, we used a read-based approach to test the feasibility of the methodology to precisely detect the fungal pathogen causing boxwood blight. Lastly, with the improvement in nanopore sequencing, we used grapevine petioles to investigate whether we could go beyond detection and identification and perform a phylogenetic analysis. We assembled a metagenome-assembled genome (MAG) of almost the same quality as the genomes obtained from cultured isolates and performed a phylogenetic analysis with SNP resolution.
Finally, for cases where the database may contain no genome related to the pathogen in question, we used machine learning and metagenomics to develop a reference-free approach to the detection of plant diseases. We trained eight different machine learning models with reads from healthy and infected plant metagenomes and compared their accuracy in classifying reads as belonging to a healthy or an infected plant. In the comparison, random forest was the best model in terms of the computational resources needed while maintaining high accuracy (> 0.90).
Doctor of Philosophy
Microbes are present in every environment on the planet and have been on Earth for billions of years. While some microbes are beneficial, others can cause diseases. To differentiate the ones that cause diseases from those that do not, examining the evolutionary forces that make them different is crucial for classifying and identifying them correctly. Although microorganisms cause diseases in humans and animals, the ones causing diseases in plants are one of the biggest threats to plant health and food security worldwide.
In a perfect world, plant diseases would be diagnosed by eye or with simple procedures. However, when a plant disease is present, it is not always obvious which organism, if any, is causing it, which makes it hard to detect and contain outbreaks promptly. With technological advances, it is now possible to obtain all the genetic information of not only one organism but all the organisms living in an environment at a time. This genetic information can then be used to precisely identify which organism is causing a disease in a plant, allowing faster disease diagnosis and, consequently, more efficient disease prevention and control.
In this dissertation, we used the bacterial group called the Ralstonia solanacearum species complex, which can cause different diseases in more than 200 crops, to investigate and understand the evolution and diversity of its members. We also used newly developed technologies to obtain the genetic material of all the organisms living in multiple important plants, including tomato, grapevine, and the ornamental bush boxwood. Using this genetic material, we developed a methodology for the detection of bacteria and a fungus causing plant diseases.
While this works well when the suspected organism or a similar one is available for comparison, the detection of plant diseases in cases where this information is not available is challenging. Machine learning models, where computers can learn complex patterns from data, have the potential to detect pathogens without the need to compare the sequences to sequences of other pathogens. Here we also used the genetic material to train and compare different machine learning models to classify plants as either being infected or healthy.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/113825
Date: 13 February 2023
Creators: Johnson, Marcela Aguilera
Contributors: Genetics, Bioinformatics, and Computational Biology, Vinatzer, Boris A., Li, Song, Brown, C. Titus, Pruden, Amy
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
