
A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae

Streptococcus pneumoniae (pneumococcus) is one of the leading causes of mortality in Africa. It asymptomatically colonizes the human nasopharynx. Invasive pneumococcal disease occurs when isolates spread to normally sterile sites such as the lungs, blood, and central nervous system. Colonization, however, does not necessarily lead to infection: some isolates remain in the upper respiratory tract without causing any symptoms. This thesis hypothesized that invasive and non-invasive isolates differ genetically. We tested this hypothesis with a pan-genome approach, using whole-genome sequencing short reads of 1477 samples from Malawi: 825 obtained from the nasopharynx of carriers and 652 obtained from the blood and cerebrospinal fluid of patients. In-silico serotyping identified 56 serotypes in the cohort, and statistical analysis showed that, despite vaccination, the prevalence of serotypes 1 and 12F increased amongst patients. Genomes were assembled, and a reference pan-genome for all strains was built. Short reads were aligned to the core genome, and core variants were called. The population structure was determined from the distribution of variants in the pan-genome. Finally, genes significantly more often present in the invasive isolates were identified. Functional enrichment analysis of potential virulence genes was carried out to address how specific genes may contribute to pathogenesis.
The findings highlighted the features of the pneumococcus pan-genome in Malawi. The core and accessory genomes were characterized based on the functional analysis of genes. The core components included ribosomal subunits; subunits of F-type ATP synthase; and enzymes that catalyze the attachment of amino acids to tRNA molecules, DNA replication, DNA repair, and homologous recombination. Of the core and soft-core genes, 10.13% were uncharacterized.
In the accessory genome, the study detected genes from Regions of Diversity (RDs), including subunits of V-type ATPases and a sodium/solute symporter from RD8a; enzymes from RD3 that catalyze capsule synthesis; subunits of the PsrP secY2A2 pathogenicity island from RD10; genes from RD6 and RD7 involved in transposing mobile genetic elements; genes from RD2, RD8b, and RD12 participating in communication and competition; and genes from RD4 that assemble pilins into pili and anchor pili to the cell wall. Of the accessory genes, 53.58% were uncharacterized.
Most serotypes showed a similar prevalence in the carriage and disease groups. However, serotypes 1, 5, and 12F were significantly more abundant among patients than in the carriage group, suggesting that they are highly invasive and have a short colonization period. These serotypes were genetically distinct from the others, differing in both the absence and the presence of several genes. The most pronounced difference between serotypes 1, 5, and 12F and the serotypes significantly prevalent in the nasopharynx was the lack of genes from a genomic island known as RD8a. Genes in RD8a are involved in binding to epithelial cells and in aerobic respiration, which synthesizes ATP through oxidative phosphorylation. The absence of RD8a from serotypes 1, 5, and 12F may be associated with their short duration in the nasopharynx, where they need to bind to epithelial cells and access the free oxygen required for aerobic respiration. Given this, ATP levels are likely to decline in serotypes 1, 5, and 12F, causing them to harbour more phosphotransferase systems for carbohydrate transport, since these transporters use phosphoenolpyruvate rather than ATP as the energy source.
In conclusion, serotypes 1, 5, and 12F, the most prevalent and invasive pneumococcal strains in Malawi, were considerably distinct genetically from other strains, a distinction that may be associated with their short colonization period and their rapid spread to the blood and cerebrospinal fluid.
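The abstract does not state which statistical test was used to flag genes that are significantly more often present in invasive isolates. As a minimal sketch only, the following assumes a one-sided Fisher's exact test on a 2x2 table of gene presence against isolate source, which is one common choice for this kind of comparison. The function name and the gene counts in the example are hypothetical; only the group sizes (652 disease and 825 carriage samples) come from the abstract. A real analysis would also need to correct for population structure and multiple testing.

```python
from math import comb


def fisher_enrichment_p(invasive_with, invasive_without,
                        carriage_with, carriage_without):
    """One-sided Fisher's exact test p-value (hypothetical helper).

    Returns the probability of seeing at least `invasive_with` gene-positive
    invasive isolates if gene presence were independent of isolate source,
    with all row and column totals held fixed.
    """
    n_invasive = invasive_with + invasive_without
    n_with = invasive_with + carriage_with
    n_total = n_invasive + carriage_with + carriage_without

    # Upper tail of the hypergeometric distribution.
    upper = min(n_invasive, n_with)
    tail = sum(
        comb(n_with, k) * comb(n_total - n_with, n_invasive - k)
        for k in range(invasive_with, upper + 1)
    )
    # Exact integer arithmetic until the final division.
    return tail / comb(n_total, n_invasive)


# Illustrative counts only: a hypothetical gene found in 300 of the 652
# disease isolates and 200 of the 825 carriage isolates.
p_value = fisher_enrichment_p(300, 652 - 300, 200, 825 - 200)
print(f"one-sided p-value: {p_value:.3g}")
```

Applied across every accessory gene, such a test yields one p-value per gene, so a multiple-testing correction (for example Bonferroni or Benjamini-Hochberg) would be needed before calling any gene significant.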

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:uct/oai:localhost:11427/38498
Date: 11 September 2023
Creators: Iranzadeh, Arash
Contributors: Mulder, Nicola
Publisher: Faculty of Health Sciences, Computational Biology Division
Source Sets: South African National ETD Portal
Language: English
Detected Language: English
Type: Doctoral Thesis, Doctoral, PhD
Format: application/pdf
