Foodborne diseases are a growing health problem today and can be caused by eating food contaminated with bacteria. To monitor known foodborne diseases, institutions keep track of bacteria in surveillance projects. Whole-genome sequencing, which generates large amounts of data, is becoming the new standard method for comparing isolates. Today, the standard analyses focus on conserved regions of genomes. The dynamics of less conserved regions can be studied by creating pan-genomes. A pan-genome consists of conserved genes, called core genes, and genes with varying degrees of conservation, called accessory genes. This thesis aimed to analyse pan-genomes of large datasets from six bacterial species coming from surveillance projects: Campylobacter coli, Campylobacter jejuni, Escherichia coli, Listeria monocytogenes, Salmonella enterica, and Streptococcus pneumoniae. The purpose was to investigate the species dynamics in the genomes and to look at properties of the genomes not covered by the standard analyses used in surveillance projects today. The Bacterial Pan Genome Analysis tool was used for the pan-genome analysis of the six species, with datasets of 1,000-2,000 genomes analysed per species. All species were estimated to have open pan-genomes, meaning that the pan-genome continues to grow as more genomes are added. Escherichia coli and Salmonella enterica had more dynamic and open genomes than the other species. They had the highest number of accessory genes relative to their genome sizes and the largest accessory segments between core genes. In all species, synteny was highly conserved for a subset of the core genes. Some core genes were always directly adjacent in the analysed genomes, with no accessory genes between them. Other core genes always had accessory genes between them, indicating very open regions in the genomes.
The core genes were evenly distributed across the reference genomes, although in all species some regions showed increased gene density. Some regions had a higher density of core genes that were often followed by other core genes, and others had a higher density of core genes that were often followed by accessory genes. However, the placement of genes needs to be investigated further with more reference genomes before confident conclusions can be drawn.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-445619 |
Date | January 2021 |
Creators | Johansson, Jennifer |
Publisher | Uppsala universitet, Institutionen för medicinsk biokemi och mikrobiologi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | UPTEC X ; 21018 |