  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Capacity building for whole genome sequencing of Mycobacterium tuberculosis and bioinformatics in high TB burden countries.

Rivière, E., Heupink, T.H., Ismail, N., Dippenaar, A., Clarke, C., Abebe, G., Heusden van, P., Warren, R., Meehan, Conor J., Van Rie, A. 18 June 2021
Yes / Whole genome sequencing (WGS) is increasingly used for Mycobacterium tuberculosis (Mtb) research. Countries with the highest tuberculosis (TB) burden face important challenges to integrate WGS into surveillance and research. We assessed the global status of Mtb WGS and developed a 3-week training course coupled with long-term mentoring and WGS infrastructure building. Training focused on genome sequencing, bioinformatics and development of a locally relevant WGS research project. The aim of the long-term mentoring was to support trainees in project implementation and funding acquisition. The focus of WGS infrastructure building was on the DNA extraction process and bioinformatics. Compared to their TB burden, Asia and Africa are grossly underrepresented in Mtb WGS research. Challenges faced resulted in adaptations to the training, mentoring and infrastructure building. Out-of-date laptop hardware and operating systems were overcome by using online tools and a Galaxy WGS analysis pipeline. A case studies approach created a safe atmosphere for students to formulate and defend opinions. Because quality DNA extraction is paramount for WGS, a biosafety level 3 and general laboratory skill training session were added, use of commercial DNA extraction kits was introduced and a 2-week training in a highly equipped laboratory was combined with a 1-week training in the local setting. By developing and sharing the components of and experiences with a sequencing and bioinformatics training program, we hope to stimulate capacity building programs for Mtb WGS and empower high-burden countries to play an important role in WGS-based TB surveillance and research. 
/ Vlaamse Interuniversitaire Raad-secretariaat voor universitaire ontwikkelingssamenwerking (ET2018JOI008A10); the Research Foundation Flanders under FWO Odysseus (grant G0F8316N); the South African Research Chairs Initiative of the Department of Science and Technology and National Research Foundation of South Africa (64751); the South African Medical Research Council.
32

Characterization of Genomic Variants Associated with Resistance to Bedaquiline and Delamanid in Naive Mycobacterium tuberculosis Clinical Strains

Battaglia, S., Spitaleri, A., Cabibbe, A.M., Meehan, Conor J., Utpatel, C., Ismail, N., Tahseen, S., Skrahina, A., Alikhanova, N., Mostofa Kamal, S.M., Barbova, A., Niemann, S., Groenheit, R., Dean, A.S., Zignol, M., Rigouts, L., Cirillo, D.M. 18 June 2021
no / The role of mutations in genes associated with phenotypic resistance to bedaquiline (BDQ) and delamanid (DLM) in Mycobacterium tuberculosis complex (MTBc) strains is poorly characterized. A clear understanding of the genetic variants' role is crucial to guide the development of molecular-based drug susceptibility testing (DST). In this work, we analyzed all mutations in candidate genomic regions associated with BDQ- and DLM-resistant phenotypes using a whole-genome sequencing (WGS) data set from a collection of 4,795 MTBc clinical isolates from six countries with a high burden of tuberculosis (TB). From WGS analysis, we identified 61 and 163 unique mutations in genomic regions potentially involved in BDQ- and DLM-resistant phenotypes, respectively. Importantly, all strains were isolated from patients who likely have never been exposed to these medicines. To characterize the role of mutations, we calculated the free energy variation upon mutations in the available protein structures of Ddn (DLM), Fgd1 (DLM), and Rv0678 (BDQ) and performed MIC assays on a subset of MTBc strains carrying mutations to assess their phenotypic effect. The combination of structural and phenotypic data allowed for cataloguing the mutations clearly associated with resistance to BDQ (n = 4) and DLM (n = 35), only two of which were previously described, as well as about a hundred genetic variants without any correlation with resistance. Significantly, these results show that both BDQ and DLM resistance-related mutations are diverse and distributed across the entire region of each gene target, which is of critical importance for the development of comprehensive molecular diagnostic tools.
33

Genomic selection in farm animals: accuracy of prediction and applications with imputed whole-genome sequencing data in chicken

Ni, Guiyan 10 February 2016
Methods for genomic prediction based on genotype information from single nucleotide polymorphism (SNP) arrays with varying numbers of markers are now firmly established in many livestock breeding programs. With the increasing availability of whole-genome sequence data, which also contain causal mutations, more and more studies are being published in which genomic predictions are based on sequence data. The main aim of this thesis was to investigate to what extent SNP array data can be extended to sequence level using statistical methods (so-called "imputation") (Chapter 2) and whether genomic prediction can be improved with imputed sequence data and additional information on the genetic architecture of a trait (Chapter 3). To better understand the accuracy of genomic prediction and to derive a new method for approximating this accuracy, a simulation study was also conducted that examined the degree to which various existing approaches overestimate the accuracy of genomic prediction (Chapter 4). Technical progress over the last decade has made it possible to sequence millions of DNA fragments in a relatively short time. Several software programs for detecting sequence variants (so-called "variant calling"), based on different algorithms, have become established and have made it possible to detect SNPs in whole-genome sequence data. Often only a few individuals of a population are fully sequenced, and the genotypes of the other individuals, which were genotyped with a SNP array at a subset of these SNPs, are imputed.
In Chapter 2, the genome variants detected by three different variant calling programs (GATK, freebayes and SAMtools) were therefore compared using 50 fully sequenced white and brown layer individuals, and the quality of the genotypes was checked. On the chromosomes studied (3, 6 and 26), 1,741,573 SNPs were detected by all three variant callers, corresponding to 71.6% (81.6%, 88.0%) of the number of variants detected by GATK (SAMtools, freebayes). Genotype concordance, defined as the proportion of individuals whose array-based genotypes match the sequence-based genotypes at all SNPs also present on the array, was 0.98 with GATK, 0.98 with SAMtools and 0.97 with freebayes (values averaged over SNPs on the chromosomes studied). Furthermore, with GATK (SAMtools, freebayes), 90% (88%, 75%) of the variants showed high values (>0.9) for other quality measures (non-reference sensitivity, non-reference genotype concordance and precision). The performance of all variant calling programs examined was generally very good, especially that of GATK and SAMtools. This study also examined the quality of imputation from a high-density SNP array to sequence level in a data set of approximately 1000 individuals from 6 generations. Imputation quality was described using the correlations between imputed and true genotypes per SNP or per individual and the number of Mendelian conflicts in sire-offspring pairs. Three different imputation programs (Minimac, FImpute and IMPUTE2) were validated in different scenarios.
For all imputation programs, the correlation between true and imputed genotypes averaged more than 0.95 for 1000 randomly selected array SNPs whose genotypes were treated as unknown in the imputation process, and more than 0.85 in a leave-one-out cross-validation performed with the sequenced individuals. In terms of genotype correlation, Minimac and IMPUTE2 performed slightly better than FImpute. This was particularly true for SNPs with a low minor allele frequency. FImpute, however, showed the smallest number of Mendelian conflicts in the available sire-offspring pairs. The correlation between true and imputed genotypes remained high even when the individuals whose genotypes were imputed were several generations younger than the sequenced individuals. In summary, GATK performed best among the variant calling programs tested in this study, while Minimac proved to be the best of the imputation programs examined. Building on the results of Chapter 2, studies on genomic prediction with imputed sequence data were carried out in Chapter 3. Data from 892 individuals from 6 generations of a commercial brown layer line were available for this purpose. These animals were all genotyped with a high-density SNP array. Using data from 25 fully sequenced individuals, these animals were imputed from the array genotypes up to sequence level. Imputation was performed with Minimac3, which requires pre-phased data (generated in this study with Beagle4) as input. The accuracy of genomic prediction was measured as the correlation between de-regressed conventional breeding values and direct genomic breeding values for the traits eggshell strength, feed intake and laying rate.
In addition to comparing the accuracy of genomic prediction based on SNP array data and on sequence data, this study also investigated how the use of different genomic relationship matrices that account for genetic architecture affects prediction accuracy. Besides the baseline scenario with equally weighted SNPs, scenarios with weighting factors were examined, namely the -log10(P) values of a t-test based on a genome-wide association study and the squared estimated SNP effects from a random regression BLUP model, as well as the BLUP|GA method ("best linear unbiased prediction given genetic architecture"). The GBLUP scenario with equally weighted SNPs was tested with a relationship matrix built either from all available SNPs or only from those in genic regions, in each case starting from the full set of imputed sequence SNPs or from the array SNPs. Averaged over all traits studied, prediction accuracy was highest, at 0.366 ± 0.075, with SNPs from genic regions extracted from the imputed sequence data. The second-highest value was achieved by genomic prediction with SNPs from genic regions contained on the SNP array (0.361 ± 0.072). Neither the use of weighted genomic relationship matrices nor the application of BLUP|GA led to higher prediction accuracies than the standard GBLUP approach. This observation held regardless of whether SNP array data or imputed sequence data were used. The results of this study showed that little or no additional benefit can be gained from using imputed sequence data. An increase in prediction accuracy could, however, be achieved when the relationship matrix was built only from the SNPs in genic regions extracted from the sequence data.
In genomic selection programs, selection candidates are chosen on the basis of estimated genomic breeding values (GBVs). The accuracy of the GBV is a relevant parameter here because it describes the stability of the estimated breeding values and can show how the GBV may change as more information becomes available. It is also one of the decisive factors in expected genetic gain (as described by the so-called "breeder's equation"). This accuracy of genomic prediction is, however, difficult to quantify in real data because the true breeding values (TBVs) are not available. Earlier studies proposed several methods that make it possible to approximate the accuracy of GBVs from population and trait parameters (e.g. effective population size, reliability of the quasi-phenotypes used, number of independent chromosome segments). Furthermore, when mixed models are used, the accuracy can be derived from the prediction error variance. In practice, most of these approaches overestimated the accuracy of prediction. In Chapter 4, several methodological approaches from earlier work were therefore tested in simulated data with different parameters representing various animal breeding programs (a cattle and a pig breeding scheme in addition to a baseline scenario), and the extent of the overestimation was measured. In addition, a new and easily computed method for approximating the accuracy was presented in this chapter. The comparison of the methodological approaches in Chapter 4 showed that the accuracy of the GBV can be predicted better by the new approach.
The proposed approach still has one unknown parameter, for which an approximation is nevertheless possible when a suitable data set contains the results of breeding value estimations at two different time points. In summary, this new method improves the approximation of GBV accuracy in many cases.
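The two quality measures used in Chapter 2 of this abstract, genotype concordance between array and sequence calls and the correlation between true and imputed genotypes, can be illustrated with a minimal Python sketch. It is not the thesis code: the function names, the 0/1/2 genotype coding and the dictionary layout are assumptions made here for illustration only.

```python
import math

def genotype_concordance(array_calls, sequence_calls):
    """Mean over SNPs of the share of individuals whose two calls agree.

    Both arguments map a SNP id to a list of genotypes coded 0/1/2
    (assumed coding: copies of the alternate allele), one entry per
    individual in the same order. Missing calls are None and are skipped.
    """
    per_snp = []
    for snp, array_gt in array_calls.items():
        seq_gt = sequence_calls.get(snp)
        if seq_gt is None:
            continue  # SNP was not called from the sequence data
        pairs = [(a, s) for a, s in zip(array_gt, seq_gt)
                 if a is not None and s is not None]
        if pairs:
            per_snp.append(sum(a == s for a, s in pairs) / len(pairs))
    return sum(per_snp) / len(per_snp)

def genotype_correlation(true_gt, imputed_gt):
    """Pearson correlation between true and imputed genotype codes."""
    n = len(true_gt)
    mean_t = sum(true_gt) / n
    mean_i = sum(imputed_gt) / n
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_gt, imputed_gt))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_i = sum((i - mean_i) ** 2 for i in imputed_gt)
    if var_t == 0 or var_i == 0:
        return float("nan")  # correlation is undefined for a monomorphic SNP
    return cov / math.sqrt(var_t * var_i)

# Toy data: two SNPs, four individuals.
array = {"snp1": [0, 1, 2, 1], "snp2": [2, 2, 0, None]}
sequence = {"snp1": [0, 1, 2, 2], "snp2": [2, 2, 0, 1]}
concordance = genotype_concordance(array, sequence)
```

In this toy data one of four calls disagrees at `snp1` and all three comparable calls agree at `snp2`, so the averaged concordance falls between the two per-SNP values.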
34

Quantitative study of Clostridium difficile transmission using extensive epidemiological data and whole genome sequencing

Eyre, David William January 2013
Clostridium difficile is a leading healthcare-associated infection, which causes diarrhoea, and is almost exclusively precipitated by antibiotic exposure. Traditionally C. difficile infection (CDI) has been considered predominantly transmitted within hospitals. However, endemic spread hampers identification of the source of infections, and therefore control and prevention of disease. A cohort of consecutive hospital and community CDI cases in Oxfordshire from September 2007 to March 2011 was investigated. For each case, hospital admission, ward movement and demographic data were available, allowing contact events between cases to be reconstructed. Initially 944 cases to March 2010 underwent multilocus sequence typing (MLST), subdividing the endemic cases into 69 distinct lineages and demonstrating unexpectedly that ward-based contact with known symptomatic CDI cases accounts for only <25% of disease. To better determine the extent of transmission arising from symptomatic patients, irrespective of the route of transmission, isolates from 1223 cases to March 2011 underwent whole genome sequencing. Serially sampled patients with recurrent or on-going disease were used to estimate rates of C. difficile evolution and within-host diversity and to show that 0-2 single nucleotide variants (SNVs) are expected between transmitted isolates obtained <124 days apart (95% prediction interval). Mixed infection with more than one strain was investigated, but probably plays only a minor role in onward transmission. In the Oxfordshire CDI cohort, 333/957 (35%) CDI from April 2008 – March 2011 were within 2 SNVs of ≥1 previous case since September 2007 (consistent with transmission). 428/957 (45%) were >10 SNVs from all previous cases: these distinct subtypes continued to be identified consistently throughout the study, suggesting cases arise from a considerable reservoir of C. difficile.
Surprisingly, declines in the incidence of genetically-related CDI were similar to those in genetically distinct CDI, suggesting that interventions not just targeting symptomatic individuals, e.g. antimicrobial stewardship, have played a significant role in recent CDI declines. Finally, the feasibility of studying asymptomatic inpatients as a potential source of the unexplained transmission was investigated. This thesis provides convincing evidence, in a setting with typical CDI incidence and infection control practice, that only a minority of CDI arises from other symptomatic cases. It demonstrates that much CDI arises from genetically diverse reservoirs, with each exposure resulting in relatively few secondary cases. Future control strategies therefore need to focus on identifying these reservoirs, one of which is plausibly asymptomatic inpatients, and also on interventions that prevent the transition from exposure and colonisation to disease, such as antimicrobial stewardship.
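The two SNV cut-offs quoted in this abstract (within 2 SNVs of at least one previous case, and more than 10 SNVs from all previous cases) amount to a simple rule on the closest earlier isolate. The sketch below shows that rule in Python under stated assumptions: the function name and the "intermediate" label for distances of 3 to 10 SNVs are hypothetical, and the thesis may have applied further criteria (such as sampling dates) that are not modelled here.

```python
def classify_case(snv_distances_to_previous, linked_max=2, distinct_min=10):
    """Classify one case from its SNV distances to all earlier cases.

    linked_max and distinct_min default to the cut-offs quoted in the
    abstract; the "intermediate" category is a label invented here.
    """
    if not snv_distances_to_previous:
        return "no previous cases"
    closest = min(snv_distances_to_previous)
    if closest <= linked_max:
        return "consistent with transmission"
    if closest > distinct_min:
        return "genetically distinct"
    return "intermediate"

# Hypothetical distances from three new cases to all earlier cases.
examples = {
    "case_A": [15, 1, 30],   # one earlier isolate is 1 SNV away
    "case_B": [11, 40],      # every earlier isolate is >10 SNVs away
    "case_C": [5, 12],       # closest earlier isolate is 5 SNVs away
}
labels = {case: classify_case(d) for case, d in examples.items()}
```

Only the minimum distance matters for the classification, which is why a single close earlier isolate is enough to mark a case as consistent with transmission.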
35

Applications of whole genome sequencing to understanding the mechanisms, evolution and transmission of antibiotic resistance in Escherichia coli and Klebsiella pneumoniae

Stoesser, Nicole Elinor January 2014
Whole genome sequencing (WGS) has transformed molecular infectious diseases epidemiology in the last five years, and represents a high resolution means by which to catalogue the genetic content and variation in bacterial pathogens. This thesis utilises WGS to enhance our understanding of antimicrobial resistance in two clinically important members of the Enterobacteriaceae family of bacteria, namely Escherichia coli and Klebsiella pneumoniae. These organisms cause a range of clinical infections globally, and are increasing in incidence. The rapid emergence of multi-drug resistance in association with infections caused by them represents a major threat to the effective management of a range of clinical conditions. The reliability of sequencing and bioinformatic methods in the analysis of E. coli and K. pneumoniae sequence data is assessed in chapter 4, and provides a context for the subsequent study chapters (chapters 5-8), which investigate resistance genotype prediction, outbreak epidemiology in two different contexts, and the population structure of an important global drug-resistant E. coli lineage, ST131. In these, the advantages (and limitations) of short-read, high-throughput WGS in defining resistance gene content, associated mobile genetic elements and host bacterial strains, and the relationships between them, are discussed. The overarching conclusion is that the dynamic between all the components of the genetic hierarchy involved in the transmission of important antimicrobial resistance elements is extremely complicated, and encompasses almost every imaginable scenario. Complete/near-complete assessment of the genetic content of both chromosomal and episomal components will be a prerequisite to understanding the evolution and spread of antimicrobial resistance in these organisms.
36

Study of the dissemination of cefoxitin-resistant Salmonella enterica serovar Heidelberg from human, abattoir poultry and retail poultry sources

Edirmanasinghe, Romaine Cathy Shalini 15 September 2016
This study characterized Salmonella enterica serovar Heidelberg from human, abattoir poultry and retail poultry isolates to examine the molecular relationships of cefoxitin resistance between these groups. A total of 147 S. Heidelberg (70 cefoxitin-resistant and 77 cefoxitin-susceptible) isolates were studied. All cefoxitin-resistant isolates were also resistant to amoxicillin-clavulanic acid, ampicillin, ceftiofur and ceftriaxone, and all contained the CMY-2 gene. Pulsed-field gel electrophoresis typing illustrated that 93.9% of isolates clustered together with ≥ 90% similarity. Core genome analysis using whole genome sequencing (WGS) identified 12 clusters of isolates with zero to four single nucleotide variations (SNVs). These clusters consisted of cefoxitin-resistant and susceptible human, abattoir poultry and retail poultry isolates. Analysis of CMY-2 plasmids from cefoxitin-resistant isolates revealed all belonged to incompatibility group I1. Analysis of plasmid sequences using WGS revealed high identity (95-99%) to a previously described plasmid (pCVM29188_101) found in Salmonella Kentucky. When compared to pCVM29188_101, all sequenced cefoxitin-resistant isolates were found to carry one of ten possible variant plasmids. The discovery of several clusters of isolates from different sources with zero to four SNVs suggests that transmission between human, abattoir poultry and retail poultry sources may be occurring. The classification of newly sequenced plasmids into one of ten sequence variant types suggests transmission of a common CMY-2 plasmid amongst S. Heidelberg with variable genetic backgrounds. / October 2016
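Grouping isolates that differ by zero to four SNVs can be sketched as single-linkage clustering over pairwise core-genome distances. The abstract does not state which clustering procedure was used, so single linkage, the function name `snv_clusters` and the isolate labels below are all assumptions for illustration.

```python
from itertools import combinations

def snv_clusters(isolates, distances, max_snvs=4):
    """Single-linkage clusters of isolates linked by <= max_snvs SNVs.

    distances maps an unordered pair of isolate names, stored as a tuple
    in either order, to their pairwise SNV count. Singletons are dropped.
    """
    parent = {name: name for name in isolates}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(isolates, 2):
        d = distances.get((a, b), distances.get((b, a)))
        if d is not None and d <= max_snvs:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Hypothetical isolates from the three source types plus one outlier.
isolates = ["human1", "abattoir1", "retail1", "human2"]
distances = {
    ("human1", "abattoir1"): 2,
    ("abattoir1", "retail1"): 4,
    ("human1", "retail1"): 6,
    ("human1", "human2"): 40,
    ("abattoir1", "human2"): 38,
    ("retail1", "human2"): 41,
}
clusters = snv_clusters(isolates, distances)
```

Under single linkage, `human1` and `retail1` end up in the same cluster through `abattoir1` even though they are 6 SNVs apart; a stricter rule (for example, complete linkage) would split them, which is one reason the choice of method should be stated alongside any cluster count.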
37

Investigation of in-hospital norovirus transmission using whole genome sequencing

Wong, Tse Hua Nicholas January 2014
Norovirus is the commonest cause of viral gastroenteritis, affecting all age groups worldwide. Outbreaks frequently occur in semi-closed communities such as schools, cruise ships, prisons and hospitals. Within the healthcare environment, the economic and logistical burdens and the inconvenience caused by norovirus are significant, since ward closure remains central to infection control. The aim of this study was to investigate norovirus transmission dynamics during hospital outbreaks. The ultimate goal was to provide information that could, in future, lead to the development of novel, less disruptive approaches to curtailing the spread of infection. The study explored the application of 'next generation' high throughput DNA sequencing technologies to the determination of large numbers of norovirus genomes. Whole genome sequences provide the highest possible level of discrimination among viruses, information which is essential to the identification of linked and independent cases of infection. The approach exploits the high norovirus mutation rate, which is typical of RNA viruses. Consequently, viruses within a single ward which differ by more than a few SNVs can be considered to represent independent introductions, rather than a single outbreak. Whole genome sequence data (determined for noroviruses collected between 2009 and 2013) were combined with epidemiological data, providing further insights into transmission dynamics. These data identified multiple independent virus introductions during single ward outbreaks. The possible origins of such outbreaks in Oxfordshire hospitals were investigated using viruses originating in the local community, and in other healthcare environments distributed throughout the UK. Whole genome sequences of noroviruses from consecutive years were genetically divergent, confirming the rapid evolution of the virus over time and excluding the possibility of prolonged environmental contamination as a reservoir of infection.
Such detailed information on norovirus transmission within the healthcare environment could inform alternative future approaches to optimising infection control within the healthcare setting.
38

Diversité du microbiote digestif humain par culturomics et pyroséquençage / Diversity of the human gut microbiota by culturomics and pyrosequencing

Lagier, Jean-Christophe 15 May 2013
Metagenomic studies have already suggested relationships between the gut microbiota and human health. Microbial culturomics combines a large number of culture conditions with rapid identification by MALDI-TOF, or by 16S rRNA amplification and sequencing for unidentified colonies. The seminal study identified 340 different bacteria, including 31 new bacterial species and 174 bacterial species described for the first time from the human gut. Genome sequencing of each new species made it possible to describe the largest genome of a bacterium isolated from humans (Microvirga massiliensis; 9.3 Mb) and to generate approximately 10,000 previously unknown genes (ORFans), which will facilitate future metagenomic studies. In parallel, pyrosequencing of the same 3 samples revealed strikingly low overlap between the 2 methods, with only 51 species detected by both. In addition, culturomics proved superior to pyrosequencing in the study of a stool sample from a patient treated for XDR tuberculosis. Conversely, pyrosequencing of 2 stool samples from patients treated with antibiotics assigned 45 to 80% of sequences to the phylum Verrucomicrobia, whereas culturomics on the same samples did not succeed in culturing this organism. To date, through the culturomics study of 14 different stool samples, we have cultured 520 different bacterial species, including 57 new bacterial species and 260 species described for the first time from the human gut. After this comprehensive description phase, subsequent studies will attempt to link the new bacterial species to the clinical status of the patients studied.
39

Análise multigênica de rotavírus do grupo A em suínos / Multigenic analysis of porcine group A rotavirus

Silva, Fernanda Dornelas Florentino 15 March 2016
Group A rotaviruses (RVA) are leading causes of viral diarrhea in children and in the young of many animal species, with impacts on public and animal health. To contribute to the understanding and prevention of rotavirus infections and their possible zoonotic relationships, we characterized the 11 rotavirus dsRNA segments encoding the structural and nonstructural proteins in positive fecal samples collected from pigs in 2012-2013 in 2 Brazilian states. Using RT-PCR, nucleotide sequencing, and phylogenetic analyses, all gene segments from 12 RVA samples detected in pigs were analyzed and compared with those of previously described samples. The sequences obtained for the NSP2, NSP3, and VP6 coding genes covered the complete open reading frame (ORF), while a partial ORF was determined for the VP1, VP2, VP3, VP4, VP7, NSP1, NSP4, NSP5 and NSP6 coding genes. The porcine rotavirus genotypes from the sampled regions agree with those most frequently reported in this species, showing a porcine genetic backbone with most segments belonging to genotype constellation 1, except for the VP6 and NSP1 coding genes, which were genotypes I5 and A8, respectively.
Although genotype 1 (Wa-like) predominated among the sequences in this study, the genomic analysis suggests intragenotypic variation in the group A rotavirus genomes currently circulating in the sampled swine populations of the states of São Paulo and Mato Grosso. In addition, we sought to identify amino acids related to host adaptation of rotavirus and genetic signatures that distinguish porcine from human RVA. To this end, the sequences obtained in this study were compared with other genotype 1 (Wa-like) RVA strains detected in these two species and available in GenBank. We found more than 75 amino acid sites that differentiate porcine from human RVA, as well as substitution sites in some viral proteins that frequently covaried with one another. These results provide a greater understanding of the viral diversity circulating in swine production units and of animals as genetic reservoirs of rotavirus strains emerging in humans.
40

Expanding the horizons of next generation sequencing with RUFUS

Farrell, Andrew R. January 2014
Thesis advisor: Gabor T. Marth / To help improve the analysis of forward genetic screens, we have developed an efficient and automated pipeline for mutational profiling using our reference guided tools including MOSAIK and FREEBAYES. Studies using next generation sequencing technologies currently employ either reference guided alignment or de novo assembly to analyze the massive amount of short read data produced by second generation sequencing technologies; the far more common approach is reference guided alignment, because of the massive computational and sequencing costs associated with de novo assembly. The success of reference guided alignment depends on three factors: the accuracy of the reference, the ability of the mapper to correctly place a read, and the degree to which a variant allele differs from the reference. Reference assemblies are not perfect and none are entirely complete. Moreover, read mappers can only map reads in genomic locations that are unique enough to confidently place reads; paralogous sections, such as related gene families, cannot be characterized and are often ignored. Further, variant alleles that drastically alter the subject's DNA, such as insertions or deletions (INDELs), will not map to the reference and are either entirely missed or require further downstream analysis to characterize. Most importantly, reference guided methods are restricted to organisms for which such reference genomes have been assembled. The current alternative, de novo assembly of a genome, is prohibitively expensive for most labs, requiring deep read coverage from numerous different library preparations as well as massive computing power. To address the shortcomings of current methods, while eliminating the costs intrinsic to de novo sequence assembly, we developed RUFUS, a novel, completely reference-independent variant discovery tool.
RUFUS directly compares raw sequence data from two or more samples and identifies groups of reads unique to one or the other sample. RUFUS has at least the same variant detection sensitivity as mapping methods, with greatly increased specificity for SNPs and INDEL variation events. RUFUS is also capable of extremely sensitive copy number detection, without any restriction on event length. By modeling the underlying k-mer distribution, RUFUS produces a specific copy number spectrum for each individual sample. Applying a Bayesian detection method to detect changes in k-mer content between two samples, RUFUS produces copy number calls that are as sensitive as traditional copy number detection methods with far fewer false positives. Our data suggest that RUFUS' reference-free approach to variant discovery is able to substantially improve upon existing variant detection methods: reducing reference biases, reducing false positive variants, and detecting copy number variants with excellent sensitivity and specificity. / Thesis (PhD) – Boston College, 2014. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology.
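The general idea of comparing raw reads from two samples without a reference can be sketched with k-mer sets: count k-mers in each sample, keep those seen in one sample but absent from the other, and pull out the reads that carry them. This is only an illustration of that idea, not RUFUS's implementation; the function names, the tiny `k`, and the `min_count` filter are assumptions, and the sketch ignores reverse complements, sequencing-error modelling and the Bayesian copy number step the abstract describes.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def unique_kmers(subject_reads, control_reads, k=5, min_count=2):
    """k-mers seen at least min_count times in the subject and never in the control."""
    subject = kmer_counts(subject_reads, k)
    control = kmer_counts(control_reads, k)
    return {kmer for kmer, n in subject.items()
            if n >= min_count and kmer not in control}

def reads_with_unique_kmers(subject_reads, unique, k=5):
    """Subject reads that contain at least one sample-specific k-mer."""
    return [read for read in subject_reads
            if any(read[i:i + k] in unique for i in range(len(read) - k + 1))]

# Toy reads: the subject differs from the control by a single base (A -> T).
control_reads = ["ACGTACGTAC", "ACGTACGTAC"]
subject_reads = ["ACGTTCGTAC", "ACGTTCGTAC"]
unique = unique_kmers(subject_reads, control_reads, k=5)
candidate_reads = reads_with_unique_kmers(subject_reads, unique, k=5)
```

Requiring a minimum count in the subject is a crude stand-in for error filtering: a k-mer produced by a one-off sequencing error would usually appear only once, whereas a real variant is covered by several reads.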
