1 |
Analysis of genomic regions bound and regulated by Ataxin-3. Svoreň, Martin. January 2017
Charles University, Faculty of Pharmacy in Hradec Králové, Department of Pharmacology and Toxicology. Student: Martin Svoreň. Supervisor: PharmDr. Martina Čečková, Ph.D. Specialized supervisor: PD Dr. Bernd Evert. Title of diploma thesis: Analysis of genomic regions bound and regulated by Ataxin-3. Spinocerebellar ataxia type 3 (SCA3), also known as Machado-Joseph disease, is a dominantly inherited neurodegenerative disease. In SCA3, the disease protein ataxin-3 (ATXN3) contains an abnormally long polyglutamine (polyQ) tract encoded by a CAG repeat expansion. ATXN3 binds DNA and interacts with transcriptional regulators, pointing toward a direct role of ATXN3 in transcription. It is conceivable that mutant ATXN3 triggers multiple, interconnected pathogenic cascades leading to neurotoxicity; however, the principal molecular pathomechanism remains elusive. Here, PCR analyses of 16 ATXN3-bound genomic regions, recently identified by next-generation sequencing of immunoprecipitated ATXN3-bound chromatin fragments, confirmed enriched binding of ATXN3 to 5 genomic regions next to genes encoding CCAAT/enhancer binding protein delta (CEBPD), period circadian clock-2 (PER2), phosphatase and tensin homolog (PTEN), serine protease inhibitor family F2 (SERPINF2) and thrombospondin-1 (THBS1). To investigate putative...
|
2 |
Populačně-genomická analýza generalistického parazita - tasemnice Ligula intestinalis / Population-genomic analysis of a generalist parasite, the tapeworm Ligula intestinalis. KOČOVÁ, Pavlína. January 2018
New insights into the population structure of the tapeworm Ligula intestinalis were obtained from genomic data generated with genomic methods (ddRAD and high-throughput sequencing).
|
3 |
Quantitative genomic analysis of agroclimatic traits in sorghum. Olatoye, Olalere Marcus. January 1900
Doctor of Philosophy / Department of Agronomy / Geoffrey Morris / Climate change is expected to affect agriculture, with the most profound effects in regions where low-input agriculture is practiced. Understanding how plants evolved to adapt to diverse climatic conditions in the presence of local stressors (biotic and abiotic) can benefit crop adaptation and yield and help ensure food security. Great genetic diversity exists for agroclimatic adaptation in sorghum (Sorghum bicolor L. Moench), but much of it has not been characterized, which limits its use in crop improvement. The application of next-generation sequencing has opened the plant genome for analysis to identify patterns of genome-wide nucleotide variation underlying agroclimatic adaptation.
To understand the genetic basis of adaptive traits in sorghum, the first study characterized the genetic architecture of sorghum inflorescence traits. Phenotypic data were obtained from multi-environment experiments and used to perform joint linkage and genome-wide association mapping. Mapping identified both previously mapped and novel genetic loci underlying inflorescence morphology in sorghum. Inflorescence traits were found to be under the control of a few large-effect loci and many moderate- and minor-effect loci. To demonstrate how understanding the genetic basis of adaptive traits can facilitate genomics-enabled breeding, genomic prediction analysis was performed, and it showed high prediction accuracies for inflorescence traits.
In the second study, the sorghum nested association mapping (NAM) population was used to characterize the genetic architecture of leaf erectness, leaf width, and stem diameter. About 2,200 recombinant inbred lines were phenotyped in multiple environments. The resulting phenotypic data were used to perform joint linkage mapping with ~93,000 markers. The proportion of phenotypic variation explained by QTL and their allele frequencies were estimated. Common, moderate-effect QTL were found to underlie marker-trait associations. Furthermore, the identified QTL co-localized with genes involved in both vegetative and inflorescence development. Our results provide insights into the genetic basis of leaf erectness and stem diameter in sorghum. The identified QTL will also facilitate the development of genomics-enabled breeding tools for crop improvement and the molecular characterization of the underlying genes.
Finally, in a third study, 607 Nigerian accessions were genotyped and the resulting genomic data [about 190,000 single nucleotide polymorphisms (SNPs)] were used for downstream analysis. Genome-wide scans for selection and genome-wide association studies (GWAS) were performed, alongside estimates of genetic differentiation and genetic diversity. The results showed that phenotypic variation in the diverse germplasm has been shaped by local adaptation across a climatic gradient and can provide plant genetic resources for crop improvement.
|
4 |
Digital Phenotyping and Genomic Prediction Using Machine and Deep Learning in Animals and Plants. Bi, Ye. 03 October 2024
This dissertation investigates the utility of deep learning and machine learning approaches for livestock management and quantitative genetic modeling of rice grain size under climate change. Monitoring the live body weight of animals is crucial to support farm management decisions due to its direct relationship with animal growth, nutritional status, and health. However, conventional manual weighing methods are time consuming and can cause potential stress to animals. While there is a growing trend towards the use of three-dimensional cameras coupled with computer vision techniques to predict animal body weight, their validation with deep learning models as well as large-scale data collected in commercial environments is still limited. Therefore, the first two research chapters show how deep learning-based computer vision systems can enable accurate live body weight prediction for dairy cattle and pigs. These studies also address the challenges of managing large, complex phenotypic data and highlight the potential of deep learning models to automate data processing and improve prediction accuracy in an industry-scale commercial setting. The dissertation then shifts the focus to crop resilience, particularly in rice, where the asymmetric increase in average nighttime temperatures relative to the increase in average daytime temperatures due to climate change is reducing grain yield and quality in rice. Through the use of deep learning and machine learning models, the last two chapters explore how metabolic data can be used in quantitative genetic modeling in rice under environmental stress conditions such as high night temperatures. These studies showed that the integration of metabolites and genomics provided an improvement in the prediction of rice grain size-related traits, and certain metabolites were identified as potential candidates for improving multi-trait genomic prediction. 
Further research showed that metabolic accumulation was low to moderately heritable, and genomic prediction accuracies were consistent with expected genomic heritability estimates. Genomic correlations between control and high night temperature conditions indicated genotype-by-environment interactions in metabolic accumulation and the effectiveness of genomic prediction models for metabolic accumulation varied across metabolites. Joint analysis of multiple metabolites improved the accuracy of genomic prediction by exploiting correlations between metabolite accumulation. Overall, this dissertation highlights the potential of integrating digital technologies and multi-omic data to advance data analytics in agriculture, with applications in livestock management and quantitative genetic modeling of rice. / Doctor of Philosophy / This dissertation explores the application of deep learning and machine learning to computer vision-based livestock management and quantitative genetic modeling of rice grain size under climate change. The first half of the research chapters illustrate how computer vision systems can enable digital phenotyping of dairy cows and pigs, which is critical for informed management decisions and quantitative genetic analysis. These studies address the challenges of managing large-scale, complex phenotypic data and highlight the potential of deep learning models to automate data processing and improve prediction accuracy. Chapter 3 showed that a deep learning-based segmentation, Mask R-CNN, improved the prediction performance of cow body weight from longitudinal depth video data. Among the image features, volume followed by width correlated best with body weight. Chapter 4 showed that efficient deep learning-based supervised learning models are a promising approach for predicting pig body weight from industry-scale depth video data. 
Although the sparse design, which simulates budget and time constraints by using a subset of the data for training, resulted in some performance loss compared to the full design, the Vision Transformer models effectively mitigated this loss. The second half of the research chapters focuses on integrating metabolomic and genomic data to predict grain traits and metabolic content in rice under climate change. Through the use of machine learning models, these studies investigate how combining genomic and metabolic data can improve predictions, particularly under high night temperature stress in rice. Chapter 5 showed that the integration of metabolites and genomics improved the prediction of rice grain size-related traits, and certain metabolites were identified as potential candidates for improving multi-trait genomic prediction. Chapter 6 showed that metabolic accumulation was low to moderately heritable. Genomic correlations between control and high night temperature conditions indicated genotype-by-environment interactions in metabolic accumulation, and the effectiveness of genomic prediction models for metabolic accumulation varied across metabolites. Joint analysis of multiple metabolites improved the accuracy of genomic prediction by exploiting correlations between metabolite accumulation. Overall, the dissertation provides insight into how cutting-edge methods can be used to improve livestock management and multi-omic quantitative genetic modeling for breeding.
|
5 |
A Novel Approach For Cancer Characterization Using Latent Dirichlet Allocation and Disease-Specific Genomic Analysis. Yalamanchili, Hima Bindu. 05 June 2018
No description available.
|
6 |
Caracterização e análise comparativa de genomas de estirpes de Leptospira isoladas no Brasil / Characterization and comparative genomic analysis of Leptospira strains isolated in Brazil. Moreno, Luisa Zanolli. 20 April 2017
The present study aimed to characterize the genomes of Leptospira strains isolated in Brazil and to compare them with the genomes available in the GenBank database. Seventeen strains isolated from distinct animal species, in different regions of Brazil, from 1998 to 2012 were characterized. These had previously been typed by 16S rRNA gene sequencing and microscopic agglutination into six species (L. interrogans, L. santarosai, L. inadai, L. kirschneri, L. borgpetersenii and L. noguchii) and more than eight serogroups. Sequencing was performed on the Illumina™ MiSeq platform and the genomes were assembled with an ab initio algorithm. Reference genomes of the respective species were used for ordering and annotation. In silico Multilocus Sequence Typing (MLST) was performed for the three current Leptospira protocols. The comparative genomic analysis, including wgSNP, was performed within species, evaluating the variation between the serogroups of the Leptospira species studied. The L. interrogans strains gave MLST results congruent with their previous identification. In L. kirschneri, only one strain presented new alleles in the three MLST protocols and was distant from the other Brazilian L. kirschneri strains. The L. santarosai strains, as well as those of L. borgpetersenii and L. noguchii, presented new alleles and/or allelic profiles for at least two of the current MLST protocols and stood out in a separate cluster of Brazilian origin. Although the L. interrogans genomes showed high identity and synteny with the serovar Copenhageni reference, they also showed regions of difference between the respective serogroups. The genomes of serogroups Australis and Sejroe stood out for having insertions and deletions, respectively, mainly in chromosome 2. The L. borgpetersenii genome also showed great variation in composition, as expected for the species, driven by insertion sequences and the transposition of mobile elements. Serogroups Canicola and Pomona were closest to each other in the wgSNP analysis. Two plasmids were also identified in the serogroup Canicola genomes, with high identity to the plasmids described in the Chinese strain of the same serovar. In L. kirschneri, strain 47 (M36/05) showed high identity and synteny with the serovar Mozdok genomes, as expected, including the Brazilian strain of human origin. Strain 55 (M110/06) differed from the other L. kirschneri genomes in both MLST and wgSNP. The Brazilian L. inadai genome showed high identity to the American reference of human origin, including the presence of a bacteriophage specific to the species. The distinctness of the Brazilian L. santarosai strains in MLST was also evident in the comparative analysis and in the wgSNP; strain 68 (M52/8-19), which showed no reactivity to the tested serogroups, also differs from the others, reaffirming the possibility of a new serogroup/serovar. The genomic study therefore allowed the identification of particularities of Brazilian Leptospira strains, including the existence of extrachromosomal elements and proximity to strains of human origin, indicating a greater risk for public health, in addition to the possibility of a new L. santarosai serogroup.
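As a rough illustration of what a whole-genome SNP (wgSNP) comparison measures, the sketch below counts pairwise nucleotide differences between already-aligned sequences. It is not the workflow used in the thesis, which the abstract does not describe; the function names and the toy strains are hypothetical, and a real analysis would start from read mapping or a core-genome alignment.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions with gaps or ambiguous bases (anything other than A, C, G, T)
    are ignored, which is one common convention, assumed here.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(1 for x, y in zip(a, b) if x != y and x in valid and y in valid)


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """Pairwise SNP distances for every pair of strains, keyed by sorted name pair."""
    return {
        (p, q): snp_distance(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


# Toy, made-up alignment used only to exercise the functions.
toy = {
    "strain_a": "ACGTACGT",
    "strain_b": "ACGTACGA",
    "strain_c": "TCGAACTA",
}

if __name__ == "__main__":
    for pair, d in distance_matrix(toy).items():
        print(pair, d)
```

Strains with a small pairwise distance would cluster together in such an analysis, which is the sense in which two serogroups can be described as closer to each other.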
|
7 |
Algorithm Optimizations in Genomic Analysis Using Entropic Dissection. Danks, Jacob R. 08 1900
In recent years, the collection of genomic data has skyrocketed, and databases of genomic data are growing faster than ever before. Although many computational methods have been developed to interpret these data, they tend to struggle with the ever-increasing file sizes being produced and fail to exploit multi-core processors through parallel processing. In some instances, loss of accuracy has been a necessary trade-off for faster computation. This thesis discusses one such algorithm and the changes made to it to allow larger input files and reduce the time required to reach a result without sacrificing accuracy. An information-entropy-based algorithm was used as the basis to demonstrate these techniques. The algorithm efficiently dissects the distinctive patterns underlying genomic data and requires no a priori knowledge, and is thus applicable in a variety of biological research applications. This research describes how parallel processing and object-oriented programming techniques were used to process larger files in less time and obtain a more accurate result from the algorithm. Through object-oriented techniques, the maximum allowable input file size was increased tenfold, from 200 MB to 2,000 MB. Parallel processing allowed the program to finish in less than half the time of the sequential version. The accuracy of the algorithm was improved by reducing data loss throughout the algorithm. Finally, adding user-friendly options enabled the program to handle requests more effectively and to further customize the logic used within the algorithm.
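The abstract does not give the details of the entropic dissection algorithm, so the following is only a generic sketch of the two ideas it names: an information-entropy measure over sequence data, and splitting the work across processor cores. It scores non-overlapping windows by the Shannon entropy of their k-mer frequencies; the function names, the windowing scheme, and the choice of k are assumptions made for illustration.

```python
import math
from collections import Counter
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def kmer_entropy(window: str, k: int = 2) -> float:
    """Shannon entropy, in bits, of the k-mer frequency distribution in `window`."""
    counts = Counter(window[i:i + k] for i in range(len(window) - k + 1))
    total = sum(counts.values())
    if total == 0:  # window shorter than k
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def windowed_entropy(seq: str, window: int, k: int = 2, workers: int = 1) -> list[float]:
    """Entropy of each non-overlapping window of `seq`.

    With workers > 1 the windows are scored in separate processes;
    with workers <= 1 they are scored sequentially in this process.
    """
    chunks = [seq[i:i + window] for i in range(0, len(seq), window)]
    score = partial(kmer_entropy, k=k)
    if workers <= 1:
        return [score(chunk) for chunk in chunks]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score, chunks))


if __name__ == "__main__":
    # A repetitive window scores low, a mixed window scores high.
    print(windowed_entropy("AAAAAAAA" + "ACGTACGT", window=8, k=1, workers=2))
```

Because each window is scored independently, the sequential and parallel paths return the same values, which mirrors the abstract's point that speed was gained without giving up accuracy.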
|
8 |
Comparative analysis of Klebsiella pneumoniae belonging to the endemic high-risk clonal group CG258 / Análise comparativa de Klebsiella pneumoniae multiresistente pertencente ao grupo clonal endêmico de alto risco GC258. Cerdeira, Louise Teixeira. 28 May 2019
The rapid spread of carbapenem-resistant lineages of Klebsiella pneumoniae, clustered within the clonal group CG258, is a growing public health problem associated with healthcare-associated infections. The objective of this study was to perform a genomic analysis of KPC-2 and/or CTX-M β-lactamase-producing strains of K. pneumoniae belonging to CG258 (ST11, ST258, ST340, ST437) circulating at the human-animal-environment interface in Brazil and South America. The analysis was conducted to characterize the antimicrobial resistome, the virulome, and the genetic elements of transfer and mobilization associated with the dissemination of the blaKPC-2 gene, and to perform a detailed comparative genomic analysis of CG258, with subsequent pathogenicity evaluation in an invertebrate (Galleria mellonella) model of infection, aiming to identify biomarkers of virulence. The main results are presented in the format of six manuscripts. Manuscript I: New draft genome sequence of Klebsiella pneumoniae strain 1194/11, isolated from a clinical sample and belonging to ST340, showing a wide resistome. Manuscript II: The first draft genome sequence of Klebsiella pneumoniae 606B (ST340) carrying blaCTX-M-15, isolated from a food-producing animal in Brazil. Manuscript III: The first draft genome sequence of Klebsiella pneumoniae strain Kp171, recovered from a water sample collected in an urban river in Brazil, demonstrating that anthropogenic activities, including the release of wastewater and sewage from hospitals, may have contributed to the contamination of aquatic environments, raising a public health concern. Manuscript IV: Identification and complete sequence analysis of an IncX3 plasmid carrying a non-Tn4401 genetic element (NTEKPC-Ic), originating from a hospital-associated lineage of K. pneumoniae ST340, showing the spread of blaKPC-2 in a new incompatibility group. Manuscript V: Dissemination of blaKPC-2 in a novel non-Tn4401 element (NTEKPC-IId) carried by new small IncQ1 and Col-like plasmids in lineages of Klebsiella pneumoniae ST11 and ST340. Manuscript VI: Yersiniabactin, colibactin and a wider resistome contribute to enhanced virulence and persistence of KPC-2-producing K. pneumoniae CG258 in South America. The results of the present study provide a first genomic landscape of the K. pneumoniae lineages of CG258 circulating at the human-animal-environment interface in Brazil and South America. In this regard, the interplay of yersiniabactin and/or colibactin with resistance to clinically significant antibiotics (such as carbapenems and polymyxins) is most likely contributing to the emergence of highly virulent and MDR lineages that pose a great risk to human health. On the other hand, the wide antimicrobial resistome (antibiotics, disinfectants and heavy metals) could be contributing to the adaptation of KPC-2- and/or CTX-M-producing K. pneumoniae CG258 at the human-animal-environment interface, highlighting the urgent need for enhanced control efforts. In conclusion, these findings could contribute to the development of strategies for the prevention, diagnosis and treatment of K. pneumoniae infections.
|
9 |
Desenvolvimento de um pipeline para análise genômica e transcriptômica com base em Web services / Development of a pipeline for genomic and transcriptomic analysis based on Web services. Melo, Henrique Velloso Ferreira. 04 September 2009
Pipeline systems for genomic and transcriptomic analysis aim to create communication bridges among the existing analysis tools, thereby reducing researchers' effort. Most of the pipelines found in the literature lack features that would be useful in genome or transcriptome sequencing projects. Among them are the capacity to track project results throughout development, including the generation of partial reports; a collaborative environment where the laboratories involved can contribute new data and chromatograms; the possibility of configuring analysis parameters; support for multiple pipelines with different configurations; and the possibility of including new tools and modules. In this work, a pipeline prototype was developed to overcome these shortcomings. The progress of sequencing projects is tracked throughout their development: chromatograms are received progressively as the project advances, and partial reports on the newly received data are generated. Communication with the processing server is done via a Web service, which offers an interface in the universal language XML, allowing client applications on heterogeneous platforms to submit data and execute operations and queries. Pipelines are configured in XML documents written in a predefined format, in which the researchers choose the tools and parameters to be used. The prototype supports multiple pipelines executed simultaneously in the same project. Pipelines are executed in parallel by means of thread pools, which increases efficiency by distributing the workload across multiprocessor systems. Another feature of the prototype is its extensibility, as each pipeline step is wrapped in a module. New modules can be inserted easily by implementing a programming interface, without the need to recompile the whole system. Module insertion is done declaratively through XML documents. A client application was also developed on the collaborative platform Sakai, allowing the different research groups involved in a sequencing project to create pipelines, view results and exchange information on the current status of the project. To evaluate the prototype, a case study was carried out. Chromatograms from the sequencing of Sphenophorus levis ESTs (Expressed Sequence Tags) were submitted and a pipeline was configured to analyze the data. The case study indicated that the prototype is efficient and produces good results.
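The architecture described above (steps wrapped in modules behind a common interface, pipelines declared in XML, several pipelines run at once on a thread pool) can be sketched in a few lines. This is only an illustration under stated assumptions: the XML layout, the `Module` interface, the module names and the registry are all hypothetical, since the abstract does not give the prototype's actual format or API, and the Web service layer is left out.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class Module(ABC):
    """Interface that every pipeline step implements (hypothetical)."""

    @abstractmethod
    def run(self, data: list[str], params: dict[str, str]) -> list[str]:
        ...


class MinLength(Module):
    """Keep only sequences with at least `min` characters."""

    def run(self, data, params):
        threshold = int(params.get("min", "0"))
        return [s for s in data if len(s) >= threshold]


class Upper(Module):
    """Normalize sequences to upper case."""

    def run(self, data, params):
        return [s.upper() for s in data]


# New steps are added by registering another Module subclass under a name.
REGISTRY: dict[str, type[Module]] = {"min_length": MinLength, "upper": Upper}

# Assumed configuration format: one <pipeline> per analysis, one <step> per module.
CONFIG = """
<pipelines>
  <pipeline name="strict">
    <step module="min_length" min="6"/>
    <step module="upper"/>
  </pipeline>
  <pipeline name="lenient">
    <step module="min_length" min="3"/>
  </pipeline>
</pipelines>
"""


def run_pipeline(pipeline: ET.Element, data: list[str]) -> list[str]:
    """Apply each configured step in order, passing its XML attributes as parameters."""
    for step in pipeline.findall("step"):
        params = dict(step.attrib)
        module = REGISTRY[params.pop("module")]()
        data = module.run(data, params)
    return data


def run_all(config_xml: str, data: list[str]) -> dict[str, list[str]]:
    """Run every configured pipeline concurrently on a thread pool."""
    root = ET.fromstring(config_xml.strip())
    with ThreadPoolExecutor() as pool:
        futures = {
            p.get("name"): pool.submit(run_pipeline, p, list(data))
            for p in root.findall("pipeline")
        }
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(run_all(CONFIG, ["acgtac", "acg", "ac"]))
```

Keeping the step list in the XML rather than in code is what lets a researcher change tools and parameters without touching the program, which is the design point the abstract emphasizes.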
|
10 |
Caracterização e análise comparativa de genomas de estirpes de Leptospira isoladas no Brasil / Characterization and comparative genomic analysis of Leptospira strains isolated in Brazil. Luisa Zanolli Moreno. 20 April 2017
The present study aimed to characterize the genomes of Leptospira strains isolated in Brazil and to compare them with the genomes available in GenBank. Seventeen strains, isolated from distinct animal species in different regions of Brazil between 1998 and 2012, were characterized. These had previously been typed by 16S rRNA gene sequencing and the microscopic agglutination test into six species (L. interrogans, L. santarosai, L. inadai, L. kirschneri, L. borgpetersenii and L. noguchii) and more than eight serogroups. Sequencing was performed on the Illumina™ MiSeq platform, and the genomes were assembled with an ab initio algorithm. Reference genomes of the respective species were used for ordering and annotation. In silico Multilocus Sequence Typing (MLST) was performed for the three current Leptospira protocols. The comparative genomic analysis, including wgSNP, was performed within species, evaluating the variation between the serogroups of the Leptospira species studied. The L. interrogans strains gave MLST results congruent with their previous identification. In L. kirschneri, only one strain presented new alleles in the three MLST protocols and was distant from the other Brazilian L. kirschneri strains. The L. santarosai strains, like those of L. borgpetersenii and L. noguchii, presented new alleles and/or allelic profiles for at least two of the current MLST protocols and formed a separate cluster of Brazilian origin. Although the L. interrogans genomes showed high identity and synteny with the serovar Copenhageni reference, they also showed regions of difference between the respective serogroups. The genomes of serogroups Australis and Sejroe stood out for having insertions and deletions, respectively, mainly in chromosome 2. The L. borgpetersenii genome also showed great variation in composition, as expected for the species, driven by insertion sequences and the transposition of mobile elements. Serogroups Canicola and Pomona were closest to each other in the wgSNP analysis. Two plasmids were also identified in the serogroup Canicola genomes, with high identity to the plasmids described in the Chinese strain of the same serovar. In L. kirschneri, strain 47 (M36/05) showed high identity and synteny with the serovar Mozdok genomes, as expected, including the Brazilian strain of human origin. Strain 55 (M110/06), in contrast, differed from the other L. kirschneri genomes in both MLST and wgSNP. The Brazilian L. inadai genome showed high identity to the American reference of human origin, including the presence of a bacteriophage specific to the species. The distinctness of the Brazilian L. santarosai strains in MLST was also evident in the comparative analysis and in wgSNP; strain 68 (M52/8-19), which showed no reactivity to the tested serogroups, also differs from the others, reinforcing the possibility of a new serogroup/serovar. The genomic study thus identified particularities of Brazilian Leptospira strains, including the existence of extrachromosomal elements, proximity to strains of human origin that indicates a greater risk to public health, and the possibility of a new L. santarosai serogroup.
|