151 |
Detekce genomových variací / Detection of Genome Variations
Beluský, Tomáš January 2013
The influence of variation in the human genome is visible at first glance in the differences between individuals and between entire populations. Behavior and the probability of certain diseases are also strongly influenced by differences at the genome level. This work presents methods for detecting variations in the human genome that were developed after the rise of second-generation sequencing technologies. A new tool that combines the read-pair and split-read methods with depth-of-coverage information was also designed and implemented. The tool was tested on simulated and real data and compared with reference outputs.
|
152 |
Sierra platinum: a fast and robust peak-caller for replicated ChIP-seq experiments with visual quality-control and -steering
Müller, Lydia, Gerighausen, Daniel, Farman, Mariam, Zeckzer, Dirk January 2016
Background: Histone modifications play an important role in gene regulation, and their genomic locations are of great interest. Usually, the location is measured by ChIP-seq and analyzed with a peak-caller. Replicated ChIP-seq experiments are becoming more and more available. However, their analysis is based either on single-experiment peak-calling or on tools like PePr, which allow peak-calling of replicates but whose underlying model might not be suitable for the conditions under which the experiments are performed. Results: We propose a new peak-caller called "Sierra Platinum" that allows peak-calling of replicated ChIP-seq experiments. Moreover, it provides a variety of quality measures together with integrated visualizations that support the assessment of the replicates and the resulting peaks, as well as the steering of the peak-calling process. Conclusion: Using a newly generated benchmark data set and real data from the NIH Roadmap Epigenomics Project, we show that Sierra Platinum outperforms currently available methods. It is robust against noisy replicates.
|
153 |
Prediction of designer-recombinases for DNA editing with generative deep learning
Schmitt, Lukas Theo 17 January 2024
Site-specific tyrosine-type recombinases are effective tools for genome engineering, with the first engineered variants having demonstrated therapeutic potential. So far, adaptation of designer-recombinases to new DNA target site selectivity has been achieved mostly through iterative cycles of directed molecular evolution. While effective, directed molecular evolution methods are laborious and time-consuming. To accelerate the development of designer-recombinases, I evaluated two sequencing approaches and gathered the sequence information of over two million Cre-like recombinase sequences evolved for 89 different target sites. With this information I first investigated the sequence compositions and residue changes of the recombinases to further our understanding of their target site selectivity. The complexity of the data led me to a generative deep learning approach. Using the sequence data, I trained a conditional variational autoencoder called RecGen (Recombinase Generator) that is capable of generating novel recombinases for a given target site. Computational evaluation of the sequences revealed that known recombinases functional on the desired target site are generally more similar to the RecGen-predicted recombinases than to other recombinase libraries. Additionally, I could show experimentally that predicted recombinases for known target sites are at least as active as the evolved recombinases. Finally, I also show experimentally that 4 out of 10 recombinases predicted for novel target sites are capable of excising their respective target sites. As a bonus to RecGen, I also developed a new method capable of accurately sequencing recombinases with nanopore sequencing while simultaneously counting DNA editing events. The data from this method should enable the next development iteration of RecGen.
|
154 |
Genetics and pathophysiology of coronal craniosynostosis revealed by next-generation DNA sequencing
Sharma, Vikram Pramod January 2015
This thesis further delineates the molecular genetic basis of a relatively common craniofacial condition, coronal craniosynostosis. It used whole-exome sequencing to identify novel disease genes in patients with non-syndromic coronal synostosis and negative genetic testing. Initially, 2 patients were identified with damaging frameshift mutations in a gene not previously linked with craniosynostosis, Transcription Factor 12 (TCF12). A further intronic mutation was identified in a third patient. This gene encodes a transcription factor that dimerises with TWIST1, mutations of which cause Saethre-Chotzen syndrome, which is also associated with coronal synostosis. Screening 344 undiagnosed patients identified 35 further mutations, all in patients with coronal synostosis, with 14 cases arising de novo. This work was published, and testing for TCF12-related craniosynostosis was translated into clinical practice. Significant non-penetrance (60%) was identified in mutation-positive relatives, and the genetic background was investigated. Firstly, analysis of the parental origins of de novo mutations identified 6 of paternal origin and helped refine haplotype assignment. Secondly, haplotype analysis of TCF12-mutation carriers revealed a modest correlation with phenotypic status, but this was insufficient to be useful in clinical testing. Thirdly, TCF12 haplotypes were analysed for association with non-syndromic coronal synostosis, but no significant association was found. Further exome sequencing revealed a de novo frameshift mutation in Transcription Factor 20 (TCF20) in a patient with coronal synostosis and autism, although the mutation only correlated with the latter phenotype. Analysis of 5 trios revealed a novel variant in myosin heavy chain 4 (MYH4) in 1 family, although its role in suture development is uncertain. Reviewing pooled exome data from 19 mutation-negative patients revealed no further disease genes.
In summary, this thesis describes novel gene discovery, defines a new clinical entity and investigates the genetic background of penetrant and non-penetrant individuals. Further exome sequencing identified another disease gene and a de novo mutation, and produced lists of damaging variants to support future work.
|
155 |
Approches bioinformatiques pour l'assessment de la biodiversité / Bioinformatics approachs for the biodiversity assesment
Riaz, Tiayyba 23 November 2011
This thesis is concerned with the design and development of bioinformatics techniques that facilitate the use of the metabarcoding approach for measuring species diversity. Metabarcoding coupled with next-generation sequencing has strong potential for identifying multiple species from a single environmental sample. The real strength of metabarcoding lies in the use of barcode markers chosen for a particular study: identification at the species level or at higher taxonomic levels can be achieved with carefully designed barcode markers. Moreover, with the advent of high-throughput sequencing techniques, huge amounts of sequence data are being produced that contain a substantial level of mutations. These mutations pose a problem for correct estimates of biodiversity and for the taxon assignment process.
Thus the three major challenges addressed in this thesis are: (i) evaluating the quality of a barcode region, (ii) designing new barcodes, and (iii) dealing with errors occurring during the different steps of an experiment. To assess the quality of a barcode region, we developed two formal quantitative measures called barcode coverage (Bc) and barcode specificity (Bs). Barcode coverage measures the ability of a primer pair to amplify a broad range of taxa, whereas barcode specificity measures its ability to discriminate between different taxa. These measures are especially useful for ranking different barcodes and selecting the best markers. To design new barcodes for metabarcoding applications, we developed an efficient software tool called ecoPrimers. Based on the above two quality measures and with integrated taxonomic information, ecoPrimers enables us to design primers and their corresponding barcode markers for any taxonomic level. Moreover, with a large number of tunable parameters, it allows us to control the properties of the primers. Because it is based on efficient algorithms and implemented in C, ecoPrimers is efficient enough to handle large databases, including fully sequenced bacterial genomes. Finally, to deal with errors present in DNA sequence data, we analyzed a simple set of PCR samples obtained from a diet analysis of the snow leopard. We grouped closely related sequences and, by measuring the correlation between different parameters of the mutations, showed that most of the errors were introduced during PCR amplification. To deal with such errors, we further developed a graph-based algorithm that can differentiate true sequences from PCR-induced errors. The results obtained with this algorithm showed that the de-noised data gave a realistic estimate of the species diversity studied in the French Alps. This algorithm is implemented in the program obiclean.
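The two quality measures named in this abstract, barcode coverage (Bc) and barcode specificity (Bs), can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the ratio definitions below are assumptions: Bc is taken as the fraction of target taxa that are amplified, and Bs as the fraction of amplified taxa whose barcode is shared with no other taxon. The function names, the one-barcode-per-taxon simplification, and the toy taxa and sequences are all hypothetical and are not taken from ecoPrimers.

```python
from collections import defaultdict


def barcode_coverage(amplified_taxa, all_taxa):
    """Assumed Bc: fraction of the target taxa that the primer pair amplifies."""
    all_taxa = set(all_taxa)
    if not all_taxa:
        raise ValueError("all_taxa must not be empty")
    return len(set(amplified_taxa) & all_taxa) / len(all_taxa)


def barcode_specificity(barcode_by_taxon):
    """Assumed Bs: fraction of amplified taxa whose barcode is unique to them."""
    if not barcode_by_taxon:
        raise ValueError("no amplified taxa")
    # Group taxa by the barcode sequence they produce.
    taxa_by_barcode = defaultdict(set)
    for taxon, barcode in barcode_by_taxon.items():
        taxa_by_barcode[barcode].add(taxon)
    # A taxon is unambiguously identified if no other taxon shares its barcode.
    unambiguous = sum(1 for taxa in taxa_by_barcode.values() if len(taxa) == 1)
    return unambiguous / len(barcode_by_taxon)


# Hypothetical toy data: five target taxa, four of which are amplified.
all_taxa = ["Capra", "Ovis", "Marmota", "Ochotona", "Lepus"]
barcode_by_taxon = {
    "Capra": "ACGTTA",
    "Ovis": "ACGTTA",      # same barcode as Capra, so both are ambiguous
    "Marmota": "ACGATA",
    "Ochotona": "TCGATA",
}

bc = barcode_coverage(barcode_by_taxon, all_taxa)
bs = barcode_specificity(barcode_by_taxon)
print(f"Bc = {bc:.2f}, Bs = {bs:.2f}")
```

In this toy case four of five taxa are amplified and two of those four carry a unique barcode, so a primer pair could score well on coverage while still discriminating poorly, which is why the two measures are reported separately.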
|
156 |
The Effect of Aluminium Industry Effluents on Sediment Bacterial Communities
Gill, Hardeep 19 October 2012
The goal of this project was to develop novel bacterial biomarkers for use in an industrial context. These biomarkers would be used to determine the impact of aluminium industry activity on a local ecosystem. Sediment bacterial communities of the Saguenay River are subjected to industrial effluent produced by industry in Jonquière, QC. In-situ responses of these communities to effluent exposure were measured and evaluated as potential biomarker candidates for exposure to past and present effluent discharge. Bacterial community structure and composition at control and affected sites were investigated. Differences observed between the communities were used as indicators of a response to industrial activity through exposure to effluent by-products. Diversity indices were not significantly different between sites with increased effluent exposure. However, differences were observed with the inclusion of algae and cyanobacteria. UniFrac analyses indicated that a control (NNB) and an affected site (Site 2) were more similar to one another with regard to community structure than either was to a medially affected site (Site 5) (Figure 2.4). We did not observe a signature of the microbial community structure that could be predicted from effluent exposure. Microbial community function in relation to bacterial mercury resistance (HgR) was also evaluated as a specific response to the mercury component present in sediments. Novel PCR primers and amplification conditions were developed to amplify the merP, merT and merA genes belonging to the mer-operon, which confers HgR (Table 5.6). To our knowledge, the roles of merP and merT have not been explored as possible tools to confirm the presence of the operon. HgR gene abundance in sediment microbial communities was significantly correlated (p < 0.05) with total mercury levels (Figure 3.4), but gene expression was not measurable. We could not solely attribute the release of Hg0 from sediments in bioreactor experiments to a biogenic origin. However, there was a 1000-fold difference in measured Hg0 release between control and affected sites, suggesting that processes of natural remediation may be taking place at contaminated sites (Figure 3.7). Abundance measurements of HgR-related genes represent a strong response target to the mercury immobilized in sediments. Biomarkers built on this response can be used by industry to measure long-term effects of industrially derived mercury on local ecosystems. The abundance of mer-operon genes in affected sites indicates the presence of a thriving bacterial community harbouring HgR potential. These communities have the capacity to naturally remediate the sites they occupy. This remediation could be further investigated. Additional studies will be required to develop biomarkers that are more responsive to contemporary industrial activity, such as those based on the integrative oxidative stress response.
|
159 |
The Characterisation of Putative Nuclear Pore-Anchoring Proteins in Arabidopsis thaliana
Collins, Patrick January 2013
The nuclear pore complex (NPC) is perhaps the largest protein complex in the eukaryotic cell, and controls the movement of molecules across the nuclear envelope. The NPC is composed of up to 30 proteins termed nucleoporins (Nups), each grouped in different sub-complexes. The transmembrane ring sub-complex is composed of Nups responsible for anchoring the NPC to the nuclear envelope. Bioinformatic analysis has traced all major sub-complexes of the NPC back to the last eukaryotic common ancestor, meaning that nuclear pore structure and function are conserved amongst all eukaryotes. In this study, Arabidopsis T-DNA knockout lines for putative anchoring-Nup genes were investigated to characterise gene function. Differences in plant growth and development were observed for the ndc1 knockout line compared to wild-type, but gp210 plants showed no phenotypic differences. The double knockout line gp210 ndc1 was generated through crosses to observe the plant response to the knockout of two anchoring-Nup genes. No synergistic effect from this double knockout was observed, suggesting that more, as yet unidentified, Nups function in the transmembrane ring in plants. Sensitivity to the nuclear export inhibitor leptomycin B (LMB) was also tested for the knockout lines, although growth sensitivity to the drug was not observed. Nucleocytoplasmic transport in the knockout lines was measured in cells transformed by particle bombardment. The cells expressed fluorescent protein constructs that are actively transported through the NPC, and the localisation of the protein was used to assess nucleocytoplasmic transport. The ndc1 single knockout and the double knockout gp210 ndc1 exhibited decreased nuclear export. Further experiments determining NDC1 localisation and identifying other Nups in the transmembrane ring sub-complex would bring a more comprehensive understanding of the plant NPC.
|