11 |
Polymorphisms in heat shock protein receptors / CD91 and LOX-1 Polymorphisms
Muppala, Vijayakumar 30 October 2008
No description available.
|
12 |
Expanding the repertoire of bacterial (non-)coding RNAs
Findeiß, Sven 02 May 2011
The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory mode of action moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated to proteins but function directly on the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration.
Bacterial ncRNAs either arise from independent primary transcripts or their mature sequence is generated via processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimera with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene.
The quality and completeness of genome annotation is essential for all subsequent analyses. In contrast to protein-coding genes, ncRNAs lack clear statistical signals on the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines, and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently have experimental setups been developed to systematically identify these elements on a genome-wide scale in prokaryotes.
The first part of this thesis describes three exploratory experimental surveys of transcript organization in pathogenic bacteria. To identify ncRNAs in Pseudomonas aeruginosa we used a combination of an experimental RNomics approach and ncRNA prediction. Besides already known ncRNAs we identified and validated the expression of six novel RNA genes.
Global detection of transcripts by next generation RNA sequencing techniques unraveled an unexpectedly complex transcript organization in many bacteria. These ultra high-throughput methods give us the appealing opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptome of Helicobacter pylori and Xanthomonas campestris. For the first time we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis we developed new tools to annotate TSS, detect small protein-coding genes and to infer homology of newly detected transcripts. We discovered hundreds of TSS in intergenic regions, upstream of protein-coding genes, within operons and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that have previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes.
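The 5’ UTR definition used above (from the TSS to the start codon of the adjacent protein-coding gene) can be illustrated with a small sketch. It is not taken from the thesis: the `Gene` record, the 1-based coordinate convention and the function name are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (not a format used in the thesis)."""
    name: str
    start_codon: int  # 1-based coordinate of the first base of the start codon
    strand: str       # "+" or "-"


def utr5_length(tss: int, gene: Gene) -> int:
    """Return the number of nucleotides between a TSS and the start codon.

    A result of 0 means transcription starts directly at the start codon
    (a leaderless transcript).
    """
    if gene.strand == "+":
        length = gene.start_codon - tss
    elif gene.strand == "-":
        # On the reverse strand the TSS has the larger coordinate.
        length = tss - gene.start_codon
    else:
        raise ValueError(f"unknown strand: {gene.strand!r}")
    if length < 0:
        raise ValueError("TSS lies downstream of the start codon")
    return length
```

For example, a TSS at position 100 and a forward-strand start codon at position 130 give a 30-nucleotide 5’ UTR, while a TSS that coincides with the start codon gives a length of zero.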
Experimental setups are time-consuming and often costly. Another limitation of experimental approaches is that genes which are expressed only in specific developmental stages or under stress conditions are likely to be missed. Bioinformatic tools offer an alternative that overcomes such constraints. General approaches usually depend on comparative genomic data and use evolutionary signatures to analyze the (non-)coding potential of multiple sequence alignments. In the second part of my thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to assess the local protein-coding potential of genomic regions.
RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and the user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model we significantly improved the prediction accuracy of RNAz.
During transcriptome analysis we repeatedly identified small protein-coding genes that have not been annotated so far. Only a few such genes are known to date, and standard protein-coding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses small protein-coding genes. We have implemented RNAcode, which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, an evaluation of the protein-coding potential of these fragments is an essential task. RNAcode reports local regions of high coding potential instead of complete protein-coding genes. Training on known protein-coding sequences is not necessary, and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome, where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data we found compelling evidence that several of the identified candidates are bona fide proteins.
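The effect of a fixed length cutoff can be made concrete with a naive open reading frame scan. This sketch is not RNAcode and does not reflect how any particular gene finder works; the function name, the forward-strand-only scan and the return format are assumptions chosen to keep the example short.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_aa: int) -> list[tuple[int, int, int]]:
    """Naive ORF scan over the three forward reading frames.

    Returns (start, end, n_residues) tuples with 0-based, end-exclusive
    coordinates; the stop codon is included in the span but not counted
    as a residue. ORFs shorter than min_aa residues are discarded.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if n_aa >= min_aa:
                    orfs.append((start, i + 3, n_aa))
                start = None  # look for the next ORF in this frame
    return orfs
```

A 10-residue ORF is reported with `min_aa=5` but disappears with `min_aa=40`, which is the trade-off described above: the cutoff suppresses spurious short ORFs at the price of missing genuine small genes.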
In summary, this thesis clearly demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode we contributed to complete the repertoire of gene finding software which will help to unearth hidden treasures of the RNA World.
|
13 |
Insights into the Evolution of small nucleolar RNAs: Prediction, Comparison, Annotation
Canzler, Sebastian 16 January 2017
Over the last decades, the formerly unshakeable belief that proteins are the only key factors in the complex regulatory machinery of a cell was crushed by a plethora of findings in all major eukaryotic lineages. These suggested a rugged landscape in the eukaryotic genome consisting of sequential, overlapping, or even bi-directional transcripts and myriads of regulatory elements. The vast part of the genome is indeed transcribed into an RNA intermediate, but only a small fraction is finally translated into functional proteins. The sweeping majority, however, is either degraded or functions as a non-protein-coding RNA (ncRNA).
Due to continuous developments in experimental and computational research, the variety of ncRNA classes grew larger and larger, ranging from key processes in the cellular lifespan to regulatory processes that are driven and guided by ncRNAs. The bioinformatical part primarily concentrates on the prediction, annotation, and extraction of characteristic properties of novel ncRNAs. Due to conservation of sequence and/or structure, this task is often carried out by a homology search that utilizes information about functional, and hence conserved, regions as an indicator.
This thesis focuses mainly on a special class of ncRNAs, small nucleolar RNAs (snoRNAs). These abundant molecules are mainly responsible for guiding 2’-O-ribose methylations and pseudouridylations in different types of RNAs, such as ribosomal and spliceosomal RNAs. Although the relevance of single modifications is still rather unclear, the elimination of multiple modifications has been shown to cause severe effects, including lethality.
Several de novo prediction programs have been published over the last years, and a substantial number of publicly available snoRNA databases have emerged. Normally, these are restricted to a small number of species and a collection of experimentally extracted snoRNAs. The detection of snoRNAs by means of wet lab experiments and/or de novo prediction tools is generally time-consuming (wet lab) and quite tedious (identification of snoRNA-specific characteristics).
The snoRNA annotation pipeline snoStrip was developed with the intention to circumvent these obstacles. It therefore utilizes a homology-based search procedure to reliably predict snoRNA genes in genomic sequences. In a subsequent step, all candidates are filtered with respect to specific sequence motifs and secondary structures. In a functional analysis, potential target sites are predicted in ribosomal and spliceosomal RNA sequences. In contrast to de novo prediction tools, snoStrip focuses on the extension of the known snoRNA world to uncharted organisms and the mapping and unification of the existing diversity of snoRNAs into functional, homologous families.
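The motif-filtering step can be illustrated with a minimal sketch for box C/D snoRNA candidates. The abstract does not state which criteria snoStrip applies, so everything here is an assumption: the sketch only checks for the commonly cited box C consensus (RUGAUGA) near the 5’ end and box D consensus (CUGA) near the 3’ end, with a hypothetical window size, and ignores secondary structure entirely.

```python
import re

# Consensus motifs written in the DNA alphabet; R = A or G.
BOX_C = re.compile(r"[AG]TGATGA")
BOX_D = re.compile(r"CTGA")


def looks_like_cd_box(seq: str, window: int = 20) -> bool:
    """Return True if a candidate carries box C near its 5' end and
    box D near its 3' end. The window size is an illustrative assumption."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    head = seq[:window]
    tail = seq[-window:]
    return bool(BOX_C.search(head)) and bool(BOX_D.search(tail))
```

A real filter would also score motif mismatches and the terminal stem; this sketch only shows the shape of a position-restricted motif test applied to homology-search hits.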
The pipeline is well suited to searching a diverse set of organisms for their snoRNAome within a short time. This offers the opportunity to generate large-scale analyses over whole eukaryotic kingdoms to gain insights into the evolutionary history of these special ncRNA molecules. A set of experimentally validated snoRNA genes in Deuterostomia and Fungi were starting points for highly comprehensive surveys searching and analyzing the snoRNA repertoire in these two major eukaryotic clades. In both cases, the snoStrip pipeline proved itself a fast and reliable tool and collected thousands of snoRNA genes in nearly 200 organisms. Additionally, the Interaction Conservation Index (ICI), which was extended to also work on single lineages, provides a convenient measure to analyze and evaluate the conservation of snoRNA-targetRNA interactions across different species. The massive amount of data and the possibility to score the conservation of predicted interactions constitute the main pillars for gaining extraordinary insight into the evolutionary history of snoRNAs on both the sequence and the functional level. A substantial part of the snoRNAome is traceable down to the root of both eukaryotic lineages and might indicate an even more ancient origin of these snoRNAs. However, a plenitude of lineage-specific innovation and deletion events are also discernible. Due to its automated detection of homologous and functionally related snoRNA sequences, snoStrip identified extraordinary target switches in fungi. These unveiled a coupled evolutionary history of several snoRNA families that were previously thought to be independent. Although these findings are exceedingly interesting, the broad majority of snoRNA families is found to show remarkable conservation of the sequence and the predicted target interactions.
On two occasions, this thesis shifts its focus from a genuine snoRNA inspection to an analysis of introns. Both investigations, however, are still conducted from an evolutionary viewpoint. In the case of the ubiquitously present U3 snoRNA, functional genes in a notable number of fungi are found to be disrupted by U2-dependent introns. The set of previously known U3 genes is considerably enlarged by an adapted snoStrip search procedure. Intron-disrupted genes are found in several fungal lineages, while their precise insertion points within the snoRNA precursor are located in a small and homologous region. A potential target RNA of snoRNA genes, U6 snRNA, is also found to contain intronic sequences. Within this work, U6 genes are detected and annotated in nearly all fungal organisms. Although a few intron-carrying U6 genes have been known before, how widespread these findings are and the diversity of the particular insertion points are surprising. Those U6 genes are commonly found to contain more than just one intron. In both cases of intron-disrupted non-coding RNA genes, the detected RNA molecules seem to be functional, and the intronic sequences show remarkable sequence conservation for both their splice sites and the branch site.
In summary, the snoStrip pipeline is shown to be a reliable and fast prediction tool that works on homology-based search principles. Large scale analyses on whole eukaryotic lineages become feasible on short notice. Furthermore, the automated detection of functionally related but not yet mapped snoRNA families adds a new layer of information. Based on surveys covering the evolutionary history of Fungi and Deuterostomia, profound insights into the evolutionary history of this ncRNA class are revealed suggesting ancient origin for a main part of the snoRNAome. Lineage specific innovation and deletion events are also found to occur at a large number of distinct timepoints.
|
14 |
Predicting patient-specific outcome based on machine learning algorithms using genomic data of patients with locally advanced head and neck squamous cell carcinoma
Schmidt, Stefan 09 December 2019
Due to heterogeneous tumour biology, the treatment response of locally advanced head and neck squamous cell carcinoma differs largely between patients, resulting in a mean 5-year survival of about 50%. In order to adapt the treatment to the properties of the tumour, the therapy resistance of the tumours must be assessed before treatment. In this thesis, gene expression data were analysed to identify novel gene signatures and models that allow for stratifying patients into risk groups with low and high risk of loco-regional tumour recurrence. To identify those signatures, methods from the field of machine learning were applied. For patients treated with postoperative radiochemotherapy, a 7-gene signature was developed and successfully validated. Furthermore, it was shown that several models based on different gene signatures may be equally suitable for patient stratification. A method is presented that combines those distinct prognostic models. In addition, gene-expression-based biomarkers were transferred between different gene expression measurement methods, with the result that signatures showed less variability in patient stratification than single-gene biomarkers.
Contents:
Abbreviations
Figures
Tables
1 Introduction
2 Biological & Statistical Background
2.1 Head and Neck Squamous Cell Carcinoma
2.1.1 Tumorigenesis
2.1.2 Biomarkers
2.2 Statistics
2.2.1 Survival analysis
2.2.2 Model and data evaluation
2.2.3 Data sampling methods
2.3 Machine learning algorithms
2.3.1 Feature selection algorithms
2.3.2 Prognostic models
2.4 Gene expression measurement methods
2.4.1 Real-time polymerase chain reaction (RT-PCR)
2.4.2 nCounter® gene expression
2.4.3 In situ-synthesized oligonucleotide microarrays
3 Material and methods
3.1 Patient cohorts
3.1.1 Primary radiochemotherapy (pRCTx) cohorts
3.1.2 Postoperative radio(chemo)therapy (PORT-C) cohorts
3.1.3 Clinical endpoints
3.2 Gene expression analyses
3.2.1 HPV status
3.2.2 Immunohistochemical staining
3.2.3 RT-PCR measurements
3.2.4 nCounter® measurements
3.2.5 GeneChip® analyses (only training cohorts)
3.3 Machine learning framework
3.3.1 Pre-processing of gene expression data
3.3.2 Determination of the ensemble gene signature
3.3.3 Expanding the ensemble signature by highly correlated genes
3.3.4 Independent validation and patient stratification
4 Identification of gene expression signatures as prognostic biomarkers
4.1 Hypoxia classification
4.2 nCounter® gene expression based signatures
4.2.1 Patients treated with primary radiochemotherapy
4.2.2 Clinical Features
4.2.3 Signature extension using clinical features
4.2.4 Patients treated with postoperative radiochemotherapy
4.2.5 Signature extension using clinical features – Port-C
4.3 GeneChip® gene expression-based signatures
4.3.1 Pre-selection
4.3.2 Patients treated with primary radiochemotherapy
4.3.3 Patients treated with postoperative radiochemotherapy
4.4 Combined models for PORT-C
4.4.1 Creation of a consensus model
4.4.2 Consensus model based on 2 models
4.4.3 Consensus model based on more than 2 models
4.4.4 Discussion and summary of model combination
5 Stability of gene expression-based biomarkers
5.1 Reproducibility depending on time of nCounter®
5.2 Comparison of nCounter® and GeneChip® gene expression
5.2.1 Introduction
5.2.2 Correlation analyses
5.2.3 Model and biomarker transfer
6 Conclusion and outlook
Zusammenfassung (summary in German)
Summary
Appendix
A. Supplementary Figures
B. Supplementary Tables
Bibliography
Acknowledgements
Erklärungen (declarations)
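Stratifying patients into low- and high-risk groups from a gene signature can be sketched as a weighted sum of expression values compared against a cutoff. This is a generic illustration, not the model from the thesis: the gene names, the weights and the choice of the training-cohort median as cutoff are all hypothetical.

```python
from statistics import median


def risk_score(expression: dict[str, float], weights: dict[str, float]) -> float:
    """Linear signature score: sum of weight * expression over signature genes."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def stratify(scores: list[float], cutoff: float) -> list[str]:
    """Assign 'high' risk to scores strictly above the cutoff, else 'low'."""
    return ["high" if score > cutoff else "low" for score in scores]


# Hypothetical two-gene signature and three training patients.
weights = {"G1": 0.5, "G2": -1.0}
training = [
    {"G1": 2.0, "G2": 1.0},
    {"G1": 4.0, "G2": 0.0},
    {"G1": 0.0, "G2": 1.0},
]
training_scores = [risk_score(patient, weights) for patient in training]
cutoff = median(training_scores)  # fixed on training data, reused for validation
```

Fixing the cutoff on the training cohort and reusing it unchanged on a validation cohort is what makes the validation independent in this kind of setup.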
|
15 |
A toolkit for visualization of patterns of gene expression in live Drosophila embryos
Ejsmont, Radoslaw 03 September 2011
Developing biological systems can be approximately described as complex, three-dimensional cellular assemblies that change dramatically across time as a consequence of cell proliferation, differentiation and movements. The presented project aims to overcome the limited resolution in both space and time of classical analysis by in situ hybridization on fixed tissue. The employment of the newly developed Single Plane Illumination Microscopy (SPIM), combined with new approaches for in vivo data acquisition and processing, promises to yield high-resolution four-dimensional data of the complete Drosophila embryogenesis. We developed a toolkit for high-throughput gene engineering in flies that provides means for creating faithful in vivo reporters of gene expression during Drosophila melanogaster development. The cornerstone of the toolkit is a fosmid genomic library enabling high-throughput recombineering and φC31-mediated site-specific transgenesis. The dominant 3xP3-dsRed fly selectable marker on the fosmid backbone allows, in principle, transgenesis of the fosmid clones into any non-melanogaster species. In order to extend the capabilities of the gene engineering toolkit to include “evo-devo” studies, we generated genomic fosmid libraries for other sequenced Drosophilidae: D. virilis, D. simulans and D. pseudoobscura. The libraries for these species were constructed in the pFlyFos vector, allowing for recombineering modification and φC31 transgenesis of non-melanogaster genomic loci into D. melanogaster. We have developed a PCR pooling strategy to identify clones for a specific gene from the libraries without extensive clone sequencing and mapping. The clones from these libraries will be primarily used for cross-species gene expression studies. As another application, transgenes originating from closely related species can be used to rescue D. melanogaster RNAi phenotypes and establish their specificity.
Together with SPIM microscopy, the toolkit will allow visualization of gene expression patterns throughout Drosophila development.
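The idea behind a PCR pooling strategy can be sketched with a generic row/column scheme. The abstract does not describe the actual pooling design, so the layout below is an assumption: clones on a plate are pooled once per row and once per column, each pool is screened by PCR, and a clone is a candidate only if both of its pools are positive.

```python
from itertools import product


def candidate_wells(positive_rows: set[str], positive_cols: set[int]) -> list[str]:
    """Return wells consistent with the positive row and column pools.

    With a single positive clone per plate the intersection is unique;
    with several positives the result may include false candidates that
    need an individual confirmation PCR.
    """
    return [
        f"{row}{col}"
        for row, col in product(sorted(positive_rows), sorted(positive_cols))
    ]
```

On a 96-well plate (8 rows, 12 columns) this scheme needs 20 pool PCRs instead of 96 individual ones, at the cost of ambiguity when more than one row and more than one column are positive.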
|
16 |
A novel approach to infer orthologs and produce gene annotations at scale
Kirilenko, Bogdan 21 October 2022
Owing to advances in DNA sequencing, the number of available genomes has increased rapidly over the last decades. The thousands of genomes already available today enable detailed comparative analyses that are essential for answering relevant questions. These include the association of genotype and phenotype, the study of the particular features of complex proteins, and the further development of medical applications. To answer all of these questions, it is necessary to annotate protein-coding genes in newly sequenced genomes and to determine their homology relationships. However, existing methods of genome analysis are not designed for the volume of data generated today. The central challenge in comparative genomics is therefore not the number of available genomes but the development of new methods for high-throughput data analysis. To address these problems, I propose a new paradigm for genome annotation and homology inference that is based on the alignment of whole genomes. Whereas the methods currently used for gene annotation and homology determination rely exclusively on coding sequences, including the surrounding, neutrally evolving genomic context could yield better and more complete annotations. The use of genome alignments allows the proposed methodology to scale to thousands of genomes. In this thesis, I present TOGA (Tool to infer Orthologs from Genome Alignments), a bioinformatic method that implements this concept and combines homology classification and gene annotation in a single pipeline. TOGA uses machine learning to distinguish orthologs from paralogs based on the alignment of intronic and intergenic regions.
The benchmarking results show that TOGA outperforms conventional approaches within Placentalia. TOGA classifies homology relationships with high precision and reliably identifies inactivated genes as such. Earlier versions of TOGA were applied in several studies and used in two publications. In addition, TOGA was successfully used to annotate 500 mammalian genomes, the most comprehensive such dataset to date. These results show that TOGA has the potential to become an established method for gene annotation and to complement the techniques currently in use.
|
17 |
Assessment of complex microbial assemblages: description of their diversity and characterisation of individual members
Mühling, Martin 23 January 2017
1. Microbial ecology
According to Caumette et al. (2015) the term ecology is derived from the Greek words “oikos” (the house and its operation) and “logos” (the word, knowledge or discourse) and can, therefore, be defined as the scientific field engaged in the “knowledge of the laws governing the house”. This, in extension, results in the simple conclusion that microbial ecology represents the study of the relationship between microorganisms, their co-occurring biota and the prevailing environmental conditions (Caumette et al. 2015).
The term microbial ecology has been in use since the early 1960s (Caumette et al. 2015) and microbial ecologists have made astonishing discoveries since. Microbial life at extremes such as in the hydrothermal vents (see Dubilier et al. 2008 and references therein) or the abundance of picophytoplankton (Waterbury et al. 1979; Chisholm et al. 1988) in the deep and surface waters of the oceans, respectively, are only a few of many highlights. Nevertheless, a microbial ecologist who, after leaving the field early in their career, now intends to return would hardly recognise their former scientific field. The main reason for this hypothesis is to be found in the advances made to the methodologies employed in the field. Most of these were developed for biomedical research and were subsequently adopted, sometimes with minor modifications, by microbial ecologists.
The Author presents in this thesis scientific findings which, although spanning only a fraction of the era of research into microbial ecology, have been obtained using various modern tools of the trade. These studies were undertaken by the Author during his employment as postdoctoral scientist at Warwick University (UK), as member of staff at Plymouth Marine Laboratory (UK) and as scientist at the TU Bergakademie Freiberg. Although the scientific issues and the environmental habitats investigated by the Author changed due to funding constraints or due to change of work place (i.e. from the marine to the mining environment), the research shared, by and large, a common aim: to further the existing understanding of microbial communities. The methodological approach chosen to achieve this aim employed both isolation followed by the characterisation of microorganisms and culture-independent techniques. Both of these strategies in turn utilised a variety of methods, but techniques in molecular biology represent a common theme. In particular, the polymerase chain reaction (PCR) formed the workhorse for much of the research, since it has been routinely used for the amplification of a marker gene for strain identification or analysis of the microbial diversity. To achieve this, the amplicons were either directly sequenced by the Sanger approach or analysed via the application of genetic fingerprint techniques or through Sanger sequencing of individual amplicons cloned into a heterologous host. However, the Author did not stop at these ‘classical’ approaches for the analysis of microbial communities, but utilised the advances made in the development of nucleotide sequence analysis. In particular, the highly parallelised sequencing techniques (e.g. 454 pyrosequencing, Illumina sequencing) offered the chance to obtain both high genetic resolution of the microbial diversity present in a sample and identification of many individuals through sequence comparison with appropriate sequence repositories. Moreover, these next generation sequencing (NGS) techniques also provided a cost-effective opportunity to extend the characterisation of microbial strains to non-clonal cultures and even to complex microbial assemblages (metagenomics).
The work involving the high-throughput sequencing techniques has been undertaken in collaboration with Dr Jack Gilbert (PML, later at Argonne National Laboratory, USA) and, since moving to Freiberg, with Dr Anja Poehlein (Goettingen University). These colleagues are thanked for their support with sequence data handling and analyses.
|
18 |
A recombineering pipeline for functional genomics applied to Caenorhabditis elegans
Sarov, Mihail 11 December 2006
Genome sequencing and annotation projects define the complete sets of RNA and protein components for living systems. They also present the challenge of generating functional information for thousands of previously uncharacterized genes. Protein tagging with fluorescent or affinity tags provides a generic way to describe protein expression and localization patterns and protein-protein interactions. The genome-wide application of this approach in Saccharomyces cerevisiae has resulted in a comprehensive picture of the core proteome of a simple, well-studied model system. Extending these studies to more complex, multicellular model organisms would allow us to place protein function onto a four-dimensional space-time map and will improve our understanding of the complex processes of development and differentiation. This will require efficient protein tagging methods and new high-performance tags. Here we present a generic protein tagging approach for the model nematode Caenorhabditis elegans. The method is based on recombination-mediated DNA engineering of genomic BAC clones into tagged transgenes for integrative transformation. C. elegans offers unique advantages for function discovery through protein tagging: a compact and well-annotated genome, combined with a simple and well-understood anatomy and pattern of development. However, the methods for protein tagging in C. elegans have so far been inefficient and largely dependent on artificial cDNA-based constructs, which can lack important regulatory elements. In contrast, our approach combines the advantages of authentic regulation with a new application of recombineering, which is simple, fast and efficient. For the first time we apply liquid culture cloning for multiple recombineering steps. This is particularly important when high-throughput applications are considered, as it offers significant advantages in scale-up and automation.
We show that the BAC-derived transgenes can be used for stable, integrative transformation in C. elegans. We show that the tagged transgene can take over the function of its endogenous counterpart. Using a fluorescent reporter, we reproduce known expression patterns and document new ones. The second part of the thesis describes a project that we undertook to develop improved double affinity cassettes for protein purification. We evaluated the performance of five new double tag combinations in vitro and in cultured mammalian cells. All of the new cassettes performed well and present a valuable tool for protein interaction studies in higher model systems.
|
19 |
The long and the short of computational ncRNA prediction
Rose, Dominic, 12 November 2010
Non-coding RNAs (ncRNAs) are transcripts that function directly as RNA molecules without ever being translated into protein.
The transcriptional output of eukaryotic cells is diverse, pervasive, and multi-layered. It consists of spliced as well as unspliced transcripts of both protein-coding messenger RNAs and functional ncRNAs. However, it also contains degradable non-functional by-products and artefacts, which is certainly one reason why ncRNAs were long wrongly dismissed as transcriptional noise.
Today, RNA-controlled regulatory processes are broadly recognized for a variety of ncRNA classes. The thermoresponsive ROSE ncRNA (repression of heat shock gene expression) is only one example of a regulatory ncRNA acting at the post-transcriptional level via conformational changes of its secondary structure.
Bioinformatics helps to identify novel ncRNAs in the bulk of genomic and transcriptomic sequence data which are produced at ever increasing rates. However, ncRNA annotation is unfortunately not part of generic genome annotation pipelines. Dedicated computational searches for particular ncRNAs are veritable research projects in their own right. Despite best efforts, ncRNAs across the animal phylogeny remain to a large extent uncharted territory.
This thesis describes a comprehensive collection of exploratory bioinformatic field studies designed to predict ncRNA genes de novo in a series of computational screens and in a multitude of newly sequenced genomes. Non-coding RNAs can be divided into subclasses (families) according to particular functional, structural, or compositional similarities. A simple but suitable and frequently applied criterion for classifying RNA species is length. Accordingly, the thesis is structured into two parts: we present a series of pilot studies investigating (1) the short and (2) the long ncRNA repertoire of several model species by means of state-of-the-art bioinformatic techniques.
In the first part of the thesis, we focus on the detection of short ncRNAs exhibiting thermodynamically stable and evolutionarily conserved secondary structures. We provide evidence for the presence of short structured ncRNAs in a variety of different species, ranging from bacteria to insects and higher eukaryotes. In particular, we highlight drawbacks and opportunities of RNAz-based ncRNA prediction in several hitherto scarcely investigated scenarios, for example ncRNA prediction in the light of whole genome duplications. A recent microarray study provides experimental evidence for our approach: differential expression of at least one-sixth of our drosophilid RNAz predictions has been reported. Going beyond RNAz, we also manually compile a detailed annotation of short ncRNAs in schistosomes. Accumulating knowledge about the genetic material of these parasites, which cause schistosomiasis and infect millions of humans worldwide, is of utmost scientific interest.
Since the performance of any comparative genomics approach is limited by the quality of its input alignments, we introduce a novel, lightweight and fast genome-wide alignment approach: NcDNAlign. Although the tool is optimized for speed rather than sensitivity and requires only a minor fraction of the CPU time of existing programs, we demonstrate that it is essentially as sensitive and specific as competing approaches when applied to genome-wide ncRNA gene finding and analysis of ultra-conserved regions.
By design, however, prediction approaches that search for regions with an excess of mutations that maintain secondary structure motifs will miss ncRNAs that are unstructured or whose structure is not well conserved in evolution.
In the second part of the thesis, we therefore move beyond secondary structure prediction and, based on splice site detection, develop novel strategies specifically designed to identify long ncRNAs in genomic sequences, probably the open problem in current RNA research. We perform splice-site-anchored gene finding in drosophilid, nematode, and vertebrate genomes and, at least for a subset of the candidate genes obtained, provide experimental evidence for expression and for the existence of novel spliced transcripts, confirming our approach.
In summary, we found evidence for a large number of previously undescribed RNAs, which consolidates the idea of non-coding RNAs as an abundant class of regulatorily active transcripts. Certainly, ncRNA prediction is a complex task. This thesis, however, offers rational guidance on how to unveil the RNA complement of newly sequenced genomes. Since our results have already prompted subsequent computational as well as experimental studies, we believe that we have given a lasting stimulus to the field of RNA research and have contributed to an enriched view of the subject.
|
20 |
A toolkit for visualization of patterns of gene expression in live Drosophila embryos
Ejsmont, Radoslaw, 07 April 2011
Developing biological systems can be approximately described as complex, three-dimensional cellular assemblies that change dramatically across time as a consequence of cell proliferation, differentiation and movements. The presented project aims to overcome the limited resolution in both space and time of classical analysis by in situ hybridization on fixed tissue. The newly developed Single Plane Illumination Microscopy (SPIM), combined with new approaches for in vivo data acquisition and processing, promises to yield high-resolution four-dimensional data of the complete Drosophila embryogenesis. We developed a toolkit for high-throughput gene engineering in flies that provides the means for creating faithful in vivo reporters of gene expression during Drosophila melanogaster development. The cornerstone of the toolkit is a fosmid genomic library enabling high-throughput recombineering and φC31-mediated site-specific transgenesis. The dominant 3xP3-dsRed selectable marker on the fosmid backbone allows, in principle, transgenesis of the fosmid clones into any non-melanogaster species. In order to extend the capabilities of the gene engineering toolkit to include “evo-devo” studies, we generated genomic fosmid libraries for other sequenced Drosophilidae: D. virilis, D. simulans and D. pseudoobscura. The libraries for these species were constructed in the pFlyFos vector, allowing for recombineering modification and φC31 transgenesis of non-melanogaster genomic loci into D. melanogaster. We have developed a PCR pooling strategy to identify clones for a specific gene from the libraries without extensive clone sequencing and mapping. The clones from these libraries will be primarily used for cross-species gene expression studies. As another application, transgenes originating from closely related species can be used to rescue D. melanogaster RNAi phenotypes and establish their specificity.
Together with SPIM, the toolkit will allow visualization of gene expression patterns throughout Drosophila development.
|