1 |
Einzelzell-basierte Methoden zur Charakterisierung Schwamm-assoziierter Bakterien / Single cell based methods for the characterization of sponge-associated bacteria
Siegl, Alexander. January 2009
Schwämme (Phylum Porifera) sind der älteste rezente Tierstamm der Erde. Insbesondere marine Vertreter dieser sessilen Invertebraten sind oftmals mit einem mikrobiellen Konsortium assoziiert, welches hochgradig wirtsspezifisch und phylogenetisch divers ist. Die Biomasse dieser Mikroflora kann dabei rund die Hälfte der Masse eines Schwamms ausmachen. Die Komplexität des Konsortiums sowie der Mangel an kultivierbaren Vertretern der Schwamm-spezifischen Kladen erschwert dabei eine gezielte funktionelle Charakterisierung. Von besonderem Interesse hierbei ist das exklusiv in marinen Schwämmen vorzufindende Candidatus Phylum Poribacteria, für das bislang kein kultivierter Vertreter vorliegt. Die metabolisch aktiven und hochabundanten Poribakterien liegen in der extrazellulären Matrix des Schwammes vor und zeichnen sich durch das Vorhandensein einer Nukleoid-ähnlichen intrazellulären Struktur aus. Ziel dieser Promotionsarbeit war es, neue Einzelzell-basierte Methoden auf das Gebiet der funktionellen Charakterisierung von Bakterien anzuwenden, welche spezifisch mit dem mediterranen Schwamm Aplysina aerophoba assoziiert sind. Dabei wurden sowohl kultivierungs-abhängige, als auch kultivierungs-unabhängige Versuchsansätze verfolgt. Das Hauptaugenmerk dieser Studien lag dabei auf dem Candidatus Phylum Poribacteria. Während auf dem ‚dilution-to-extinction‘-Prinzip beruhende Hochdurchsatz-Kultivierungen nicht zum Erhalt einer Schwammsymbionten-Reinkultur führten, konnten durch eine Kombination aus FACS-Vereinzelung von Schwamm-assoziierten Bakterien und anschließenden Einzel-Genom-Amplifizierungen (‚whole genome amplifications‘) umfassende Einblicke in die metabolischen Kapazitäten von Schwammsymbionten gewonnen werden. Ferner gelang durch die Anwendung dieser neuen kultivierungs-unabhängigen Methode eine spezifische Verknüpfung von Phylogenie und Funktion Schwamm-assoziierter, nicht-kultivierbarer Bakterien. 
So konnte im Rahmen dieser Dissertation eine neue nicht-ribosomale Peptidsynthetase (NRPS) einem Vertreter einer Schwamm-spezifischen Chloroflexi-Klade zugewiesen werden. Ferner gelang die Zuordnung einer exklusiv in marinen Schwämmen vorgefundenen Polyketidsynthase (Sup-PKS) zu den Poribacteria. Die Klonierung von hochmolekularer, Einzel-Genom-amplifizierter DNA in Cosmide gewährte zudem Einblicke in den genomischen Kontext dieser, mit dem bakteriellen Sekundärmetabolismus assoziierten Gene. Die Pyrosequenzierung eines amplifizierten, von einem einzelnen Poribakterium abstammenden Genoms führte zudem zum Erhalt von rund zwei Megabasen an genetischer Information über diese Schwammsymbionten. Dadurch wurden detaillierte Informationen über den poribakteriellen Primär- und Sekundärstoffwechsel gewonnen. Die Auswertung der automatisch annotierten 454-Daten erlaubte die Rekonstruktion von Stoffwechselwegen, so z.B. der Glykolyse oder des Citratzyklus und bestätigte das Vorhandensein eines Sup-PKS-Gens im poribakteriellen Genom. Ferner konnten Gemeinsamkeiten mit den Schwesterphyla Planctomycetes, Chlamydiae und Verrucomicrobia gefunden werden. Zudem zeigte die vergleichende Analyse mit einem poribakteriellen Referenzklon aus einer bestehenden Metagenombank die genomische Mikroheterogenität innerhalb dieses Phylums. Nicht zuletzt konnte die Auswertung der poribakteriellen 454-Sequenzierung eine Reihe von möglichen Symbiose-Determinanten aufdecken, die beispielsweise am Austausch von Metaboliten zwischen den Interaktionspartnern beteiligt sind. Die Ergebnisse dieser Dissertationsarbeit stellen die Basis für eine gezielte und detaillierte funktionelle Beschreibung einzelner Bakterien innerhalb komplexer mikrobieller Konsortien dar, wie sie in marinen Schwämmen vorzufinden sind. Dieser Studie gewährte erstmalig umfassende Einblicke in das genomische Potential der nicht-kultivierten, Schwamm-assoziierten Poribacteria. 
Weiterführende Einzelzell-basierte Experimente werden in Zukunft dazu beitragen, das Bild von der Interaktion zwischen Bakterien und eukaryontischen Wirten zu komplettieren. / Sponges (phylum Porifera) represent the evolutionarily oldest of all extant animal phyla. Especially marine members of these sessile invertebrates are well known to be permanently associated with microbial consortia, which are highly host-specific and phylogenetically diverse. About half of the sponge’s biomass can be made up of this microflora. However, the complexity of the consortia as well as the lack of cultured representatives impedes a directed functional characterization of sponge-specific bacterial phylotypes. Of special interest in this context is the candidate phylum Poribacteria, whose members have so far been exclusively detected in marine sponges. As indicated by the annex ‘candidate’, no cultured representative exists for the Poribacteria. The metabolically active and abundant Poribacteria are located in the sponge extracellular matrix and are characterized by the presence of a nucleoid-like organelle. The aim of this dissertation was the application of novel single cell based methods to the field of sponge microbiology for functional characterization of bacteria specifically associated with the Mediterranean sponge Aplysina aerophoba. For that purpose, cultivation-dependent as well as cultivation-independent approaches were pursued. Particular attention was paid to the candidate phylum Poribacteria. While high-throughput cultivation experiments based on the ‘dilution-to-extinction’ principle did not yield a sponge symbiont in pure culture, extensive insights into the metabolic properties of sponge-associated bacteria were gained by dissecting the microbial consortia using FACS-sorting with subsequent ‘whole genome amplifications’. In addition, this approach enabled a specific linkage between phylogeny and function of sponge-specific, non-culturable bacteria. 
Within the scope of this PhD thesis, a novel nonribosomal peptide synthetase (NRPS) was assigned to a member of a sponge-specific clade within the phylum Chloroflexi. Moreover, a class of polyketide synthases found exclusively in marine sponges (Sup-PKS) was shown to be encoded by the Poribacteria. Cosmid cloning of amplified genomic DNA derived from FACS-sorted sponge microbes provided insights into genes associated with secondary metabolism and their adjacent genomic context. Pyrosequencing of a single amplified genome derived from a member of the Poribacteria yielded almost two megabases of genetic information about this sponge symbiont. Data analysis provided detailed insights into poribacterial primary and secondary metabolism. Analysis of the automatically annotated 454 data enabled the reconstruction of metabolic pathways such as glycolysis and the citric acid cycle. Furthermore, the presence of the Sup-PKS gene in the poribacterial genome was confirmed. Moreover, features shared with the sister phyla Planctomycetes, Chlamydiae and Verrucomicrobia were traced within the poribacterial data set. Additionally, the comparative study with a poribacterial reference clone from an existing metagenomic library revealed genomic microheterogeneity within the phylum Poribacteria. Finally, interpretation of the 454 sequencing data exposed a set of putative symbiosis determinants, such as metabolite exchange factors, required for the establishment and maintenance of the symbiosis with the sponge host. The results of this dissertation provide a basis for a directed and detailed functional characterization of single bacteria within complex microbial consortia such as those found in marine sponges. This study provided a comprehensive picture of the genomic potential of the uncultured sponge-associated Poribacteria. Continued single cell based experiments will lead to a better knowledge of the mechanisms of interaction between bacteria and eukaryotic hosts.
|
2 |
Stealth tRNAs: Strategies for mining orthogonal tRNA candidates from genomic data
Ohlsson, Ingemar. January 2015
No description available.
|
3 |
Insights into the Evolution of small nucleolar RNAs
Canzler, Sebastian. 26 January 2017
Over the last decades, the formerly irrevocable belief that proteins are the only key factors in the complex regulatory machinery of a cell was crushed by a plethora of findings in all major eukaryotic lineages. These suggested a rugged landscape in the eukaryotic genome consisting of sequential, overlapping, or even bi-directional transcripts and myriads of regulatory elements. The vast part of the genome is indeed transcribed into an RNA intermediate, but solely a small fraction is finally translated into functional proteins. The sweeping majority, however, is either degraded or functions as a non-protein coding RNA (ncRNA).
Due to continuous developments in experimental and computational research, the variety of known ncRNA classes has grown steadily, with functions ranging from key processes in the cellular lifespan to regulatory processes that are driven and guided by ncRNAs. The bioinformatical part primarily concentrates on the prediction, annotation, and extraction of characteristic properties of novel ncRNAs. Due to conservation of sequence and/or structure, this task is often accomplished by a homology search that uses information about functional, and hence conserved, regions as an indicator.
This thesis focuses mainly on a special class of ncRNAs, small nucleolar RNAs (snoRNAs). These abundant molecules are mainly responsible for the guidance of 2’-O-ribose-methylations and pseudouridylations in different types of RNAs, such as ribosomal and spliceosomal RNAs. Although the relevance of single modifications is still rather unclear, the elimination of several modifications has been shown to cause severe effects, including lethality.
Several de novo prediction programs have been published over the last years and a substantial number of publicly available snoRNA databases has emerged. Normally, these are restricted to a small number of species and a collection of experimentally extracted snoRNAs. The detection of snoRNAs by means of wet lab experiments and/or de novo prediction tools is generally time consuming (wet lab) and a quite tedious task (identification of snoRNA-specific characteristics).
The snoRNA annotation pipeline snoStrip was developed with the intention to circumvent these obstacles. It therefore utilizes a homology-based search procedure to reliably predict snoRNA genes in genomic sequences. In a subsequent step, all candidates are filtered with respect to specific sequence motifs and secondary structures. In a functional analysis, potential target sites are predicted in ribosomal and spliceosomal RNA sequences. In contrast to de novo prediction tools, snoStrip focuses on the extension of the known snoRNA world to uncharted organisms and the mapping and unification of the existing diversity of snoRNAs into functional, homologous families.
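The motif-filtering step described above can be illustrated with a minimal sketch. It is not snoStrip's actual implementation: the function names, the `window` parameter, and the restriction to box C/D snoRNAs are assumptions made for illustration. The sketch uses the commonly cited consensus motifs of box C/D snoRNAs (C box RUGAUGA near the 5' end, D box CUGA near the 3' end) and ignores secondary structure entirely.

```python
import re

# Consensus motifs written in the DNA alphabet (assumed simplification).
C_BOX = re.compile(r"[AG]TGATGA")  # RUGAUGA, expected near the 5' end
D_BOX = re.compile(r"CTGA")        # CUGA, expected near the 3' end


def has_cd_boxes(seq, window=20):
    """Return True if a C box lies in the first and a D box in the last `window` bases."""
    seq = seq.upper().replace("U", "T")
    head, tail = seq[:window], seq[-window:]
    return bool(C_BOX.search(head)) and bool(D_BOX.search(tail))


def filter_candidates(candidates, window=20):
    """Keep the names of (name, sequence) candidates that pass the motif check."""
    return [name for name, seq in candidates if has_cd_boxes(seq, window)]


# Hypothetical homology-search hits, invented for this example.
hits = [
    ("cand_1", "AAGTGATGAC" + "T" * 30 + "ACTGATT"),
    ("cand_2", "T" * 50),
]
kept = filter_candidates(hits)
```

A real filter would also score motif mismatches and check the terminal stem, which this sketch leaves out.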
The pipeline is well suited to analyzing a diverse set of organisms in search of their snoRNAome on short timescales. This offers the opportunity to generate large-scale analyses over whole eukaryotic kingdoms to gain insights into the evolutionary history of these special ncRNA molecules. A set of experimentally validated snoRNA genes in Deuterostomia and Fungi were starting points for highly comprehensive surveys searching and analyzing the snoRNA repertoire in these two major eukaryotic clades. In both cases, the snoStrip pipeline proved itself as a fast and reliable tool and collected thousands of snoRNA genes in nearly 200 organisms. Additionally, the Interaction Conservation Index (ICI), which has been extended to also work on single lineages, provides a convenient measure to analyze and evaluate the conservation of snoRNA-targetRNA interactions across different species. The massive amount of data and the possibility to score the conservation of predicted interactions constitute the main pillars to gain an extraordinary insight into the evolutionary history of snoRNAs on both the sequence and the functional level. A substantial part of the snoRNAome is traceable down to the root of both eukaryotic lineages and might indicate an even more ancient origin of these snoRNAs. However, a plenitude of lineage-specific innovation and deletion events are also discernible. Due to its automated detection of homologous and functionally related snoRNA sequences, snoStrip identified extraordinary target switches in fungi. These unveiled a coupled evolutionary history of several snoRNA families that were previously thought to be independent. Although these findings are exceedingly interesting, the broad majority of snoRNA families is found to show remarkable conservation of the sequence and the predicted target interactions.
On two occasions, this thesis will shift its focus from a genuine snoRNA inspection to an analysis of introns. Both investigations, however, are still conducted under an evolutionary viewpoint. In case of the ubiquitously present U3 snoRNA, functional genes in a notable number of fungi are found to be disrupted by U2-dependent introns. The set of previously known U3 genes is considerably enlarged by an adapted snoStrip search procedure. Intron-disrupted genes are found in several fungal lineages, while their precise insertion points within the snoRNA precursor are located in a small and homologous region. A potential targetRNA of snoRNA genes, U6 snRNA, is also found to contain intronic sequences. Within this work, U6 genes are detected and annotated in nearly all fungal organisms. Although a few U6 intron-carrying genes have been known before, how widespread these findings are and the diversity of the particular insertion points are surprising. Those U6 genes are commonly found to contain more than just one intron. In both cases of intron-disrupted non-coding RNA genes, the detected RNA molecules seem to be functional and the intronic sequences show remarkable sequence conservation for both their splice sites and the branch site.
In summary, the snoStrip pipeline is shown to be a reliable and fast prediction tool that works on homology-based search principles. Large-scale analyses on whole eukaryotic lineages become feasible on short notice. Furthermore, the automated detection of functionally related but not yet mapped snoRNA families adds a new layer of information. Based on surveys covering the evolutionary history of Fungi and Deuterostomia, profound insights into the evolutionary history of this ncRNA class are revealed, suggesting an ancient origin for a main part of the snoRNAome. Lineage-specific innovation and deletion events are also found to occur at a large number of distinct timepoints.
|
4 |
New statistical Methods of Genome-Scale Data Analysis in Life Science - Applications to enterobacterial Diagnostics, Meta-Analysis of Arabidopsis thaliana Gene Expression and functional Sequence Annotation / Neue statistische Methoden für genomweite Datenanalysen in den Biowissenschaften - Anwendungen in der Enterobakteriendiagnostik, Meta-Analyse von Arabidopsis thaliana Genexpression und funktionsbezogenen Sequenzannotation
Friedrich, Torben. January 2009
Recent progress and developments in molecular biology provide a wealth of new but insufficiently characterised data. This fund comprises, amongst others, biological data on genomic DNA, protein sequences, 3-dimensional protein structures as well as profiles of gene expression. In the present work, this information is used to develop new methods for the characterisation and classification of organisms and whole groups of organisms as well as to enhance the automated gain and transfer of information. The first two presented approaches (chapters 4 and 5) focus on the medically and scientifically important enterobacteria. Their impact in medicine and molecular biology is founded in versatile mechanisms of infection, their fundamental function as commensal inhabitants of the intestinal tract and their use as model organisms, as they are easy to cultivate. Despite many studies on single pathogroups with clinically distinguishable pathologies, the genotypic factors that contribute to their diversity are still partially unknown. The comprehensive genome comparison described in chapter 4 was conducted with numerous enterobacterial strains, which cover nearly the whole range of clinically relevant diversity. The genome comparison constitutes the basis of a characterisation of the enterobacterial gene pool, of a reconstruction of evolutionary processes and of a comprehensive analysis of specific protein families in enterobacterial subgroups. Correspondence analysis, which is applied for the first time in this context, yields qualitative statements about bacterial subgroups and the respective, exclusively present protein families. By applying statistical tests, specific protein families were identified for the three major subgroups of enterobacteria, namely the genera Yersinia and Salmonella as well as the group of Shigella and E. coli. 
In conclusion, the genome comparison-based methods provide new starting points to infer specific genotypic traits of bacterial groups from the transfer of functional annotation. Due to the high medical importance of enterobacterial isolates, their classification according to pathogenicity has been the focus of many studies. Microarray technology offers a fast, reproducible and standardisable means of bacterial typing and has proven itself in bacterial diagnostics, risk assessment and surveillance. The design of the diagnostic microarray for enterobacteria described in chapter 5 is based on the availability of numerous enterobacterial genome sequences. A novel probe selection strategy based on a highly efficient string search algorithm, which considers both coding and non-coding regions of genomic DNA, enhances pathogroup detection. Compared with diagnostics restricted to virulence-associated capture probes, this principle reduces the risk of incorrect typing. Additional capture probes extend the spectrum of applications of the microarray to simultaneous diagnostics or surveillance of antimicrobial resistance. Comprehensive test hybridisations largely confirm the reliability of the selected capture probes and their ability to robustly classify enterobacterial strains according to pathogenicity. Moreover, the tests constitute the basis for training a regression model for the classification of pathogroups and the prediction of hybridised amounts of DNA. The regression model features a continuous learning capacity, leading to an enhancement of the prediction accuracy in the process of its application. A fraction of the capture probes represents intergenic DNA and hence confirms the relevance of the underlying strategy. Interestingly, a large part of the capture probes represents poorly annotated genes, suggesting the existence of as yet unconsidered factors with importance to the formation of the respective virulence phenotypes. Another major field of microarray applications is gene expression analysis. 
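The idea of selecting probes by string search over whole genomes, coding and non-coding regions alike, can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm used in the thesis: it uses exact k-mer matching, ignores reverse complements and hybridisation properties, and all function names and toy sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def group_specific_probes(target_genomes, other_genomes, k):
    """Return k-mers present in every target genome and absent from all others.

    Genomes are plain strings, so coding and non-coding regions are
    treated identically.
    """
    shared = set.intersection(*(kmers(g, k) for g in target_genomes))
    background = set().union(*(kmers(g, k) for g in other_genomes))
    return sorted(shared - background)


# Toy sequences standing in for one pathogroup and one out-group genome.
pathogroup = ["ACGTAC", "TTACGT"]
outgroup = ["GGGTAC"]
probes = group_specific_probes(pathogroup, outgroup, k=4)
```

For real genomes, an index structure such as a suffix array would replace the in-memory sets, which grow with genome size.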
The size of gene expression databases has rapidly increased in recent years. Although they provide a wealth of expression data, it remains challenging to integrate results from different studies. In chapter 6 the methodology of an unsupervised meta-analysis of genome-wide A. thaliana gene expression data sets is presented, which yields novel insights into the function and regulation of genes. The application of kernel-based principal component analysis in combination with hierarchical clustering identified three major groups of contrasts, each sharing overlapping expression profiles. Genes associated with two groups are known to play important roles in indole-3-acetic acid (IAA) mediated plant growth and development as well as in pathogen defence. As yet uncharacterised serine-threonine kinases could be assigned to novel functions in pathogen defence by meta-analysis. In general, hidden interrelations between genes regulated under different conditions could be unravelled by the described approach. Hidden Markov models (HMMs) are applied to the functional characterisation of proteins or the detection of genes in genome sequences. Although HMMs are technically mature and widely applied in computational biology, I demonstrate a methodical optimisation with respect to the modelling accuracy on biological data with various distributions of sequence lengths. The subunits of these models, the states, are associated with a certain holding time, which is the link to the length distributions of the represented sequences. An adaptation of simple HMM topologies to bell-shaped length distributions, described in chapter 7, was achieved by serial chain-linking of single states while remaining within the class of conventional HMMs. The impact of an optimisation of HMM topologies was underlined by performance evaluations with differently adjusted HMM topologies. 
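Why chaining states yields bell-shaped length distributions can be shown with a short sketch. A single state with self-transition probability p has a geometric holding time, which is monotonically decreasing. A chain of n identical states instead has a negative binomial holding time, P(L = l) = C(l-1, n-1) (1-p)^n p^(l-n) for l >= n, with mean n/(1-p) and variance np/(1-p)^2. The code below assumes identical states and uses one simple method-of-moments fit; the function names are hypothetical and the estimator is not necessarily the one developed in the thesis.

```python
from math import comb


def chain_length_pmf(length, n_states, p_stay):
    """Probability that a chain of n_states identical states emits `length` symbols."""
    if length < n_states:
        return 0.0
    return (comb(length - 1, n_states - 1)
            * (1 - p_stay) ** n_states
            * p_stay ** (length - n_states))


def moment_estimate(lengths):
    """Fit (n_states, p_stay) to observed lengths by matching mean and variance."""
    m = sum(lengths) / len(lengths)
    v = sum((x - m) ** 2 for x in lengths) / len(lengths)
    # Solving mean = n/(1-p) and var = n*p/(1-p)^2 gives n = m^2 / (m + v).
    n = max(1, int(m * m / (m + v)))
    # Choose p so that the fitted mean stays exact after truncating n.
    p = 1 - n / m
    return n, p


n_states, p_stay = moment_estimate([4, 6, 8, 6, 6])
```

With n_states = 1 the formula reduces to the ordinary geometric distribution of a single self-looping state.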
In summary, a general methodology was introduced to improve the modelling behaviour of HMMs by topological optimisation with maximum likelihood and a fast and easily implementable moment estimator. Chapter 8 describes the application of HMMs to the prediction of interaction sites in protein domains. As previously demonstrated, these sites are not trivial to predict because of the varying degree of conservation of their location and type within the domain family. The prediction of interaction sites in protein domains is achieved by a newly defined HMM topology, which incorporates both sequence and structure information. Posterior decoding is applied to the prediction of interaction sites, providing additional information on the probability of an interaction for all sequence positions. The implementation of interaction profile HMMs (ipHMMs) is based on the well-established profile HMMs and inherits their known efficiency and sensitivity. The large-scale prediction of interaction sites by ipHMMs explained protein dysfunctions caused by mutations that are associated with inheritable diseases such as different types of cancer or muscular dystrophy. Like profile HMMs, the ipHMMs are suitable for large-scale applications. Overall, the HMM-based method enhances the prediction quality of interaction sites and improves the understanding of the molecular background of inheritable diseases. With respect to current and future requirements, I provide large-scale solutions for the characterisation of biological data in this work. All described methods are highly portable, which allows for their transfer to related topics or organisms. Special emphasis was put on the knowledge transfer facilitated by a steadily increasing wealth of biological information. The applied and developed statistical methods largely provide learning capacities and hence benefit from the gain of knowledge, resulting in increased prediction accuracies and reliability. 
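Posterior decoding, mentioned above for interaction-site prediction, can be illustrated on a generic two-state HMM. The sketch below is a plain forward-backward implementation and not the ipHMM topology of the thesis; the state labels, the toy probabilities, and the function name are assumptions for illustration. It omits numerical scaling, so it is only suitable for short sequences.

```python
def posterior_decode(obs, states, start, trans, emit):
    """Return, for every position, the posterior probability of each state."""
    n = len(obs)
    fwd = [{} for _ in range(n)]
    bwd = [{} for _ in range(n)]
    # Forward pass: probability of the prefix ending in state s.
    for s in states:
        fwd[0][s] = start[s] * emit[s][obs[0]]
    for t in range(1, n):
        for s in states:
            fwd[t][s] = emit[s][obs[t]] * sum(
                fwd[t - 1][r] * trans[r][s] for r in states)
    # Backward pass: probability of the suffix given state s at position t.
    for s in states:
        bwd[n - 1][s] = 1.0
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = sum(
                trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r] for r in states)
    total = sum(fwd[n - 1][s] for s in states)
    return [{s: fwd[t][s] * bwd[t][s] / total for s in states} for t in range(n)]


# Toy model: "I" = interacting position, "N" = non-interacting position.
states = ["I", "N"]
start = {"I": 0.5, "N": 0.5}
trans = {"I": {"I": 0.5, "N": 0.5}, "N": {"I": 0.5, "N": 0.5}}
emit = {"I": {"K": 0.8, "A": 0.2}, "N": {"K": 0.2, "A": 0.8}}
post = posterior_decode("KA", states, start, trans, emit)
```

Unlike a single best path, the result assigns every sequence position its own interaction probability, which is the additional information the abstract refers to.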
/ Die aktuellen Fortschritte und Entwicklungen in der Molekularbiologie stellen eine Fülle neuer, bisher kaum analysierter Daten bereit. Dieser Fundus umfasst unter Anderem biologische Daten zu genomischer DNA, zu Proteinsequenzen, zu dreidimensionalen Proteinstrukturen sowie zu Genexpressionsprofilen. In der vorliegenden Arbeit werden diese Informationen genutzt, um neue Methoden der Charakterisierung und Klassifizierung von Organismen bzw. Organismengruppen zu entwickeln und einen automatisierten Informationsgewinn sowie eine Informationsübertragung zu ermöglichen. Die ersten beiden vorgestellten Ansätze (Kapitel 4 und 5) konzentrieren sich auf die medizinisch und wissenschaftlich bedeutsame Gruppe der Enterobakterien. Deren Bedeutung für Medizin und Mikrobiologie geht auf ihre Funktion als kommensale Bewohner des Darmtraktes, ihre Nutzung als leicht kultivierbare Modellorganismen und auf die vielseitigen Infektionsmechanismen zurück. Obwohl bereits viele Studien über einzelne Pathogruppen mit klinisch unterscheidbaren Symptomen existieren, sind die genotypischen Faktoren, die für diese Unterschiedlichkeit verantwortlich zeichnen, teilweise noch nicht bekannt. Der in Kapitel 4 beschriebene umfassende Genomvergleich wurde anhand einer Vielzahl von Enterobakterien durchgeführt, die nahezu die gesamte Bandbreite klinisch relevanter Diversität darstellen. Dieser Genomvergleich bildet die Basis für eine Charakterisierung des enterobakteriellen Genpools, für eine Rekonstruktion evolutionärer Prozesse und Einflüsse und für eine umfassende Untersuchung spezifischer Proteinfamilien in enterobakteriellen Untergruppen. Die in diesem Kontext vorher noch nicht angewandte Korrespondenzanalyse liefert qualitative Aussagen zu bakteriellen Untergruppen und den ausschließlich in ihnen vorkommenden Proteinfamilien. In drei Hauptuntergruppen der Enterobakterien, die den Gattungen Yersinia und Salmonella sowie der Gruppe aus Shigella und E. 
coli entsprechen, wurden die jeweils spezifischen Proteinfamilien mit Hilfe statistischer Tests identifiziert. Zusammenfassend bilden die auf Genomvergleichen aufbauenden Methoden neue Ansatzpunkte, um aus der Übertragung der bekannten Funktionalität einzelner Proteine auf spezifische, genotypische Besonderheiten bakterieller Gruppen zu schließen. Aufgrund ihrer hohen medizinischen Relevanz war die Typisierung enterobakterieller Isolate entsprechend ihrer Pathogenität Ziel zahlreicher Studien. Die Microarray-Technologie bietet ein schnelles, reproduzierbares und standardisierbares Hilfsmittel für bakterielle Typisierung und hat sich in der Bakteriendiagnostik, Risikobewertung und Überwachung bewährt. Das in Kapitel 5 beschriebene Design eines diagnostischen Microarray beruht auf einer großen Anzahl verfügbarer Genomsequenzen von Enterobakterien. Ein hocheffizienter String-Matching-Algorithmus ist die Grundlage einer neuartigen Strategie der Sondenauswahl, die sowohl kodierende als auch nicht-kodierende Bereiche genomischer DNA berücksichtigt. Im Vergleich zu Diagnostika, die ausschließlich auf Virulenz-assoziierten Sonden beruhen, verringert dieses Prinzip das Risiko einer inkorrekten Typisierung. Zusätzliche Sonden erweitern das Anwendungsspektrum auf eine simultane Diagnostik der Antibiotikaresistenz bzw. eine Überwachung der Resistenzausbreitung. Umfangreiche Testhybridisierungen belegen eine überwiegende Zuverlässigkeit der Sonden und vor allem eine robuste Klassifizierung enterobakterieller Stämme entsprechend der Pathogruppen. Die Tests bilden zudem die Grundlage für das Training eines Regressionsmodells zur Klassifizierung der Pathogruppe und zur Vorhersage der Menge hybridisierter DNA. Das Regressionsmodell zeichnet sich durch kontinuierliche Lernfähigkeit und damit durch eine Verbesserung der Vorhersagequalität im Prozess der Anwendung aus. 
Ein Teil der Sonden repräsentiert intergenische DNA und bestätigt infolgedessen die Relevanz der zugrunde liegenden Strategie. Die Tatsache, dass ein großer Teil der von den Sonden repräsentierten Gene noch nicht annotiert ist, legt die Existenz bisher unentdeckter Faktoren mit Bedeutung für die Ausbildung entsprechender Virulenz-Phänotypen nahe. Ein weiteres Haupteinsatzgebiet von Microarrays ist die Genexpressionsanalyse. Die Größe von Genexpressionsdatenbanken ist in den vergangenen Jahren stark gewachsen. Obwohl sie eine Fülle von Expressionsdaten bieten, sind Ergebnisse aus unterschiedlichen Studien weiterhin schwer in einen übergreifenden Zusammenhang zu bringen. In Kapitel 6 wird die Methodik einer ausschließlich datenbasierten Meta-Analyse für genomweite A. thaliana Genexpressionsdatensätze dargestellt, die neue Erkenntnisse über Funktion und Regulation von Genen verspricht. Die Anwendung von Kernel-basierter Hauptkomponentenanalyse in Kombination mit hierarchischem Clustering identifizierte drei Hauptgruppen von Kontrastexperimenten mit jeweils überlappenden Expressionsmustern. In zwei Gruppen konnten deregulierte Gene wichtigen Funktionen bei Indol-3-Essigsäure (IAA) vermitteltem Pflanzenwachstum und -entwicklung sowie pflanzlicher Pathogenabwehr zugeordnet werden. Bisher funktionell nicht näher charakterisierte Serin-Threonin-Kinasen wurden über die Meta-Analyse mit der Pathogenabwehr assoziiert. Grundsätzlich kann dieser Ansatz versteckte Wechselbeziehungen zwischen Genen aufdecken, die unter verschiedenen Bedingungen reguliert werden. Bei der funktionellen Charakterisierung von Proteinen oder der Vorhersage von Genen in Genomsequenzen werden Hidden-Markov-Modelle (HMMs) eingesetzt. HMMs sind technisch ausgereift und in der computergestützten Biologie vielfach eingesetzt worden. Trotzdem birgt die Methodik das Potential zur Optimierung bezüglich der Modellierung biologischer Daten, die hinsichtlich der Längenverteilung ihrer Sequenzen variieren. 
Untereinheiten dieser Modelle, die Zustände, repräsentieren über ihre individuelle Verweildauer zugrunde liegende Verteilungen von Sequenzlängen. Kapitel 7 stellt eine Methode zur Anpassung einfacher HMM-Topologien an biologische Daten, die glockenkurvenartige Längenverteilungen zeigen, vor. Die Modellierung solcher Verteilungen wird dabei durch eine serielle Verkettung vervielfältigter Zustände gewährleistet, ohne dass die Klasse herkömmlicher HMMs verlassen wird. Auswertungen der Modellierungsleistung bei unterschiedlich stark optimierten HMM-Topologien unterstreichen die Bedeutung der entwickelten Topologieoptimierung. Zusammenfassend wird hier eine generelle Methodik beschrieben, die die Modelleigenschaften von HMMs über Topologieoptimierungen verbessert. Die Parameter dieser Optimierung werden mit Hilfe von Maximum-Likelihood und einem leicht einzubindenden Momentschätzer bestimmt. In Kapitel 8 wird die Anwendung von HMMs zur Vorhersage von Interaktionsstellen in Proteindomänen beschrieben. Wie bereits gezeigt wurde, sind solche Stellen aufgrund einer variablen Konserviertheit ihrer Position und ihres Typs schwer zu bestimmen. Eine Vorhersage von Interaktionstellen in Proteindomänen wird über die Definition einer neuen HMM-Topologie erreicht, die sowohl Sequenz- als auch Strukturdaten einbindet. Interaktionsstellen werden mit einem Posterior-Decoding-Algorithmus vorhergesagt, der zusätzliche Informationen über die Wahrscheinlichkeit einer Interaktion für alle Sequenzpositionen bereitstellt. Die Implementierung der Interaktionsprofil-HMMs (ipHMMs) basiert auf den etablierten Profil-HMMs und erbt deren Effizienz und Sensitivität. Eine groß angelegte Vorhersage von Interaktionsstellen mit ipHMMs konnte mutationsbedingte Fehlfunktionen in Proteinen erklären, die mit vererbbaren Krankheiten wie unterschiedlichen Tumortypen oder Muskeldystrophie assoziiert sind. Wie Profile-HMMs sind auch ipHMMs für groß angelegte Anwendungen geeignet. 
Insgesamt verbessert die HMM-gestützte Methode sowohl die Vorhersagequalität für Interaktionsstellen als auch das Verständnis molekularer Hintergründe bei vererbbaren Krankheiten. Im Hinblick auf aktuelle und zukünftige Anforderungen stelle ich in dieser Arbeit Lösungsansätze für eine umfassende Charakterisierung großer Mengen biologischer Daten vor. Alle beschriebenen Methoden zeichnen sich durch gute Übertragbarkeit auf verwandte Probleme aus. Besonderes Augenmerk wurde dabei auf den Wissenstransfer gelegt, der durch einen stetig wachsenden Fundus biologischer Information ermöglicht wird. Die angewandten und entwickelten statistischen Methoden sind lernfähig und profitieren von diesem Wissenszuwachs, Vorhersagequalität und Zuverlässigkeit der Ergebnisse verbessern sich.
|
5 |
Gene and genome duplication and the evolution of novel gene functions
Steinke, Dirk. January 2006
Konstanz, Univ., Diss., 2005.
|
6 |
Building a genomic variant based prediction model for lung cancer toxicity / Konstruktion av en genvartiants-baserad prediktionsmodell för lungcancertoxicitet
Janvid, Vincent. January 2021
Since the completion of the Human Genome Project in 2003, the evident complexity of our genome and its regulation has only grown. The idea that sequencing the human genome would solve this mystery was quickly discarded. With the decreasing costs of DNA sequencing, a plethora of new methods have evolved to further understand the role of the non-coding regions of our genome, which make up 98% of its length. Genetic variations in these regions are therefore abundant in the human population, but their effects are hard to characterize. Many non-coding variants have been linked to complex diseases such as cancer predisposition. This thesis aims to investigate the potential effects of non-coding variants on drug toxicity, that is, how severe the adverse effects of a drug are for the treated patients. More specifically, it studies the effects of two cancer drugs, Gemcitabine and Carboplatin, on a set of 96 patients with lung cancer. To do this we use spatial data acquired by the promoter-targeting method HiCap as well as expression data obtained from blood cell lines. Using the variants obtained through whole genome sequencing of the patients, a supervised learning approach was attempted to predict the final toxicity experienced by the patients. The large number of variants present among the comparably few patients resulted in poor accuracy. The conclusion was drawn that the resolution of HiCap is too low compared to the density of variants in the non-coding regions. Additional data, such as transcription factor ChIP-seq data and transcription factor motifs, are needed to locate potentially contributing variants within the interactions. / Since the first sequencing of the human genome in 2003, our picture of the genome and how it is regulated has only become more complex. The idea that having access to a complete genome would solve this mystery was quickly discarded.
With the falling costs of sequencing, a wide range of new methods has been developed to better understand the role of the non-coding regions of our genome. Because these regions make up 98% of our DNA, they contain great variation across the human population, but predicting their effect is very difficult. Many non-coding variants have been linked to complex diseases, such as an increased risk of cancer. This thesis aims to investigate the potential effects of non-coding variants on how severe the side effects of a cancer treatment are for a patient. More specifically, the effects of two drugs, Gemcitabine and Carboplatin, on 96 lung cancer patients are studied. For this, spatial data and gene expression data from blood cell lines are used. Starting from the genetic variants in the patients' sequenced genomes, supervised learning was tested to predict the degree of side effects in the patients. The large number of variants carried by the comparatively few patients resulted in low predictor accuracy. It was concluded that the resolution of HiCap is too low compared with the high density of variants in non-coding regions. More data, such as ChIP-seq data for transcription factors and their specific binding sequences, are needed to locate variants within an interaction that could potentially affect the side effects.
|
7 |
Functional genomic analysis of cell cycle progression in human tissue culture cells
Kittler, Ralf 19 October 2006 (has links) (PDF)
The eukaryotic cell cycle orchestrates the precise duplication and distribution of the genetic material, cytoplasm and membranes to daughter cells. In multicellular eukaryotes, cell cycle regulation also governs various organisational processes ranging from gametogenesis through multicellular development to tissue formation and repair. Consequently, defects in cell cycle regulation provoke a variety of human cancers. A global view of genes and pathways governing the human cell cycle would advance many research areas and may also deliver novel cancer targets. This work therefore aimed at the genome-wide identification and systematic characterisation of genes required for cell cycle progression in human cells. I developed a highly specific and efficient RNA interference (RNAi) technology to realize the potential of RNAi for genome-wide screening of the genes essential for cell cycle progression in human tissue culture cells. This approach is based on the large-scale enzymatic digestion of long dsRNAs for the rapid and cost-efficient generation of libraries of highly complex pools of endoribonuclease-prepared siRNAs (esiRNAs). The analysis of the silencing efficiency and specificity of esiRNAs and siRNAs revealed that esiRNAs are as efficient for mRNA degradation as chemically synthesized siRNAs designed with state-of-the-art design algorithms, while exhibiting a markedly reduced number of off-target effects. After demonstrating the effectiveness of this approach in a proof-of-concept study, I screened a genome-wide esiRNA library and used three assays to generate a quantitative and reproducible multi-parameter profile for the 1389 identified genes. The resulting phenotypic signatures were used to assign novel cell cycle functions to genes by combining hierarchical clustering, bioinformatics and proteomic data mining.
This global perspective on gene functions in the human cell cycle presents a framework for the systematic documentation necessary for the understanding of cell cycle progression and its misregulation in diseases. The identification of novel genes with a role in human cell cycle progression is a starting point for an in-depth analysis of their specific functions, which requires the validation of the observed RNAi phenotype by genetic rescue, the study of the subcellular localisation and the identification of interaction partners of the expressed protein. One strategy to achieve these experimental goals is the expression of RNAi resistant and/or tagged transgenes. A major obstacle for transgenesis in mammalian tissue culture cells is the lack of efficient homologous recombination limiting the use of cultured mammalian cells as a real genetic system like yeast. I developed a technology circumventing this problem by expressing an orthologous gene from a closely related species including its regulatory sequences carried on a bacterial artificial chromosome (BAC). This technology allows physiological expression of the transgene, which cannot be achieved with conventional cDNA expression constructs. The use of the orthologous gene from a closely related species confers RNAi resistance to the transgene allowing the depletion of the endogenous gene by RNAi. Thus, this technology mimics homologous recombination by replacing an endogenous gene with a transgene while maintaining normal gene expression. In combination with recombineering strategies this technology is useful for RNAi rescue experiments, protein localisation and the identification of protein interaction partners in mammalian tissue culture cells. In summary, this thesis presents a major technical advance for large-scale functional genomic studies in mammalian tissue culture cells and provides novel insights into various aspects of cell cycle progression. 
(Each printed copy includes a CD-ROM as a supplement: 217 MB: movies, raw data. Access: Referat Informationsvermittlung of the SLUB)
|
8 |
Assessment of complex microbial assemblages: description of their diversity and characterisation of individual members
Mühling, Martin 01 February 2017 (has links) (PDF)
1. Microbial ecology
According to Caumette et al. (2015) the term ecology is derived from the Greek words “oikos” (the house and its operation) and “logos” (the word, knowledge or discourse) and can, therefore, be defined as the scientific field engaged in the “knowledge of the laws governing the house”. This, in extension, results in the simple conclusion that microbial ecology represents the study of the relationship between microorganisms, their co-occurring biota and the prevailing environmental conditions (Caumette et al. 2015).
The term microbial ecology has been in use since the early 1960s (Caumette et al. 2015) and microbial ecologists have made astonishing discoveries since. Microbial life at extremes such as in the hydrothermal vents (see Dubilier et al. 2008 and references therein) or the abundance of picophytoplankton (Waterbury et al. 1979; Chisholm et al. 1988) in the deep and surface waters of the oceans, respectively, are only a few of many highlights. Nevertheless, a microbial ecologist who left the field early in their career and now intends to return would hardly recognise it. The main reason lies in the advances made in the methodologies employed in the field. Most of these were developed for biomedical research and were subsequently hijacked, sometimes with minor modifications, by microbial ecologists.
The Author presents in this thesis scientific findings which, although spanning only a fraction of the era of research into microbial ecology, have been obtained using various modern tools of the trade. These studies were undertaken by the Author during his employment as postdoctoral scientist at Warwick University (UK), as member of staff at Plymouth Marine Laboratory (UK) and as scientist at the TU Bergakademie Freiberg. Although the scientific issues and the environmental habitats investigated by the Author changed due to funding constraints or a change of workplace (i.e. from the marine to the mining environment), the research shared, by and large, a common aim: to further the existing understanding of microbial communities. The methodological approach chosen to achieve this aim employed both isolation followed by characterisation of microorganisms and culture-independent techniques. Both strategies in turn utilised a variety of methods, but techniques in molecular biology represent a common theme. In particular, the polymerase chain reaction (PCR) formed the workhorse for much of the research, since it has been routinely used for the amplification of a marker gene for strain identification or analysis of microbial diversity. To achieve this, the amplicons were either sequenced directly by the Sanger approach, analysed by genetic fingerprinting techniques, or Sanger-sequenced after individual amplicons had been cloned into a heterologous host. However, the Author did not rest on these ‘classical’ approaches for the analysis of microbial communities, but utilised the advances made in nucleotide sequence analysis. In particular, the highly parallelised sequencing techniques (e.g. 454 pyrosequencing, Illumina sequencing) offered the chance to obtain both high genetic resolution of the microbial diversity present in a sample and identification of many individuals through sequence comparison with appropriate sequence repositories. Moreover, these next generation sequencing (NGS) techniques also provided a cost-effective opportunity to extend the characterisation of microbial strains to non-clonal cultures and even to complex microbial assemblages (metagenomics).
The work involving the high throughput sequencing techniques has been undertaken in collaboration with Dr Jack Gilbert (PML, later at Argonne National Laboratory, USA) and, since the move to Freiberg, with Dr Anja Poehlein (Goettingen University). These colleagues are thanked for their support with sequence data handling and analyses.
|
9 |
A recombineering pipeline for functional genomics applied to Caenorhabditis elegans
Sarov, Mihail 19 February 2007 (has links) (PDF)
Genome sequencing and annotation projects define the complete sets of RNA and protein components for living systems. They also present the challenge of generating functional information for thousands of previously uncharacterized genes. Protein tagging with fluorescent or affinity tags provides a generic way to describe protein expression and localization patterns and protein-protein interactions. The genome-wide application of this approach in Saccharomyces cerevisiae has resulted in a comprehensive picture of the core proteome of a simple, well-studied model system. Extending these studies to more complex, multicellular model organisms would allow us to place protein function onto a four-dimensional space-time map and would improve our understanding of the complex processes of development and differentiation. This will require efficient protein tagging methods and new high-performance tags. Here we present a generic protein tagging approach for the model nematode Caenorhabditis elegans. The method is based on recombination-mediated DNA engineering of genomic BAC clones into tagged transgenes for integrative transformation. C. elegans offers unique advantages for function discovery through protein tagging: a compact and well-annotated genome, combined with a simple and well-understood anatomy and pattern of development. However, the methods for protein tagging in C. elegans have so far been inefficient and largely dependent on artificial cDNA-based constructs, which can lack important regulatory elements. In contrast, our approach combines the advantages of authentic regulation with a new application of recombineering, which is simple, fast and efficient. For the first time we apply liquid culture cloning for multiple recombineering steps. This is particularly important when high-throughput applications are considered, as it offers significant advantages in scale-up and automation.
We show that the BAC-derived transgenes can be used for stable, integrative transformation in C. elegans. We show that the tagged transgene can take over the function of its endogenous counterpart. Using a fluorescent reporter, we reproduce known expression patterns and document new ones. The second part of the thesis describes a project that we undertook to develop improved double affinity cassettes for protein purification. We evaluated the performance of 5 new double tag combinations in vitro and in mammalian culture cells. All of the new cassettes performed well and present a valuable tool for protein interaction studies in higher model systems.
|
10 |
Functional genomic analysis of cell cycle progression in human tissue culture cells
Kittler, Ralf 18 October 2006 (has links)
|