11

Comparative genomics to investigate genome function and adaptations in the newly sequenced Brachyspira hyodysenteriae and Brachyspira pilosicoli

pwanch@msu.ac.th, Phatthanaphong Wanchanthuek January 2009
Brachyspira hyodysenteriae and Brachyspira pilosicoli are anaerobic intestinal spirochaetes that are the aetiological agents of swine dysentery and intestinal spirochaetosis, respectively. As part of this PhD study the genome sequence of B. hyodysenteriae strain WA1 and a near-complete sequence of B. pilosicoli strain 95/1000 were obtained and subjected to comparative genomic analysis. The B. hyodysenteriae genome consisted of a circular 3.0 Mb chromosome and a 35,940 bp circular plasmid that has not previously been described. The incomplete genome of B. pilosicoli contained 4 scaffolds. There were 2,652 and 2,297 predicted ORFs in the B. hyodysenteriae and B. pilosicoli strains, respectively. Of the predicted ORFs, more had similarities to proteins of the enteric Clostridium species than to proteins of other spirochaetes. Many of these genes were associated with transport and metabolism, and they may have been gradually acquired through horizontal gene transfer in the environment of the large intestine. A reconstruction of central metabolic pathways of the Brachyspira species identified a complete set of coding sequences for glycolysis, gluconeogenesis, a non-oxidative pentose phosphate pathway, nucleotide metabolism and a respiratory electron transport chain. A notable finding was the presence of rfb genes on the B. hyodysenteriae plasmid, and their apparent absence from B. pilosicoli. As these genes are involved in rhamnose biosynthesis, it is likely that the composition of the B. hyodysenteriae lipooligosaccharide O-sugars is different from that of B. pilosicoli. O-antigen differences in these related species could be associated with differences in their specific niches, and/or with their disease specificity. Overall, comparison of B. hyodysenteriae and B. pilosicoli protein content and analysis of their central metabolic pathways showed that they have diverged markedly from other spirochaetes in the process of adapting to their habitat in the large intestine.
The presence of overlapping genes in the two spirochaetes and in other spirochaete species was also investigated. The proportion of overlapping genes in the 12 spirochaete genomes examined ranged from 11% to 45%. Of these, 80% were unidirectional. Overlapping genes were irregularly distributed within the Brachyspira genomes, such that 70-80% of them occurred on the same strand (unidirectional, ->->/<-<-), with 16-28% occurring on opposite DNA strands (divergent, <-->). The remaining 4-6% of overlapping genes were convergent (-><-). The majority of the unidirectional overlap regions were relatively short, with >50% of the total observations overlapping by >4 bp. A small number of overlapping gene pairs were duplicated within each genome and there were some triplet overlapping genes. Unique orthologous overlapping genes were identified within the various spirochaete genera. Over 75% of the overlapping genes in the Brachyspira species were in the same or a related metabolic pathway. This finding suggests that overlapping genes are likely to be not only the result of functional constraints but also constrained by their metabolomic context. Of the remaining 25% of overlapping genes, 50% contained one hypothetical gene with unknown function. In addition, in one of the orthologous overlapping genes in the Brachyspira species, a promoter was shared, indicating the presence of a novel class of overlapping gene operon in these intestinal spirochaetes.
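The three overlap classes named in this abstract (unidirectional, divergent, convergent) can be illustrated with a minimal Python sketch. It is not the thesis's pipeline: the `Gene` record, the 1-based inclusive coordinates, and the assumption of a linear (non-circular) coordinate system are all illustrative choices.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    # Hypothetical record type used only for this illustration.
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of shared base pairs between two genes (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def classify_overlap(a: Gene, b: Gene) -> Optional[str]:
    """Classify an overlapping gene pair by strand orientation."""
    if overlap_length(a, b) == 0:
        return None
    # Order the pair by genomic position so orientation can be read left to right.
    left, right = sorted((a, b), key=lambda g: (g.start, g.end))
    if left.strand == right.strand:
        return "unidirectional"  # ->-> or <-<-
    # "+" then "-" reads -><- (convergent); "-" then "+" reads <--> (divergent).
    return "convergent" if left.strand == "+" else "divergent"
```

For example, `classify_overlap(Gene("a", 1, 100, "+"), Gene("b", 97, 200, "+"))` reports a unidirectional pair with a 4 bp overlap. Circular replicons would additionally need wrap-around handling, which this sketch omits.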
12

RNAs Everywhere: genome‐wide annotation of structured RNAs

Backofen, Rolf, Bernhart, Stephan H., Flamm, Christoph, Fried, Claudia, Fritzsch, Guido, Hackermüller, Jörg, Hertel, Jana, Hofacker, Ivo L., Missal, Kristin, Mosig, Axel, Prohaska, Sonja J., Rose, Dominic, Stadler, Peter F., Tanzer, Andrea, Washietl, Stefan, Will, Sebastian 09 November 2018
Starting with the discovery of microRNAs and the advent of genome‐wide transcriptomics, non‐protein‐coding transcripts have moved from a fringe topic to a central field of research in molecular biology. In this contribution we review the state of the art of “computational RNomics”, i.e., the bioinformatics approaches to genome‐wide RNA annotation. Instead of rehashing results from recently published surveys in detail, we focus here on the open problem in the field, namely (functional) annotation of the plethora of putative RNAs. A series of exploratory studies is used to provide non‐trivial examples for the discussion of some of the difficulties.
13

Non-coding RNA annotation of the genome of Trichoplax adhaerens

Hertel, Jana, de Jong, Danielle, Marz, Manja, Rose, Dominic, Tafer, Hakim, Tanzer, Andrea, Schierwater, Bernd, Stadler, Peter F. 04 February 2019
A detailed annotation of non-protein-coding RNAs is typically missing in initial releases of newly sequenced genomes. Here we report on a comprehensive ncRNA annotation of the genome of Trichoplax adhaerens, presumably the most basal metazoan whose genome has been published to date. Since BLAST identified only a small fraction of the best-conserved ncRNAs (in particular rRNAs, tRNAs and some snRNAs), we developed a semi-global dynamic programming tool, GotohScan, to increase the sensitivity of the homology search. It successfully identified the full complement of major and minor spliceosomal snRNAs, the genes for RNase P and MRP RNAs, the SRP RNA, as well as several small nucleolar RNAs. We did not find any microRNA candidates homologous to known eumetazoan sequences. Interestingly, most ncRNAs, including the pol-III transcripts, appear as single-copy genes or with very small copy numbers in the Trichoplax genome.
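To make the term "semi-global" concrete, here is a minimal Python sketch of a semi-global alignment score: the whole query must be aligned, while unaligned target (genome) sequence before and after the hit is free. This is not GotohScan itself. It uses a simple linear gap penalty rather than the affine gap costs of Gotoh's algorithm, and the function name and scoring parameters are assumptions chosen for illustration.

```python
def semiglobal_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best score for aligning all of `query` to any substring of `target`.

    Returns (score, end), where `end` is the number of target characters
    consumed up to the end of the best hit.
    """
    m = len(target)
    # Row 0 is all zeros: the hit may start anywhere in the target for free.
    prev = [0] * (m + 1)
    for i in range(1, len(query) + 1):
        # Column 0: a query prefix aligned against nothing costs one gap per base.
        curr = [prev[0] + gap] + [0] * m
        for j in range(1, m + 1):
            s = match if query[i - 1] == target[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + s,    # align query[i-1] with target[j-1]
                prev[j] + gap,      # query base against a gap
                curr[j - 1] + gap,  # target base against a gap
            )
        prev = curr
    # The hit may end anywhere in the target, so take the best cell of the last row.
    best_end = max(range(m + 1), key=lambda j: prev[j])
    return prev[best_end], best_end
```

Scanning a genome with a known ncRNA as `query` and reporting high-scoring end positions is the general idea. A practical tool would also need traceback, affine gaps, and a score threshold.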
14

Genomika a buněčná biologie oxymonád / Genomics and cell biology of oxymonads

Treitli, Sebastian Cristian January 2019
Oxymonads are a group of poorly studied protists living as intestinal endosymbionts in the gut of insects and vertebrates. In this thesis I focused on the study of phylogeny, genomics and cell biology of oxymonads. Using culture-based approaches, we uncovered the hidden diversity of small oxymonads and described one new genus and six new species. In Monocercomonoides exilis, the only oxymonad with a published genome, we investigated the genome organization using fluorescence in situ hybridization (FISH) against the telomeric regions and single-copy genes. Our results show that the genome is most probably haploid and organized in 6-7 chromosomes. Annotation of the genome revealed that the DNA replication and repair mechanisms in M. exilis are canonical and seem more complete than those of other metamonads whose genomes are available. Although M. exilis lacks any trace of mitochondria, its genome annotation revealed that other cellular systems do not markedly differ from those of other eukaryotes. Our taxon-rich phylogenetic analyses suggested that the genus Monocercomonoides is closely related to the oxymonad Streblomastix strix, which is found exclusively in the gut of termites. Streblomastix strix, as opposed to M. exilis, is highly adapted to harbour bacterial ectosymbionts. Since S. strix...
15

Molecular Morphology: Phylogenetically Informative Characters Derived from Sequence Data

Donath, Alexander 07 July 2011
A fundamental problem in biology is the reconstruction of the relatedness of all (extant) species. Traditionally, systematists employ visually recognizable characters of organisms for classification and evolutionary analysis. Recent developments in molecular and computational biology, however, have led to a wholly different perspective on how to address the problem of inferring relatedness. The discovery of molecules carrying genetic information and the comparison of their primary structure have, in a rather short period of time, revolutionized our understanding of the phylogenetic relationships of many organisms. These novel approaches, however, turned out to suffer from problems similar to those of previous techniques. Moreover, they created new ones. Hence, taxonomists came to realize that even with this new type of data not all problematic relationships could be unambiguously resolved. The search for complementary approaches has led to the utilization of rare genomic changes and other characters which are largely independent of the primary structure of the underlying sequence(s). These “higher order” characters are thought to be evolutionarily conserved in certain lineages and largely unaffected by primary sequence data-based problems, allowing for a better resolution of the Tree of Life. The central aim of this thesis is the utilization of molecular characters of higher order in connection with their consistent and comparable extraction from a given data set. Two novel methods are presented that allow such an inference. This is complemented with the search for and analysis of known and novel molecular characteristics to study the relationships among Metazoa at both the intra- and interspecific levels. The first method tackles a common problem in phylogenetic analyses: the inference of a reliable data set. As part of this thesis a pipeline was created for the automated annotation of metazoan mitochondrial genomes. 
Data thus obtained constitutes a reliable and standardized starting point for all downstream analyses, e.g. genome rearrangement studies. The second method utilizes a subclass of gaps, namely those which define an approximate split of a given data set. The definition and inference of such split-inducing indels (splids) is based on two basic principles. First, indels at the same position, i.e. sharing the same end points in two sequences, are likely homologous. Second, independent single-residue insertions and deletions tend to occur more frequently than multi-residue indels. It is shown that trees based on splids recover most of the undisputed monophyletic groups while influence of the underlying alignment algorithm is relatively small. Mitochondrial markers are a valuable tool for the understanding of small and large scale population structure. The non-coding control region of mitochondrial DNA (mtDNA) often contains a higher amount of variability compared to genes encoding proteins and non-coding RNAs. A case study on a small scale population structure investigates the control region of the European Fire-bellied Toad in order to find highly variable parts which are of potential importance to develop informative genetic markers. A particular focus is placed on the investigation of the evolutionary dynamics of the repetitive region at an inter- and intraspecific level. This includes understanding mechanisms underlying its evolution, i.e. by exploring the impact of secondary structure on slipped strand mispairing during mtDNA replication. The 7SK RNA is a key player in the regulation of polymerase II (Pol-II) transcription, interacting with at least three known proteins: It mediates the inhibition of the Positive Transcription Elongation Factor b (P-TEFb) by the HEXIM1/2 proteins, thereby repressing transcript elongation by Pol-II. A highly specific interaction with LARP7 (La-Related Protein 7), on the other hand, regulates its stability. 
7SK RNA is capped at its 5’ end by a highly specific methyltransferase MePCE (Methylphosphate Capping Enzyme). Employing sequence and structure similarity it is shown that the 7SK RNA as well as its protein binding partners have a much earlier evolutionary origin than previously expected. Furthermore, this study presents a good illustration of the pitfalls of using markers of higher order for phylogenetic inference.
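The first splid principle stated in this abstract (indels sharing the same end points in two sequences are likely homologous) can be sketched in Python as follows. This is an illustrative reading of the abstract, not the thesis's implementation: the function names, the dictionary-based alignment format, and the requirement that each side of a split contain at least two sequences are assumptions.

```python
from collections import defaultdict


def gap_runs(row):
    """Yield (start, end) half-open column ranges of maximal gap runs in one aligned row."""
    start = None
    for col, ch in enumerate(row):
        if ch == "-":
            if start is None:
                start = col
        elif start is not None:
            yield (start, col)
            start = None
    if start is not None:
        yield (start, len(row))


def candidate_splits(alignment):
    """Map each shared gap (identical end points) to the set of sequences carrying it.

    `alignment` maps sequence names to equal-length aligned strings.
    Assumption: a gap is only kept if at least two sequences carry it
    and at least two do not, so that it separates the data set into two groups.
    """
    carriers = defaultdict(set)
    for name, row in alignment.items():
        for run in gap_runs(row):
            carriers[run].add(name)
    n = len(alignment)
    return {run: names for run, names in carriers.items() if 2 <= len(names) <= n - 2}
```

Each returned entry partitions the sequences into carriers and non-carriers of the gap. Leading and trailing gaps, which often reflect incomplete sequences rather than indels, are not treated specially here.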
16

Evaluating in silico enhancer prediction for non-traditional model organisms through a cross species reporter assay

Tieke, Ellen Claire 19 April 2023
No description available.
17

Improving algorithms of gene prediction in prokaryotic genomes, metagenomes, and eukaryotic transcriptomes

Tang, Shiyuyun 27 May 2016
Next-generation sequencing has generated an enormous amount of DNA and RNA sequences that potentially carry volumes of genetic information, e.g. protein-coding genes. The thesis is divided into three main parts describing i) GeneMarkS-2, ii) GeneMarkS-T, and iii) MetaGeneTack. In prokaryotic genomes, ab initio gene finders can predict genes with high accuracy. However, the error rate is not negligible and is largely species-specific. Most errors in gene prediction are made in genes located in genomic regions with atypical GC composition, e.g. genes in pathogenicity islands. We describe a new algorithm, GeneMarkS-2, that uses local GC-specific heuristic models for scoring individual ORFs in the first step of analysis. Predicted atypical genes are retained and serve as ‘external’ evidence in subsequent runs of self-training. GeneMarkS-2 also controls the quality of the training process by effectively selecting optimal orders of the Markov chain models as well as duration parameters in the hidden semi-Markov model. GeneMarkS-2 has shown significantly improved accuracy compared with other state-of-the-art gene prediction tools. Massively parallel sequencing of RNA transcripts by next-generation technology (RNA-Seq) provides a large number of RNA reads that can be assembled into a full transcriptome. We have developed a new tool, GeneMarkS-T, for ab initio identification of protein-coding regions in RNA transcripts. Unsupervised estimation of the algorithm's parameters makes several steps in conventional gene prediction protocols unnecessary, most importantly the manually curated preparation of training sets. We have demonstrated that GeneMarkS-T self-training is robust with respect to the presence of errors in assembled transcripts, and that the accuracy of GeneMarkS-T in identifying protein-coding regions and, particularly, in predicting gene starts compares favorably to other existing methods. 
Frameshift (FS) prediction is important for analysis and biological interpretation of metagenomic sequences. Reads in metagenomic samples are prone to sequencing errors. Insertion and deletion errors that change the coding frame impair the accurate identification of protein-coding genes. Accurate frameshift prediction requires a sufficient amount of data to estimate parameters of species-specific statistical models of protein-coding and non-coding regions. However, this data is not available; all we have is metagenomic sequences of unknown origin. The challenge of ab initio FS detection is, therefore, twofold: (i) to find a way to infer necessary model parameters and (ii) to identify positions of frameshifts (if any). We describe a new tool, MetaGeneTack, which uses a heuristic method to estimate parameters of sequence models used in the FS detection algorithm. It was shown on several test sets that the performance of MetaGeneTack FS detection is comparable to or better than that of the earlier program FragGeneScan.
18

Décodage de l'expression de gènes cryptiques / Decoding the expression of cryptic genes

Moreira, Sandrine 08 1900
Pour certaines espèces, les nouvelles technologies de séquençage à haut débit et les pipelines automatiques d'annotation permettent actuellement de passer du tube Eppendorf au fichier genbank en un clic de souris, ou presque. D'autres organismes, en revanche, résistent farouchement au bio-informaticien le plus acharné en leur opposant une complexité génomique confondante. Les diplonémides en font partie. Ma thèse est centrée sur la découverte de nouvelles stratégies d'encryptage de l'information génétique chez ces eucaryotes, et l'identification des processus moléculaires de décodage. Les diplonémides sont des protistes marins qui prospèrent à travers tous les océans de la planète. Ils se distinguent par une diversité d'espèces riche et inattendue. Mais la caractéristique la plus fascinante de ce groupe est leur génome mitochondrial en morceaux dont les gènes sont encryptés. Ils sont décodés au niveau ARN par trois processus: (i) l'épissage en trans, (ii) l'édition par polyuridylation à la jonction des fragments de gènes, et (iii) l'édition par substitution de A-vers-I et C-vers-T; une diversité de processus posttranscriptionnels exceptionnelle dans les mitochondries. Par des méthodes bio-informatiques, j'ai reconstitué complètement le transcriptome mitochondrial à partir de données de séquences ARN à haut débit. Nous avons ainsi découvert six nouveaux gènes dont l'un présente des isoformes par épissage alternatif en trans, 216 positions éditées par polyuridylation sur 14 gènes (jusqu'à 29 uridines par position) et 114 positions éditées par déamination de A-vers-I et C-vers-T sur sept gènes (nad4, nad7, rns, y1, y2, y3, y5). Afin d'identifier les composants de la machinerie réalisant la maturation des ARNs mitochondriaux, le génome nucléaire a été séquencé, puis je l'ai assemblé et annoté. Cette machinerie est probablement singulière et complexe car aucun signal en cis ni acteur en trans caractéristiques des machineries d'épissage connues n'a été trouvé. 
J'ai identifié plusieurs candidats prometteurs qui devront être validés expérimentalement: des ARN ligases, un nombre important de protéines de la famille des PPR impliquées dans l'édition des ARNs dans les organites de plantes, ainsi que plusieurs déaminases. Durant ma thèse, nous avons mis en évidence de nouveaux types de maturation posttranscriptionnelle des ARNs dans la mitochondrie des diplonémides et identifié des candidats prometteurs de la machinerie. Ces composants, capables de lier précisément des fragments d'ARN et de les éditer pourraient trouver des applications biotechnologique. Au niveau évolutif, la caractérisation de nouvelles excentricités moléculaires de ce type nous donne une idée des processus de recrutement de gènes, de leur adaptation à de nouvelles fonctions, et de la mise en place de machineries moléculaires complexes. / Thanks to new high throughput sequencing technologies and automatic annotation pipelines, proceeding from an eppendorf tube to a genbank file can be achieved in a single mouse click or so, for some species. Others, however, fiercely resist bioinformaticians with their confounding genomic complexity. Diplonemids are one of them. My thesis is centered on the discovery of new strategies for encrypting genetic information in eukaryotes, and the identification of molecular decoding processes. Diplonemids are a group of poorly studied marine protists. Unexpectedly, metagenomic studies have recently ranked this group as one of the most diverse in the oceans. Yet, their most distinctive feature is their multipartite mitochondrial genome with genes in pieces, and encryption by nucleotide deletions and substitutions. Genes are decrypted at the RNA level through three processes: (i) trans-splicing, (ii) polyuridylation at the junction of gene pieces and (iii) substitutions of A-to-I and C-to-T. Such a diverse arsenal of mitochondrial post-transcriptional processes is highly exceptional. 
Using a bioinformatics approach, I have reconstructed the mitochondrial transcriptome from RNA-seq libraries. We have identified six new genes, including one that presents alternative trans-splicing isoforms. In total, there are 216 uridines added in 14 genes with up to 29 U insertions, and 114 positions edited by deamination (A-to-I or C-to-T) among seven genes (nad4, nad7, rns, y1, y2, y3, y5). In order to identify the machinery that processes mitochondrial RNAs, the nuclear genome has been sequenced. I have then assembled and annotated the genome. This machinery is probably unique and complex because no cis signal or trans actor typical of known splicing machineries has been found. I have identified promising protein candidates that are worth testing experimentally, notably RNA ligases, numerous members of the PPR family, which is involved in plant RNA editing, and deaminases. During my thesis, we have identified new types of post-transcriptional RNA processing in diplonemid mitochondria and identified promising new candidates for the machinery. A system capable of precisely joining or editing RNAs could find biotechnological applications. From an evolutionary perspective, the discovery of new molecular systems gives insight into the process of gene recruitment, adaptation to new functions and establishment of complex molecular machineries.
19

Análise genômica de Streptomyces olindensis DAUFPE 5622 e de suas vias crípticas para a obtenção de novos metabólicos secundários de interesse biotecnológico. / Analysis of Streptomyces olindensis DAUFPE 5622 genome and its cryptic pathways to obtain new secondary metabolites of biotechnological interest.

Torres, Maria Alejandra Ferreira 08 December 2015
Os compostos de origem microbiana tem readquirido interesse pela biodisponibilidade, especificidade de alvo e diversidade química, mas as vias biosintéticas permanecem crípticas em condições de cultura. Uma estratégia para expressa-las é a super-expressão de genes ativadores. O laboratório de Bio-Produtos no ICB na USP tem trabalhado com Streptomyces olindensis produtor da Cosmomicina D uma molécula com atividade antitumoral de interesse devido ao padrão de glicosilação. O genoma de S. olindensis foi sequenciado e submetido ao NCBI (JJOH00000000) e utilizando o software antiSMASH foram identificados 33 clusters envolvidos na produção de metabolitos secundários. Encontraram-se clusters gênicos para a produção de metabolitos como Melanina, Geosmina, entre outros. Além, foi realizada uma analise de genômica comparativa para caracterizar e anotar as 22 vias biossintéticas desconhecidas em S. olindensis. Finalmente, escolheram-se a via do aminociclitol e um Policetídeo Tipo I para a super-expressão de genes reguladores levando a detecção do composto sob condições de cultura. / Microbial metabolites have regained interest due to their bioavailability, target specificity and chemical diversity, but the biosynthetic pathways remain silent under culture conditions. A strategy to express them is the overexpression of regulatory genes. The Bio-Products laboratory at USP has been working with Streptomyces olindensis, producer of Cosmomycin D, an antitumoral molecule with a distinctive glycosylation pattern. The S. olindensis genome was sequenced and submitted to NCBI (JJOH00000000), and using the antiSMASH server, 33 secondary-metabolite-related clusters were identified. Known pathways were found, such as genes for the production of melanin, geosmin and others. Additionally, a comparative genomic approach was used to characterize the 22 unknown biosynthetic pathways described in S. olindensis. 
Subsequently, the aminocyclitol and type I polyketide pathways were chosen for evaluation by overexpressing their regulatory genes, leading to detection of the compound under regular culture conditions.
20

Evolutionary Analysis of the Protein Domain Distribution in Eukaryotes

Parikesit, Arli Aditya 11 December 2012
Investigations into the origin and evolution of regulatory mechanisms require quantitative estimates of the abundance and co-occurrence of functional protein domains among distantly related genomes. The metabolic and regulatory capabilities of an organism are implicit in its protein content. Currently available methods suffer from strong ascertainment biases, so unbiased approaches to protein domain content at genome-wide scales are required. The discussion highlights large-scale patterns of similarities and differences in domain content between phylum-level or even higher-level taxonomic groups. This provides insights into large-scale evolutionary trends. The complement of recognizable functional protein domains and their combinations conveys essentially the same information and at the same time is much more readily accessible, although protein domain models trained for one phylogenetic group frequently fail on distantly related sequences. Transcription factors (TFs) typically cooperate to activate or repress the expression of genes. They play a critical role in developmental processes. Chromatin regulation (CR), in turn, facilitates DNA organization and prevents DNA aggregation and tangling, which is important for replication, segregation, and gene expression. Comparing the sets of TFs and CRs between species requires genome annotations of equal quality. However, the existing annotation is biased toward model organisms. Transcript counts are expected to be similar across mammals, but a model organism such as human has more annotated transcripts than a non-model organism such as gorilla. Moreover, closely related species (e.g., dolphin and human) show dramatically different distributions of TFs and CRs. Within vertebrates, this is unreasonable and contradicts phylogenetic knowledge. To overcome this problem, gene prediction followed by the detection of functional domains via HMM-based annotation of SCOP domains was proposed. 
This method was demonstrated to lead to consistent estimates for quantitative comparison. To emphasize its applicability, the protein domain distribution of putative TFs and CRs was analyzed by quantitative and Boolean means. In particular, systematic studies of protein domain occurrences and co-occurrences were used to study avoidance or preferential co-occurrence of certain protein domains within TFs and CRs. Pooling related domain models based on their GO annotation in combination with de novo gene prediction methods provides estimates that seem to be less affected by phylogenetic biases. It was shown for 18 diverse representatives from all eukaryotic kingdoms that a pooled analysis of the tendencies for co-occurrence or avoidance of protein domains is indeed feasible. This type of analysis can reveal general large-scale patterns in domain co-occurrence and helps to identify lineage-specific variations in the evolution of protein domains. Somewhat surprisingly, strong ubiquitous patterns governing the evolutionary behavior of specific functional classes were not found. Instead, there are strong variations between the major groups of Eukaryotes, pointing at systematic differences in their evolutionary constraints. Species-specific training is required, however, to account for the genomic peculiarities in many lineages. In contrast to earlier studies, widespread statistically significant avoidance of protein domains associated with distinct functional high-level gene-ontology terms was found.
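The idea of testing for preferential co-occurrence or avoidance of two protein domains can be sketched in Python by comparing the observed number of proteins carrying both domains with the number expected if the domains were distributed independently. This is a simplified illustration rather than the thesis's statistical procedure. The function name, the set-based protein representation, and the domain labels in the usage note are all hypothetical.

```python
def cooccurrence(proteins, dom_a, dom_b):
    """Return (observed, expected) counts of proteins carrying both domains.

    `proteins` is a list of sets of domain labels, one set per protein.
    `expected` assumes the two domains occur independently of each other.
    """
    n = len(proteins)
    n_a = sum(dom_a in p for p in proteins)
    n_b = sum(dom_b in p for p in proteins)
    n_ab = sum(dom_a in p and dom_b in p for p in proteins)
    expected = n_a * n_b / n if n else 0.0
    return n_ab, expected
```

With `proteins = [{"zf", "KRAB"}, {"zf", "KRAB"}, {"zf"}, {"HMG"}]`, the pair `("zf", "KRAB")` is observed twice against 1.5 expected (preferential co-occurrence), while `("zf", "HMG")` is observed zero times against 0.75 expected (avoidance). Judging significance would require a proper test, such as Fisher's exact test on the 2x2 table, and correction for multiple comparisons.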
