Global ETD Search

21	Gene prediction in metagenomic sequencing reads / Genvorhersage in metagenomischen Sequenzier-Reads Hoff, Katharina Jasmin 08 October 2009 (has links) No description available. 570 Biowissenschaften, Biologie AHJ 300 WU 000 WF 610 Mathematics and Natural Science Metagenomik Genvorhersage Sequenzierfehler Metagenomics gene prediction sequencing errors 42.30 42.13 54.89
22	A Genome-Wide Association Study Suggests Novel Loci Associated with a Schizophrenia-Related Brain-Based Phenotype Hass, Johanna, Walton, Esther, Kirsten, Holger, Liu, Jingyu, Priebe, Lutz, Wolf, Christiane, Karbalai, Nazanin, Gollub, Randy, White, Tonya, Rößner, Veit, Müller, Kathrin U., Paus, Tomas, Smolka, Michael N., Schumann, Gunter, Scholz, Markus, Cichon, Sven, Calhoun, Vince, Ehrlich, Stefan 22 January 2014 (has links) (PDF) Patients with schizophrenia and their siblings typically show subtle changes of brain structures, such as a reduction of hippocampal volume. Hippocampal volume is heritable, may explain a variety of cognitive symptoms of schizophrenia and is thus considered an intermediate phenotype for this mental illness. The aim of our analyses was to identify single-nucleotide polymorphisms (SNP) related to hippocampal volume without making prior assumptions about possible candidate genes. In this study, we combined genetics, imaging and neuropsychological data obtained from the Mind Clinical Imaging Consortium study of schizophrenia (n = 328). A total of 743,591 SNPs were tested for association with hippocampal volume in a genome-wide association study. Gene expression profiles of human hippocampal tissue were investigated for gene regions of significantly associated SNPs. None of the genetic markers reached genome-wide significance. However, six highly correlated SNPs (rs4808611, rs35686037, rs12982178, rs1042178, rs10406920, rs8170) on chromosome 19p13.11, located within or in close proximity to the genes NR2F6, USHBP1, and BABAM1, as well as four SNPs in three other genomic regions (chromosome 1, 2 and 10) had p-values between 6.75 × 10⁻⁶ and 8.3 × 10⁻⁷. Using existing data of a very recently published GWAS of hippocampal volume and additional data of a multicentre study in a large cohort of adolescents of European ancestry, we found supporting evidence for our results. 
Furthermore, allelic differences in rs4808611 and rs8170 were highly associated with differential mRNA expression in the cis-acting region. Associations with memory functioning indicate a possible functional importance of the identified risk variants. Our findings provide new insights into the genetic architecture of a brain structure closely linked to schizophrenia. In silico replication, mRNA expression and cognitive data provide additional support for the relevance of our findings. Identification of causal variants and their functional effects may unveil yet unknown players in the neurodevelopment and the pathogenesis of neuropsychiatric disorders. Schizophrenie genomweite Assoziationsstudie TU Dresden Publikationsfonds Schizophrenia genome-wide association study Gene expression Gene prediction Genetics Hippocampus Memory Phenotypes Technical University Dresden Publication funds ddc:610 rvk:XA 10000
23	Unsupervised and semi-supervised training methods for eukaryotic gene prediction Ter-Hovhannisyan, Vardges 17 November 2008 (has links) This thesis describes new gene finding methods for eukaryotic gene prediction. The current methods for deriving model parameters for gene prediction algorithms are based on curated or experimentally validated sets of genes or gene elements. These training sets often require time and additional expert effort, especially for species that are in the initial stages of genome sequencing. Unsupervised training allows determination of model parameters from anonymous genomic sequence. Its practical applicability is critical given the ever-growing rate of eukaryotic genome sequencing. Three distinct training procedures are developed for diverse groups of eukaryotic species. GeneMark-ES is developed for species with strong donor and acceptor site signals, such as Arabidopsis thaliana, Caenorhabditis elegans and Drosophila melanogaster. The second version of the algorithm, GeneMark-ES-2, introduces an enhanced intron model to better describe the gene structure of fungal species, which possess relatively weak donor and acceptor splice sites and a well-conserved branch point signal. GeneMark-LE, a semi-supervised training approach, is designed for eukaryotic species with a small number of introns. The results indicate that the developed unsupervised training methods perform well compared to other training methods, as estimated from the set of genes supported by EST-to-genome alignments. Analysis of novel genomes reveals interesting biological findings and shows that several candidate under-annotated and over-annotated fungal species are present in the current set of annotated fungal genomes. Hidden Markov models Self-training Gene annotation Genome annotation Viterbi algorithm Unsupervised training Gene prediction Gene finding Eukaryotic cells Genetics Algorithms
24	MYOP/ToPS/SGEval: Um ambiente computacional para estudo sistemático de predição de genes / MYOP/ToPS/SGEval: A computational framework for gene prediction André Yoshiaki Kashiwabara 10 February 2012 (has links) Correctly identifying eukaryotic protein-coding genes in genomic sequences is an open problem. In this work, we implemented a platform with the aim of improving the way that gene predictors are implemented and evaluated. Three new tools were implemented. ToPS (Toolkit of Probabilistic Models of Sequences) was the first object-oriented framework that provides tools for the implementation, manipulation, and combination of probabilistic models that represent sequences of symbols. MYOP (Make Your Own Predictor) facilitates the construction of gene predictors. SGEval (Splicing Graph Evaluation) uses splicing graphs to compare different annotations with alternative splicing events. We used our platform to develop gene finders for eleven distinct genomes: A. thaliana, C. elegans, Z. mays, P. falciparum, D. melanogaster, D. rerio, M. musculus, R. norvegicus, O. sativa, G. max and H. sapiens. With this development, we established a protocol for implementing new gene predictors. In addition, using our platform, we developed a gene prediction pipeline for the sugarcane genome project, which has already been applied to 109 BAC sequences produced by BIOEN (FAPESP Bioenergy Program). Bioinformatica. cadeia de Markov oculta generalizada modelos probabilisticos predicao ab initio de genes ab initio gene prediction bioinformatics. generalized hidden Markov models probabilistic models
25	Genomic and proteomic analysis of drought tolerance in Sorghum (Sorghum bicolor (L.) Moench) Woldesemayat, Adunga,Abdi January 2014 (has links) Philosophiae Doctor - PhD / Drought is a complex phenomenon that remains a historic and ongoing challenge to human welfare. It affects plant productivity by perturbing the pathways that control the plant's normal, functionally intact biological processes. Sorghum (Sorghum bicolor (L.) Moench), a drought-adapted model cereal grass, is a potential target in modern agricultural research towards understanding the molecular and cellular basis of drought tolerance. This study reports genomic and proteomic findings on drought tolerance in sorghum, combining the results of in silico and experimental analysis. A pipeline that includes mapping expression data from 92 normalized cDNAs to genomic loci was used to identify drought-tolerance genes. Integrative analysis was carried out using sequence similarity search, metabolic pathway, gene expression profiling and orthology relation to investigate genes of interest. Gene structure prediction was conducted using a combination of ab initio and extrinsic evidence-driven information, employing multi-criteria sources to improve accuracy. Gene ontology was used to cross-validate and to functionally assign and enrich genes. An integrated approach that combines functional ontology-based semantic data with expression profiling and biological networks was employed to analyse gene association with plant phenotypes and to identify and genetically dissect complex drought tolerance in sorghum. The Gramene database was used to identify genes with direct or indirect association to drought-related ontology terms in sorghum. Where direct associations for sorghum genes were not available, genes were captured using Ensembl BioMart by transitive association based on the putative functions of sorghum orthologs in closely related species. 
Ontology mapping represented a direct or transitive association of genes to multiple drought-related ontology terms based on sorghum-specific genes or orthologs in related species. Correlation of genes to enriched gene ontology (GO) terms (p-value < 0.05) related to whole-plant structure was used to determine the extent of gene-phenotype association across species and environmental stresses. Drought tolerance Gene-trait-association Novel gene prediction Differential expression MALDI-TOF-TOF/Mass-spectrometry Protein identification Proteomics Sorghum bicolor (L.) Moench Functional genomics
26	The Characterization and Utilization of Middle-range Sequence Patterns within the Human Genome Shepard, Samuel Steven 20 May 2010 (has links) No description available. Bioinformatics Markov gene prediction sequence classification abstraction bioinformatics genomics UTR untranslated region machine learning support vector machine homogeneous model non-randomness MRI mid-range inhomogeneity
27	Computational identification of genes: ab initio and comparative approaches Parra Farré, Genís 03 December 2004 (has links) The work presented here studies the recognition of the signals that delimit and define protein-coding genes, as well as their applicability in gene prediction programs. The thesis also explores the use of comparative genomics to improve the identification of genes in different species simultaneously. It also describes the development of two computational gene prediction programs: geneid and sgp2. geneid identifies the genes encoded in an anonymous DNA sequence based on their intrinsic properties (mainly splicing signals and differential codon usage). sgp2 uses the comparison between two genomes, which must be at a certain optimal evolutionary distance, to improve gene prediction, under the hypothesis that coding regions are more conserved than regions that do not code for proteins. / The motivation of this thesis is to give some insight into how genes are encoded and recognized by the cell machinery and to use this information to find genes in unannotated genomic sequences. One of the objectives is the development of tools to identify eukaryotic genes through the modeling and recognition of their intrinsic signals and properties. This thesis addresses another problem: how the sequences of related genomes can contribute to the identification of genes. The value of comparative genomics is illustrated by the sequencing of the mouse genome for the purpose of annotating the human genome. Comparative gene prediction programs exploit these data under the assumption that conserved regions between related species correspond to functional regions (coding genes among them). 
Thus, this thesis also describes a gene prediction program that combines ab initio gene prediction with comparative information between two genomes to improve the accuracy of the predictions. anotación de genomas gene prediction geneid sgp2 y modelos estadísticos bioinformatics genómica comparativa genome annotation comparative genomics splicing signals coding statistics geneid sgp2 and statistical models estadísticos codificantes bioinformática señales de splicing predicción de genes 575
28	Detektion funktioneller RNAs in Genomsequenzen / Detection of functional RNAs in genome sequences Heinemeyer, Isabelle 15 April 2009 (has links) No description available. 570 Biowissenschaften, Biologie Mathematics and Computer Science Bioinformatik funktionelle RNA Genvorhersage Bioinformatics functional RNA gene prediction 54.80 42.20 WD 500: Bioinformatik {Biologie} WJ 000: Genetik {Biologie} WJD 100: Gene {Biologie, Genetik}
29	A Genome-Wide Association Study Suggests Novel Loci Associated with a Schizophrenia-Related Brain-Based Phenotype Hass, Johanna, Walton, Esther, Kirsten, Holger, Liu, Jingyu, Priebe, Lutz, Wolf, Christiane, Karbalai, Nazanin, Gollub, Randy, White, Tonya, Rößner, Veit, Müller, Kathrin U., Paus, Tomas, Smolka, Michael N., Schumann, Gunter, Scholz, Markus, Cichon, Sven, Calhoun, Vince, Ehrlich, Stefan 22 January 2014 (has links) Patients with schizophrenia and their siblings typically show subtle changes of brain structures, such as a reduction of hippocampal volume. Hippocampal volume is heritable, may explain a variety of cognitive symptoms of schizophrenia and is thus considered an intermediate phenotype for this mental illness. The aim of our analyses was to identify single-nucleotide polymorphisms (SNP) related to hippocampal volume without making prior assumptions about possible candidate genes. In this study, we combined genetics, imaging and neuropsychological data obtained from the Mind Clinical Imaging Consortium study of schizophrenia (n = 328). A total of 743,591 SNPs were tested for association with hippocampal volume in a genome-wide association study. Gene expression profiles of human hippocampal tissue were investigated for gene regions of significantly associated SNPs. None of the genetic markers reached genome-wide significance. However, six highly correlated SNPs (rs4808611, rs35686037, rs12982178, rs1042178, rs10406920, rs8170) on chromosome 19p13.11, located within or in close proximity to the genes NR2F6, USHBP1, and BABAM1, as well as four SNPs in three other genomic regions (chromosome 1, 2 and 10) had p-values between 6.75 × 10⁻⁶ and 8.3 × 10⁻⁷. Using existing data of a very recently published GWAS of hippocampal volume and additional data of a multicentre study in a large cohort of adolescents of European ancestry, we found supporting evidence for our results. 
Furthermore, allelic differences in rs4808611 and rs8170 were highly associated with differential mRNA expression in the cis-acting region. Associations with memory functioning indicate a possible functional importance of the identified risk variants. Our findings provide new insights into the genetic architecture of a brain structure closely linked to schizophrenia. In silico replication, mRNA expression and cognitive data provide additional support for the relevance of our findings. Identification of causal variants and their functional effects may unveil yet unknown players in the neurodevelopment and the pathogenesis of neuropsychiatric disorders. info:eu-repo/classification/ddc/610 ddc:610
30	Comparative analysis of eukaryotic gene sequence features Abril Ferrando, Josep Francesc 17 May 2005 (has links) The constantly increasing number of available genome sequences, along with an increasing number of experimental techniques, will help to produce the complete catalog of cellular functions for different organisms, including humans. Such a catalog will define the base from which we will better understand how organisms work at the molecular level. At the same time it will shed light on which changes are associated with disease. Therefore, the raw sequence from genome sequencing projects is worthless without the complete analysis and further annotation of the genomic features that define those functions. This dissertation presents our contribution to three related aspects of gene annotation on eukaryotic genomes. 
First, a comparison at sequence level of the human and mouse genomes was performed by developing a semi-automatic analysis pipeline. The SGP2 gene-finding tool was developed from procedures used in this pipeline. The concept behind SGP2 is that similarity regions obtained by TBLASTX are used to increase the score of exons predicted by geneid, in order to produce a more accurate set of gene structures. SGP2 provides a specificity that is high enough for its predictions to be experimentally verified by RT-PCR. The RT-PCR validation of predicted splice junctions also serves as an example of how combined computational and experimental approaches will yield the best results. Then, we performed a descriptive analysis at sequence level of the splice site signals from a reliable set of orthologous genes for human, mouse, rat and chicken. We have explored the differences at nucleotide sequence level between U2 and U12 sites for the set of orthologous introns derived from those genes. We found that orthologous splice signals between human and rodents and within rodents are more conserved than unrelated splice sites. However, this additional conservation can be explained mostly by background intron conservation. Additional conservation over background is detectable in orthologous mammalian and chicken splice sites. Our results also indicate that the U2 and U12 intron classes have evolved independently since the split of mammals and birds. We found neither a convincing case of interconversion between these two classes in our sets of orthologous introns, nor any single case of switching between AT-AC and GT-AG subtypes within U12 introns. In contrast, switching between GT-AG and GC-AG U2 subtypes does not appear to be unusual. Finally, we implemented visualization tools to integrate annotation features for gene-finding and comparative analyses. One of those tools, gff2ps, was used to draw the whole genome maps for human, fruitfly and mosquito. 
gff2aplot and the accompanying parsers facilitate the task of integrating sequence annotations with the output of homology-based tools, like BLAST. We have also adapted the concept of pictograms to the comparative analysis of orthologous splice sites, by developing compi. amino acid sequences eukaryotic cells cèl·lules seqüències dels aminoàcids genòmica chicken gallus gallus rattus norvegicus rat mus musculus mouse gene prediction RT-PCR validation SGP2 evaluation geneid comparative computational gene finding anopheles gambiae genome map drosophila melanogaster fruitfly mosquito human compi gff2aplot gff2ps feature visualization U12 genome annotation U2 splice sites exonic gene structure genomics bioinformatics 575

Search results