  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
401

Análise do transcriptoma e trocas gasosas em plantas de arroz sob estresses abióticos / Transcriptome analysis and gas exchange in rice plants under abiotic stresses

Amaral, Marcelo Nogueira do 04 March 2015
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Rice (Oryza sativa L.) is the second most cultivated cereal in the world, and Brazil is the ninth-largest producer. Despite good productivity levels, current output is believed to be insufficient to meet global demand, so production must increase. However, several abiotic stresses, such as salinity, iron toxicity and low temperature, limit rice productivity worldwide. The response of plants to stressful conditions is a complex phenomenon involving morphological, physiological, biochemical and molecular changes. Knowledge of the transcriptome, combined with photosynthetic analysis of rice plants subjected to these stresses, can therefore help to elucidate which metabolic pathways are altered and what the main biochemical and physiological responses of the plants are under such conditions. The objective of this study was to analyze differentially expressed genes (DEGs) using RNA-Seq and to quantify gas exchange in leaves of rice (cv. BRS Querência) at the V3 stage subjected to cold, iron and salt stress for 24 hours. Between 41 and 51 million reads were submitted to alignment, of which 88.47 to 89.21% mapped to the reference genome. A total of 5,506 DEGs were observed under cold stress, 1,808 under salt stress and 630 under iron stress; 330 DEGs were common to all three stresses. Functional annotation with the MapMan software showed that cold stress caused the largest changes in metabolism overall. Salt stress produced a more complex interaction network of over-represented Gene Ontology (GO) terms than the other stresses. Among the gas exchange parameters, net assimilation rate was the only one that differed significantly, with cold stress showing the lowest mean. From these results, it was possible to conclude that low-temperature stress yields the largest number of DEGs and that salt and iron stress are more closely related to each other. In addition, cold stress affects photosynthesis most drastically, at both the molecular and physiological levels, and, despite reductions in net assimilation rate, cv. BRS Querência showed smaller changes under salt and excess-iron stress.
402

De novo algorithms to identify patterns associated with biological events in de Bruijn graphs built from NGS data / Algorithmes de novo pour l'identification de motifs associés à des événements biologiques dans les graphes de De Bruijn construits à partir de données NGS

Ishi Soares de Lima, Leandro 23 April 2019
The main goal of this thesis is the development, improvement and evaluation of methods to process massive sequencing data, mainly short and long RNA-sequencing reads, to eventually help the community answer biological questions, especially in the contexts of transcriptomics and alternative splicing. Our initial objective was to develop methods to process second-generation RNA-seq data through de Bruijn graphs to contribute to the literature on alternative splicing, which was explored in the first three works. The first paper (Chapter 3, paper [77]) explored the problems that repeats cause for transcriptome assemblers if not addressed properly. We showed that the sensitivity and the precision of our local alternative splicing assembler increased significantly when repeats were formally modeled. The second (Chapter 4, paper [11]) shows that annotating alternative splicing events with a single approach misses a large number of candidates, many of which are significant. Thus, to comprehensively explore the alternative splicing events in a sample, we advocate the combined use of mapping-first and assembly-first approaches. Because de Bruijn graphs built from real RNA-seq data contain a huge number of bubbles, too many to analyse in practice, in the third work (Chapter 5, papers [1, 2]) we explored theoretically how to represent the bubble space efficiently and compactly through a bubble generator. Exploring and analysing the bubbles in the generator is feasible in practice and can complement state-of-the-art algorithms that analyse a subset of the bubble space.
Collaborations and advances in sequencing technology encouraged us to work in other subareas of bioinformatics, such as genome-wide association studies, error correction and hybrid assembly. Our fourth work (Chapter 6, paper [48]) describes an efficient method to find and interpret unitigs highly associated with a phenotype, especially antibiotic resistance, making genome-wide association studies more amenable to bacterial panels, especially those containing plastic bacteria. In our fifth work (Chapter 7, paper [76]), we evaluate the extent to which existing long-read DNA error correction methods are capable of correcting high-error-rate RNA-seq long reads. We conclude that no tool outperforms all the others across all metrics or is the best suited to all situations, and that the choice should be guided by the downstream analysis. RNA-seq long reads provide a new perspective on how to analyse transcriptomic data, since they are able to describe the full-length sequences of mRNAs, which in several cases was not possible with short reads, even using state-of-the-art transcriptome assemblers. As such, in our last work (Chapter 8, paper [75]) we explore a hybrid alternative splicing assembly method that uses both short and long reads in order to list alternative splicing events comprehensively, thanks to the short reads, guided by the full-length context provided by the long reads.
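
To make the de Bruijn graph terminology in this abstract concrete, the following is a minimal sketch, not the thesis's implementation. It builds a graph whose nodes are (k-1)-mers and whose edges are the k-mers seen in the reads, then lists nodes with more than one successor, which is where a bubble can open. The function names, the two reads and the choice k=4 are all hypothetical.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # An edge joins the k-mer's prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def branching_nodes(graph):
    """Nodes with more than one successor: candidate openings of a bubble."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)

# Hypothetical reads that differ at a single position.
reads = ["ACGTTGCA", "ACGTAGCA"]
graph = build_de_bruijn(reads, k=4)
print(branching_nodes(graph))  # the two reads diverge after node "CGT"
```

In this toy input the two paths leaving "CGT" meet again at "GCA", which is the pattern the abstract calls a bubble. Real RNA-seq graphs also need reverse complements, error filtering and compaction, none of which is shown here.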
403

Resistance mechanisms to Didymascella thujina (Durand) Maire in Thuja plicata Donn ex D. Don, Thuja standishii (Gord.) Carrière and Thuja standishii x plicata

Aldana, Juan Andres 11 September 2018
Plants and microorganisms interact with each other constantly, with some interactions being mutually beneficial and others being detrimental to the plants. The features of the organisms involved in such interactions will determine the characteristics of individual pathosystems. Plants respond readily to pathogen attacks, regardless of the pathosystem; furthermore, variation in the resistance to pathogens within species is common and well documented in many plant species. The variability in pathogen resistance is at the core of genetic improvement programs for disease resistance. True resistance to pathogens in plants is a genetically determined and complex trait that can involve both constitutive and induced mechanisms at different levels of organization. The complexity of this phenomenon makes the study of compatible plant-pathogen interactions challenging, and typically, disease resistance studies focus on specific aspects of a pathosystem, such as field resistance, anatomical or physiological features of resistant plants, or molecular mechanisms of resistance. The Thuja sp. - Didymascella thujina (E.J. Durand) Maire interaction is an important pathosystem in western North America, which has been studied for more than five decades. Western redcedar (Thuja plicata Donn ex D. Don) is very susceptible to cedar leaf blight (D. thujina), a biotroph that affects the tree at all stages, although seedlings are the most sensitive to the pathogen. The characteristics of the Thuja sp. - D. thujina interaction, the wealth of information on the pathosystem and the excellent Thuja sp. genetic resources available from the British Columbia Ministry of Forests, Lands, Natural Resource Operations and Rural Development make this interaction an ideal system to advance the study of disease resistance mechanisms in conifers.
This doctoral project presents a comprehensive investigation of the constitutive and induced resistance mechanisms against D. thujina in T. plicata, Thuja standishii (Gord.) Carrière and a Thuja standishii x plicata hybrid at the phenotypic and gene expression levels, undertaken with the objective of exploring the resistance mechanisms against the biotroph in these conifers. The project also aimed to establish base knowledge for the future development of markers for marker-assisted breeding of T. plicata. The investigations included a combination of histological, chemical and next-generation sequencing (NGS) methodologies. NGS data were analyzed, in addition to the traditional clustering analyses, with cutting-edge machine learning methods, including grade of membership analysis, dynamic topic modelling and stability selection analysis. The studies were progressively more controlled to narrow the focus on the resistance mechanisms to D. thujina in Thuja sp. Histological characteristics related to D. thujina resistance in Thuja sp. were studied first, along with the relationship between climate of origin and disease resistance. The virulence of D. thujina was also documented early in this project. Chemical and gene expression constitutive and induced responses to D. thujina infection in T. plicata seedlings were studied next. T. plicata clonal lines were then comprehensively studied to shed light on the mechanisms behind known physiologically determined resistance. A holistic investigation of the resistance mechanisms to D. thujina in T. standishii, T. plicata and a T. standishii x plicata hybrid explored the possibility of a gene-for-gene resistance model.
Thirty-five T. plicata families were screened during the four field seasons carried out between 2012 and 2015, totalling more than 1,400 seedlings scored for D. thujina severity. Thirteen of those families were used in the five studies performed during the program, along with two T. plicata seedling lines self-pollinated for five generations and three T. plicata clonal lines. One T. standishii clonal line and one T. standishii x plicata clone were also investigated during the program. A total of 16 histological and anatomical characteristics were studied in more than 750 samples, and more than 270 foliar samples were analyzed for 60 chemical and nutritional compounds. Almost one million transcriptomic sequences in four individually assembled reference transcriptomes were examined during the program.
The results of the project support the variability in the resistance to D. thujina in T. plicata, as well as the higher resistance to the pathogen in plants originating from cooler and wetter environments. The data collected also depicted the existence of age-related resistance in T. plicata, and confirmed the full resistance to the disease in T. standishii. Western redcedar plants resistant and susceptible to D. thujina showed constitutive differences at the phenotypic and gene expression levels. Resistant T. plicata seedlings had thicker cuticles, constitutively higher concentrations of sabinene and alpha-thujene, and higher levels of expression of NBS-LRR disease resistance proteins. Resistant clones of T. plicata and T. standishii had higher expression levels of bark storage proteins and of dirigent proteins. Plants of all ages, species and resistance classes studied that were infected with D. thujina showed the accumulation of aluminum in the foliage, and increased levels of sequences involved in cell wall reinforcement. Additional responses to D. thujina infection in T. plicata seedlings included the downregulation of some secondary metabolic pathways, whereas pathogenesis-related proteins were upregulated in clonal lines of T. plicata. The comprehensive approach used here to study the Thuja sp. - D. thujina pathosystem could be applied to other compatible plant-pathogen interactions. / Graduate / 2020-08-31
404

Pathway-centric approaches to the analysis of high-throughput genomics data

Hänzelmann, Sonja, 1981- 11 October 2012
In the last decade, molecular biology has expanded from a reductionist view to a systems-wide view that tries to unravel the complex interactions of cellular components. Owing to the emergence of high-throughput technology, it is now possible to interrogate entire genomes at an unprecedented resolution. The size and unstructured nature of these data have made it evident that new methodologies and tools are needed to turn data into biological knowledge. To contribute to this challenge, we exploited the wealth of publicly available high-throughput genomics data and developed bioinformatics methodologies focused on extracting information at the pathway level rather than the single-gene level. First, we developed Gene Set Variation Analysis (GSVA), a method that facilitates the organization and condensation of gene expression profiles into gene sets. GSVA enables pathway-centric downstream analyses of microarray and RNA-seq gene expression data. The method estimates sample-wise pathway variation over a population and allows for the integration of heterogeneous biological data sources with pathway-level expression measurements. To illustrate the features of GSVA, we applied it to several use cases employing different data types and addressing different biological questions. GSVA is made available as an R package within the Bioconductor project.
Secondly, we developed a pathway-centric, genome-based strategy to reposition drugs in type 2 diabetes (T2D). This strategy consists of two steps: first, a regulatory network is constructed and used to identify disease-driving modules; then, these modules are searched for compounds that might target them. Our strategy is motivated by the observation that disease genes tend to group together in the same neighborhood, forming disease modules, and that multiple genes might have to be targeted simultaneously to attain an effect on the pathophenotype. To find potential compounds, we used genomics data from compound-exposed samples deposited in public databases. We collected about 20,000 samples that have been exposed to about 1,800 compounds. Gene expression can be seen as an intermediate phenotype reflecting the dysregulated pathways underlying a disease. Hence, genes contained in the disease modules that elicit similar transcriptional responses upon compound exposure are assumed to have a potential therapeutic effect. We applied the strategy to gene expression data of human islets from diabetic and healthy individuals and identified four potential compounds (methimazole, pantoprazole, bitter orange extract and torcetrapib) that might have a positive effect on insulin secretion. This is the first time a regulatory network of human islets has been used to reposition compounds for T2D. In conclusion, this thesis contributes two pathway-centric approaches to important bioinformatics problems, namely the assessment of biological function and in silico drug repositioning. These contributions demonstrate the central role of pathway-based analyses in interpreting high-throughput genomics data.
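
The idea of condensing a gene-by-sample expression matrix into a gene-set-by-sample matrix can be illustrated with a deliberately simple stand-in. The sketch below is not the GSVA statistic and does not use the Bioconductor package: it standardizes each gene across samples and averages the z-scores of the genes in each set, only to show the shape of the input and output. The function name, gene names and values are all hypothetical.

```python
from statistics import mean, pstdev

def gene_set_scores(expression, gene_sets):
    """Average per-gene z-scores within each gene set, one score per sample.

    expression: {gene: [value for each sample]}
    gene_sets:  {set_name: [gene, ...]}
    Illustrative only; this is NOT the statistic computed by GSVA.
    """
    zscores = {}
    for gene, values in expression.items():
        mu, sigma = mean(values), pstdev(values)
        # A gene with no variation carries no signal: score it as zero.
        zscores[gene] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    n_samples = len(next(iter(expression.values())))
    scores = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in zscores]
        if not present:
            continue  # skip sets with no measured genes
        scores[name] = [mean(zscores[g][s] for g in present)
                        for s in range(n_samples)]
    return scores

# Hypothetical data: three genes measured in two samples.
expression = {"G1": [1.0, 3.0], "G2": [2.0, 6.0], "G3": [5.0, 5.0]}
gene_sets = {"pathway_A": ["G1", "G2"], "pathway_B": ["G3"]}
print(gene_set_scores(expression, gene_sets))
```

The output has one row per gene set and one column per sample, so any downstream method that expects a gene-level matrix can be run at the pathway level instead.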
405

Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data

Le, Hai-Son Phuoc 01 May 2013
Advances in genomics allow researchers to measure the complete set of transcripts in cells. These transcripts include messenger RNAs (which encode proteins) and microRNAs, short RNAs that play an important regulatory role in cellular networks. While these data are a great resource for reconstructing the activity of networks in cells, they also present several computational challenges. These challenges include the data collection stage, which often results in incomplete and noisy measurements; developing methods to integrate several experiments within and across species; and designing methods that can use these data to map the interactions and networks that are activated in specific conditions. Novel and efficient algorithms are required to successfully address these challenges. In this thesis, we present probabilistic models to address the set of challenges associated with expression data. First, we present a novel probabilistic error correction method for RNA-Seq reads. RNA-Seq generates large and comprehensive datasets that have revolutionized our ability to accurately recover the set of transcripts in cells. However, sequencing reads inevitably contain errors, which affect all downstream analyses. To address these problems, we develop an efficient hidden Markov model-based error correction method for RNA-Seq data. Second, for the analysis of expression data across species, we develop clustering and distance function learning methods for querying large expression databases. The methods use a Dirichlet Process Mixture Model with latent matchings and infer soft assignments between genes in two species to allow comparison and clustering across species. Third, we introduce new probabilistic models to integrate expression and interaction data in order to predict targets and networks regulated by microRNAs. Combined, the methods developed in this thesis provide a solution to the pipeline of expression analysis used by experimentalists when performing expression experiments.
406

Efficient algorithms for de novo assembly of alternative splicing events from RNA-seq data

Tominaga Sacomoto, Gustavo Akio 06 March 2014
In this thesis, we address the problem of identifying and quantifying variants (alternative splicing and genomic polymorphism) in RNA-seq data when no reference genome is available, without assembling the full transcripts. Based on the idea that each variant corresponds to a recognizable pattern, a bubble, in a de Bruijn graph constructed from the RNA-seq reads, we propose a general model for all variants in such graphs. We then introduce an exact method, called KisSplice, to extract alternative splicing events and show that it outperforms general-purpose transcriptome assemblers. We put extra effort into making KisSplice as scalable as possible. To improve the running time, we propose a new polynomial delay algorithm to enumerate bubbles. We show that it is several orders of magnitude faster than previous approaches. To reduce its memory consumption, we propose a new compact way to build and represent a de Bruijn graph. We show that our approach uses 30% to 40% less memory than the state of the art, with an insignificant impact on the construction time. Additionally, we apply the techniques developed to list bubbles to two classical problems: cycle enumeration and the K-shortest paths problem. We give the first optimal algorithm to list cycles in undirected graphs, improving over Johnson's algorithm. This is the first improvement to this problem in almost 40 years. We then consider a different parameterization of the K-shortest (simple) paths problem: instead of bounding the number of st-paths, we bound the weight of the st-paths. We present new algorithms using exponentially less memory than previous approaches.
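
The notion of a bubble used in this abstract, two paths that leave a common source and rejoin at a common sink without sharing any node in between, can be sketched with a brute-force search. This is a naive illustration under stated assumptions, not KisSplice's polynomial delay algorithm: it enumerates all simple paths up to a hypothetical length bound `max_len` and can be exponential on real graphs. The graph, node labels and function names are invented for the example.

```python
from itertools import combinations

def simple_paths(graph, source, max_len):
    """Yield every simple path from source that uses at most max_len edges."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            yield path
        if len(path) - 1 < max_len:
            for nxt in sorted(graph.get(path[-1], ())):
                if nxt not in path:  # keep the path simple
                    stack.append(path + [nxt])

def find_bubbles(graph, max_len=10):
    """Return (source, sink, path_a, path_b) for internally disjoint path pairs."""
    bubbles = []
    for source in sorted(graph):
        if len(graph[source]) < 2:
            continue  # a bubble can only open at a branching node
        by_sink = {}
        for path in simple_paths(graph, source, max_len):
            by_sink.setdefault(path[-1], []).append(path)
        for sink, paths in sorted(by_sink.items()):
            for a, b in combinations(sorted(paths), 2):
                # Interiors must not overlap; endpoints are shared by design.
                if not set(a[1:-1]) & set(b[1:-1]):
                    bubbles.append((source, sink, a, b))
    return bubbles

# Hypothetical graph: S branches to A and B, which both lead to T.
toy = {"S": {"A", "B"}, "A": {"T"}, "B": {"T"}, "T": set()}
print(find_bubbles(toy))
```

On the toy graph the only bubble found runs from "S" to "T" through "A" on one side and "B" on the other. In a de Bruijn graph built from RNA-seq reads, the two sides would correspond to the two variants of an event.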
407

Regulation of gene expression in the dinoflagellate Lingulodinium polyedrum

Roy, Sougata 07 1900
Dinoflagellates are unicellular eukaryotes found in both marine and freshwater environments. They are best known for causing toxic blooms called ‘red tides’, for their symbiosis with corals, and for their important contribution to carbon fixation in the ocean. At the molecular level, they are also known for their unique nuclear characteristics: they generally have a huge amount of DNA, held in chromosomes that are permanently condensed and packaged into liquid crystalline forms instead of nucleosomes. Nuclear-encoded genes are often present in multiple copies and arranged in tandem, and no putative promoter elements, including the conserved TATA box, have yet been observed. The unique organization of dinoflagellate chromatin suggests that different strategies may be required to regulate gene expression in these organisms.
In this study, I have started to address this problem using the photosynthetic dinoflagellate Lingulodinium polyedrum as a model. L. polyedrum is of particular interest because it shows a number of circadian (daily) rhythms. To date, all circadian changes in gene expression that have been studied are regulated at the translational level. I have used transcriptomic, proteomic and phosphoproteomic approaches along with biochemical studies to provide insight into the gene regulatory mechanisms of dinoflagellates, with particular emphasis on the importance of phosphorylation in the L. polyedrum circadian system. The absence of histone proteins and nucleosomes is a hallmark of the dinoflagellates. Using high-throughput RNA-seq technology, I found a complete set of sequences encoding the core histones, as well as sequences encoding histone-modifying enzymes, in L. polyedrum. Thus L. polyedrum expresses conserved histone transcripts, although protein levels remain below what can be detected by immunoblotting. The RNA-seq data were also assembled de novo to generate a transcriptome. This transcriptome, a list of genes expressed by L. polyedrum, has been extensively characterized. First, homology-based sequence searches were used to classify the transcripts into Gene Ontology (GO) categories; this analysis revealed a reduced number of transcription factor types and a surprising predominance of sequences containing a cold shock domain. RNA-seq reads were then aligned to genomic copies of L. polyedrum tandem repeat sequences to assess the possibility of polycistronic transcripts, a hypothesis proposed to explain the lack of promoter elements in the intergenic regions of tandemly repeated genes. This analysis also showed a surprisingly high conservation of tandemly repeated gene sequences.
The transcriptome database was also used to support gene identification after protein sequencing by mass spectrometry, and a purified phosphoproteome fraction was found to be particularly amenable to high-throughput approaches. A comparison of the phosphoproteome at two different times of day revealed that a major class of proteins whose phosphorylation state varied over time belonged to the RNA binding and translation GO categories. The transcriptome was also used to define the spectrum of kinases present in L. polyedrum, which in turn was used to classify the different phosphorylated peptides as potential kinase targets. Peptides predicted to be phosphorylated by casein kinase 2 (CK2), a kinase known to be involved in the circadian clocks of other eukaryotes, were found to derive from many RNA-binding proteins. To assess the possibility that some of the many cold shock domain proteins identified in the transcriptome might modulate gene expression in L. polyedrum, as has been observed in many other eukaryotic and prokaryotic systems, the cellular response to cold temperatures was examined. Cold temperatures were found to induce rapid encystment, the formation of a metabolically inactive cell type whose role is to withstand unfavourable environmental conditions. Changes in the phosphoproteome profile were found to be the major molecular correlates of cyst formation. Predicted CK2 phosphosites are the most strongly reduced class of kinase targets in cysts, a finding of interest because measurements of the bioluminescence rhythm confirmed that the clock is stopped in cysts.
408

RNA-Seq and proteomics based analysis of regulatory RNA features and gene expression in Bacillus licheniformis

Wiegand, Sandra 25 September 2013
No description available.
409

Hormetic dietary phytochemicals from Western Canadian plants: Identification, characterization and mechanistic insights

2013 June 1900
Activation of mammalian stress-responsive pathways by plant secondary metabolites may contribute to the protection against certain chronic diseases afforded by fruit and vegetable consumption. This work focuses on the identification of plant compounds that activate the stress-responsive enzyme quinone reductase (QR) by stabilizing the transcription factor NF-E2 related factor-2 (Nrf2). Screening methanolic extracts of plants from Western Canada for QR induction in a mouse hepatoma cell line (Hepa-1c1c7) led to the identification of twenty-one extracts capable of doubling the activity of QR. Bioassay-guided fractionation of six extracts led to the identification of novel classes of compounds with QR-inducing activity, including fatty acid-derived polyacetylenes, phthalides, and cannabinoids. Studies using low molecular weight thiols and the recombinantly expressed protein Keap1, the principal negative regulator of Nrf2, supported a mechanism of QR activation involving covalent modification of Keap1 cysteines for the polyacetylenes and phthalides. Analysis of transcriptional changes in response to treatment with a panel of QR-inducing compounds provided strong support for Nrf2 activation by the polyacetylene (3S,8S)-falcarindiol and the isothiocyanate (R)-sulforaphane, and weaker support for the compounds (3R,8S)-falcarindiol, 6-isovaleryl-umbelliferone (6-IVU) and (Z)-ligustilide. Additionally, transcript-level analyses supported a role for the aryl-hydrocarbon receptor in QR activation by (3R,8S)-falcarindiol, (Z)-ligustilide, (R)-sulforaphane, 6-IVU and cannabidiol, and suggested that treatment with polyacetylenes with a (3R)-configuration, (Z)-ligustilide and 6-IVU causes substantial changes in the expression of genes associated with lipid homeostasis and energy metabolism. As a whole, this work provides evidence that compounds that activate QR (and Nrf2) are widely distributed in the Canadian flora.
However, few of these QR activators are active at concentrations that are expected to be achieved through dietary consumption. Nevertheless, the most exceptional compounds isolated in this work, (3S,8S)-falcarindiol and epoxyfalcarindiol, are highly potent and appear to be, or are expected to be, specific activators of Nrf2. They thus warrant attention with respect to dietary implications and as drug candidate leads.
410

Variations structurales du génome et du transcriptome humains induites par les rétrotransposons LINE-1 / Structural variations of the human genome and transcriptome induced by LINE-1 retrotransposons

Mir, Ashfaq Ali 04 December 2015
Retrotransposons are mobile genetic elements that make up almost half of our genome. Only the L1HS subfamily of the Long Interspersed Element-1 class (LINE-1 or L1) has retained the ability to jump autonomously in humans. The mobilization of these elements in the germline, but also in some somatic tissues, contributes to human genetic diversity and to diseases such as cancer. L1 reactivation can be directly mutagenic by disrupting genes or regulatory sequences. In addition, L1 sequences themselves contain many cis-regulatory elements. Thus, L1 insertions near a gene or within intronic sequences can also produce more subtle genic alterations. To explore L1-mediated genic alterations genome-wide, we have developed dedicated RNA-seq analysis software that identifies L1 chimeric or antisense transcripts and annotates these novel isoforms with their associated alternative splicing events. During the course of this work, it became apparent that understanding the link between L1HS insertion polymorphisms and phenotype or disease requires a comprehensive view of the different L1HS copies present in a given individual or sample. To provide a comprehensive summary of L1HS insertion polymorphisms identified in healthy or pathological human samples and published in peer-reviewed journals, we developed euL1db, the European database of L1HS retrotransposon insertions in humans. This work will help in understanding the overall impact of L1 insertions on gene expression at a genome-wide scale.
