  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Comprehensive analysis of sugarcane (Saccharum spp) gene expression changes in response to drought and re-watering conditions / Análise global das mudanças na expressão gênica em cana-de-açúcar (Saccharum spp) em resposta às condições de seca e reidratação

Silva, Danielle Izilda Rodrigues da 15 December 2017
The depletion of oil fields, together with the undesirable effects of oil use, has made sugarcane an attractive crop for the biofuel market, increasing its economic and environmental importance. Brazil's position as the world's largest sugarcane producer, and the need to expand the planted area to soils with less favorable conditions, make the study of drought, one of the abiotic stresses that most affects the yield of this crop, essential for the future of Brazil as the main exporter of this commodity. This work aims to provide a comprehensive analysis of sugarcane drought responses at the physiological and molecular levels. To that end, we followed four strategies. First, we analyzed the physiology and transcriptome (microarray) of drought-stressed sugarcane plants at three time points (4 days of stress, 6 days of stress, and re-watering) in a greenhouse experiment. The plant material analyzed was leaves and roots. Second, to identify different genes and new expression patterns, we analyzed RNA-Seq data from the most discrepant condition found by the microarray, in both leaves and roots. Third, we analyzed a drought progression experiment through physiology and qRT-PCR of selected candidate genes. Fourth, we built co-expression networks to detect modules of genes related to the drought response. Physiological analysis showed that plants were under moderate to severe water stress, with decreases of up to 97% in photosynthesis. Microarray data identified 7,867 unique SAS with a fold change greater than 2 or less than 0.5, and 575 unique SAS differentially expressed. Analysis of the identified sequences showed that in leaves, after 4 days of stress, the plant is mostly transducing the signal from the environment, whereas after 6 days and after rehydration there is a more functional response, with re-watering leading the metabolism back to homeostasis.
Roots showed a similar response but take longer to return to the initial condition, since several genes are still down-regulated even after re-watering. Some pathways also show opposite patterns in the two tissues, being activated in one and repressed in the other, such as the phenylpropanoid biosynthesis pathway. Furthermore, while in leaves there is a restriction on photosynthesis, in roots there seems to be a restriction on growth. RNA-Seq de novo assembly showed 28,240 differentially expressed features in leaves and 7,435 in roots, while using the reference genome (unpublished data) it was possible to identify 38,317 differentially expressed genes in leaves and 7,649 in roots. Analysis of KEGG pathways indicates that ABA has a major role in the drought responses of both leaves and roots, but in leaves there is an interplay of phytohormones. The drought progression experiment confirms the microarray results and shows that when stress is extreme, gene expression starts to decrease, suggesting the plant might be entering senescence. Co-expression analysis identified three modules correlated with physiological parameters altered during water stress and led to the identification of possible hub genes that may be important for sugarcane responses to drought. It was also possible to identify genes that showed similar expression patterns in both the co-expression and qRT-PCR analyses. Altogether, these results give a comprehensive view of the alterations in sugarcane in response to water stress and helped us gain insight for defining better-suited candidate genes for plant breeding.
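The fold-change criterion quoted in the abstract above (a ratio greater than 2 or less than 0.5) can be illustrated with a minimal sketch. This is not the thesis's pipeline: the function name `fold_change_calls`, the gene names, and the expression values are all hypothetical, and the ratio is assumed to be stressed over control on a linear scale.

```python
def fold_change_calls(stressed, control, up=2.0, down=0.5):
    """Label genes as 'up' or 'down' by the ratio stressed / control.

    Thresholds follow the abstract (ratio > 2 or < 0.5); everything else
    here is an illustrative assumption.
    """
    calls = {}
    for gene, ctrl in control.items():
        if ctrl <= 0 or gene not in stressed:
            continue  # ratio is undefined, so the gene is skipped
        ratio = stressed[gene] / ctrl
        if ratio > up:
            calls[gene] = "up"
        elif ratio < down:
            calls[gene] = "down"
    return calls


# Hypothetical expression values, for illustration only.
control = {"geneA": 10.0, "geneB": 8.0, "geneC": 5.0}
stressed = {"geneA": 25.0, "geneB": 3.0, "geneC": 6.0}
calls = fold_change_calls(stressed, control)
```

Genes whose ratio falls between the two thresholds (here `geneC`, ratio 1.2) are left out of the result. A ratio filter alone does not establish statistical significance, which may be why the abstract reports a separate, smaller count of differentially expressed SAS.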
22

Integrative analysis of microRNAs and mRNAs involved in regulation of intramuscular fat deposition in Nelore cattle / Análise de integração de dados de microRNAs e mRNAs envolvidos na regulação da deposição de gordura intramuscular em bovinos Nelore

Gabriella Borba de Oliveira 16 February 2017
The amount of intramuscular fat can influence the sensory characteristics and nutritional value of beef, so selecting animals with a fat content suited to consumers is important. Intramuscular fat is a complex trait that is difficult to measure, and there is growing knowledge about the genes and pathways that control the biological processes involved in fat deposition in muscle. MicroRNAs (miRNAs) are a well-conserved class of small non-coding RNAs that modulate gene expression across a range of functions in animal development and physiology. This study aimed to identify differentially expressed (DE) miRNAs, regulatory candidate genes and co-expression networks using mRNA and miRNA expression data from the Longissimus dorsi muscle of 30 Nelore steers with extreme genomic estimated breeding values (GEBV) for intramuscular fat (IMF) content. Differential expression analysis of the miRNA data from animals with extreme GEBV for IMF identified six DE miRNAs. Functional annotation of the target genes of these miRNAs indicates that the PPAR signaling pathway is involved in IMF deposition. Regulatory candidate genes such as SDHAF4, FBXO17, ALDOA and PKM were identified by the partial correlation with information theory (PCIT), phenotypic impact factor (PIF) and regulatory impact factor (RIF) approaches applied to integrated miRNA-mRNA expression data. Two DE miRNAs, bta-miR-143 and bta-miR-146b, upregulated in the low-IMF group, were also correlated with regulatory candidate genes, which were functionally enriched for GO terms related to fatty acid oxidation. Weighted correlation network analysis (WGCNA) identified several co-expression modules related to the immune system, protein metabolism, energy metabolism and glucose catabolism, showing possible interaction and regulation between mRNAs and miRNAs. This study contributes to our understanding of the regulatory mechanisms of the gene signaling networks involved in fat deposition.
Glucose metabolism and inflammation were the main pathways found in the integrative mRNA-miRNA analysis and were shown to be associated with intramuscular fat content in beef cattle.
23

An approach for analyzing and classifying microarray data using gene co-expression networks cycles / Uma abordagem para analisar e classificar dados microarrays usando ciclos de redes de co-expressão gênica

Dillenburg, Fabiane Cristine January 2017
One of the main research areas in Systems Biology concerns the discovery of biological networks from microarray datasets. These networks consist of a large number of genes whose expression levels affect each other in various ways. We present a new way of analyzing microarray datasets, based on the different kinds of cycles found among genes of the co-expression networks constructed using quantized data obtained from the microarrays. The input of the analysis method consists of raw data, a set of genes of interest (for example, genes from a known pathway) and a function (activator or inhibitor) for these genes. The output of the method is a set of cycles. A cycle is a closed walk in which all vertices (except the first and last) are distinct. This new way of finding relations among genes allows a more robust interpretation of gene correlations, because cycles are associated with feedback mechanisms, which are very common in biological networks. Our hypothesis is that negative feedbacks allow finding relations among genes that may help explain the stability of the regulatory process within the cell. Positive feedback cycles, on the other hand, may show the amount of imbalance of a certain cell at a given time. The cycle-based analysis makes it possible to identify the stoichiometric relationship between the genes of the network. This methodology provides a better understanding of tumor biology and, as a consequence, may enable the development of more effective therapies. Furthermore, cycles help differentiate, measure and explain the phenomena identified in healthy and diseased tissues. Cycles may also be used as a new method for classifying microarray samples (cancer diagnosis). Compared with other classification methods, cycle-based classification provides a richer explanation of the proposed classification, which can give hints about possible therapies. The main contributions of this thesis are therefore: (i) a new cycle-based analysis method; (ii) a new method for classifying microarray samples; and (iii) the application of these methods and the practical results obtained.
We use the proposed methodology to analyze the genes of four networks closely related to cancer (apoptosis, glycolysis, cell cycle and NF-κB) in tissues of the most aggressive type of brain tumor (glioblastoma multiforme, GBM) and in healthy brain tissues. Because most patients with GBM die in less than a year, and essentially no patient has long-term survival, these tumors have drawn significant attention. Our main results show that the stoichiometric relationship between genes involved in the apoptosis, glycolysis, cell cycle and NF-κB pathways is unbalanced in GBM samples compared with control samples. This imbalance can be measured and explained by the higher percentage of positive cycles identified in the networks of the GBM samples. This conclusion helps in understanding the biology of this tumor type. The proposed cycle-based classification method achieved the same performance metrics as a neural network, a classical classification method. However, our method has a significant advantage over neural networks: it not only classifies samples, providing a diagnosis, but also explains why samples were classified in a certain way, in terms of the feedback mechanisms that are present or absent. In this way, the method gives biochemists hints about possible laboratory experiments, as well as about potential drug target genes.
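The abstract above defines a cycle as a closed walk with distinct vertices and distinguishes positive from negative feedback cycles. A minimal sketch of that idea follows. The abstract does not describe the thesis's actual algorithm, so several things here are assumptions: edges are taken to be directed and labelled +1 (activation) or -1 (inhibition), a cycle's sign is taken to be the product of its edge signs, and the names `simple_cycles`, `cycle_sign`, and the example genes are hypothetical.

```python
from math import prod


def simple_cycles(edges):
    """Enumerate simple directed cycles.

    `edges` maps (source, target) -> sign, with sign +1 or -1.
    Each cycle is reported once, starting from its smallest vertex.
    """
    graph = {}
    for (u, v) in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
    cycles = []

    def extend(start, node, path):
        for nxt in graph[node]:
            if nxt == start:
                cycles.append(path[:])  # the walk closed back on its start
            elif nxt > start and nxt not in path:
                path.append(nxt)
                extend(start, nxt, path)
                path.pop()

    for start in sorted(graph):
        extend(start, start, [start])
    return cycles


def cycle_sign(cycle, edges):
    """Product of edge signs around the cycle: +1 positive, -1 negative."""
    closed = cycle + [cycle[0]]
    return prod(edges[(a, b)] for a, b in zip(closed, closed[1:]))


# Hypothetical signed network, for illustration only.
edges = {("A", "B"): 1, ("B", "C"): -1, ("C", "A"): 1, ("B", "A"): 1}
cycles = simple_cycles(edges)
signs = [cycle_sign(c, edges) for c in cycles]
positive_fraction = sum(s > 0 for s in signs) / len(signs)
```

Here the network has one negative cycle (A, B, C) and one positive cycle (A, B), so half of the cycles are positive. Comparing such a fraction between sample groups is one plausible reading of the "percentage of positive cycles" reported in the abstract. Exhaustive enumeration grows quickly with network size, so this sketch only suits small gene sets.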
25

Prediction of mammalian essential genes based on sequence and functional features

Kabir, Mitra January 2017
Essential genes are those whose presence is imperative for an organism's survival, whereas the functions of non-essential genes may be useful but not critical. Abnormal functionality of essential genes may lead to defects or death at an early stage of life. Knowledge of essential genes is therefore key to understanding development, maintenance of major cellular processes and tissue-specific functions that are crucial for life. Existing experimental techniques for identifying essential genes are accurate, but most of them are time consuming and expensive. Predicting essential genes using computational methods, therefore, would be of great value as they circumvent experimental constraints. Our research is based on the hypothesis that mammalian essential (lethal) and non-essential (viable) genes are distinguishable by various properties. We examined a wide range of features of Mus musculus genes, including sequence, protein-protein interactions, gene expression and function, and found 75 features that were statistically discriminative between lethal and viable genes. These features were used as inputs to create a novel machine learning classifier, allowing the prediction of a mouse gene as lethal or viable with the cross-validation and blind test accuracies of ∼91% and ∼93%, respectively. The prediction results are promising, indicating that our classifier is an effective mammalian essential gene prediction method. We further developed the mouse gene essentiality study by analysing the association between essentiality and gene duplication. Mouse genes were labelled as singletons or duplicates, and their expression patterns over 13 developmental stages were examined. We found that lethal genes originating from duplicates are considerably lower in proportion than singletons. At all developmental stages a significantly higher proportion of singletons and lethal genes are expressed than duplicates and viable genes. 
Lethal genes were also found to be more ancient than viable genes. In addition, we observed that duplicate pairs with similar patterns of developmental co-expression are more likely to be viable; lethal gene duplicate pairs do not have such a trend. Overall, these results suggest that duplicate genes in mouse are less likely to be essential than singletons. Finally, we investigated the evolutionary age of mouse genes across development to see if the morphological hourglass pattern exists in the mouse. We found that in mouse embryos, genes expressed in early and late stages are evolutionarily younger than those expressed in mid-embryogenesis, thus yielding an hourglass pattern. However, the oldest genes are not expressed at the phylotypic stage stated in prior studies, but instead at an earlier time point - the egg cylinder stage. These results question the application of the hourglass model to mouse development.
26

Développement d'outils statistiques pour l'analyse de données transcriptomiques par les réseaux de co-expression de gènes / A systemic approach to statistical analysis to transcriptomic data through co-expression network analysis

Brunet, Anne-Claire 17 June 2016
New biotechnologies now make it possible to collect a wide variety and large volume of biological data (genomic, proteomic, metagenomic...), opening up new avenues of research into biological processes. In this thesis, we focus specifically on transcriptomic data, which characterize the activity or expression level of several tens of thousands of genes in a given cell. The aim was to propose statistical tools suited to these high-dimensional data (n<<p), which are collected on samples that are very small relative to the very large number of variables (here, gene expression levels). The first part of the thesis presents supervised learning methods, such as Breiman's random forests and penalized regression models, used in the high-dimensional setting to select the genes (expression variables) most relevant to the study of the pathology of interest. We discuss the limits of these methods for selecting genes that are relevant biologically and not merely statistically, in particular when genes are selected within groups of highly correlated variables, that is, within groups of co-expressed genes. Classical learning methods assume that each gene can act in isolation in the model, which is not very realistic in practice. An observable biological trait results from a set of reactions within a complex system in which genes interact with one another, and genes involved in the same biological function tend to be co-expressed (correlated expression). In the second part, we therefore turn to gene co-expression networks, in which two genes are linked if they are co-expressed. More precisely, we seek to identify communities of genes on these networks, that is, groups of co-expressed genes, and then to select the communities most relevant to the study of the pathology, as well as the "key genes" of these communities. This facilitates biological interpretation, because a biological function can often be associated with a community of genes. We propose an original and efficient approach that simultaneously addresses the modeling of the gene co-expression network and the detection of gene communities on the network. We demonstrate the performance of our approach by comparing it with existing, popular methods for analysing gene co-expression networks (WGCNA and spectral methods). Finally, in the last part of the thesis, the analysis of a real data set shows that the proposed approach yields results that are biologically convincing, more amenable to interpretation, and more robust than those obtained with classical supervised learning methods.
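As an illustration of the network described in this abstract, where two genes are linked if they are co-expressed, the following minimal Python sketch builds an edge list from a toy expression table. It is not the thesis's method: the use of Pearson correlation, the hard threshold of 0.8, the gene names and the function names are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.8):
    """Link two genes when |correlation| >= threshold (hypothetical cut-off)."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


# Toy data: three hypothetical genes measured in five samples (n << p in real data).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(expression)
for (g1, g2), r in edges.items():
    print(f"{g1} -- {g2}: r = {r:.3f}")
```

In this toy table only geneA and geneB end up linked. Community detection, as discussed in the abstract, would then operate on a graph built from such edges; a hard threshold is only one of several ways to define the links.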
27

An approach for analyzing and classifying microarray data using gene co-expression networks cycles / Uma abordagem para analisar e classificar dados microarrays usando ciclos de redes de co-expressão gênica

Dillenburg, Fabiane Cristine January 2017
One of the main research areas in Systems Biology concerns the discovery of biological networks from microarray datasets. These networks consist of a large number of genes whose expression levels affect one another in various ways. This thesis presents a new way of analyzing microarray datasets, based on the different kinds of cycles found among the genes of co-expression networks constructed from quantized microarray data. The input of the analysis method consists of the raw data, a set of genes of interest (for example, genes from a known pathway) and the function (activator or inhibitor) of these genes. The output of the method is a set of cycles. A cycle is a closed walk in which all vertices (except the first and last) are distinct. This new way of finding relations among genes allows a more robust interpretation of gene correlations, because cycles are associated with feedback mechanisms, which are very common in biological networks. The hypothesis is that negative feedbacks reveal relations among genes that may help explain the stability of the regulatory process within the cell. Positive feedback cycles, on the other hand, may show the degree of imbalance of a given cell at a given time. The cycle-based analysis makes it possible to identify the stoichiometric relationship between the genes of the network. This methodology provides a better understanding of tumor biology and, as a consequence, may enable the development of more effective therapies. Furthermore, cycles help differentiate, measure and explain the phenomena identified in healthy and diseased tissues. Cycles may also be used as a new method for classifying microarray samples (cancer diagnosis). Compared to other classification methods, cycle-based classification provides a richer explanation of the proposed classification, which can give hints about possible therapies. The main contributions of this thesis are therefore: (i) a new cycle-based analysis method; (ii) a new method for classifying microarray samples; and (iii) the application of these methods and the practical results obtained. The proposed methodology was used to analyze the genes of four networks closely related to cancer (apoptosis, glycolysis, cell cycle and NF-κB) in tissues of the most aggressive type of brain tumor (glioblastoma multiforme, GBM) and in healthy brain tissues. Because most patients with GBM die in less than a year, and essentially no patient has long-term survival, these tumors have drawn significant attention. The main results show that the stoichiometric relationship between genes involved in the apoptosis, glycolysis, cell cycle and NF-κB pathways is unbalanced in GBM samples compared with control samples. This imbalance can be measured and explained by the higher percentage of positive cycles identified in the networks of the GBM samples. This conclusion helps to understand more about the biology of this tumor type. The proposed cycle-based classification method achieved the same performance metrics as a neural network, a classical classification method. However, the proposed method has a significant advantage over neural networks: it not only classifies samples, providing a diagnosis, but also explains why samples were classified in a certain way in terms of the feedback mechanisms that are present or absent. In this way, the method provides hints to biochemists about possible laboratory experiments, as well as about potential drug target genes.
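To make the notion of positive and negative cycles in this abstract concrete, here is a minimal Python sketch. It assumes a small directed graph whose edges carry a sign (+1 for activation, -1 for inhibition) and uses the common convention that a cycle is a positive feedback when the product of its edge signs is positive. The graph, the gene names and both function names are hypothetical; the thesis's actual construction of the network from quantized microarray data is not reproduced here.

```python
def simple_cycles(graph):
    """Yield each simple directed cycle once, as a node list starting at its smallest node."""
    def dfs(start, node, path, visited):
        for nxt in graph.get(node, {}):
            if nxt == start:
                yield list(path)              # closed walk back to the start
            elif nxt > start and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                yield from dfs(start, nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    for start in sorted(graph):
        yield from dfs(start, start, [start], {start})


def cycle_sign(graph, cycle):
    """Product of edge signs along the cycle: +1 = positive feedback, -1 = negative."""
    sign = 1
    for u, v in zip(cycle, cycle[1:] + cycle[:1]):
        sign *= graph[u][v]
    return sign


# Hypothetical signed network: +1 = activation, -1 = inhibition.
graph = {
    "A": {"B": +1},
    "B": {"C": -1, "A": +1},
    "C": {"A": +1},
}
cycles = list(simple_cycles(graph))
signs = {tuple(c): cycle_sign(graph, c) for c in cycles}
positive_fraction = sum(s > 0 for s in signs.values()) / len(signs)
print(signs, positive_fraction)
```

Here the cycle A, B is positive and the cycle A, B, C is negative, so half of the cycles are positive. Comparing such a fraction between two groups of samples is the kind of measurement the abstract describes, although the exhaustive enumeration used in this sketch is only practical for small networks.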
28

Atypical Solute Carriers : Identification, evolutionary conservation, structure and histology of novel membrane-bound transporters

Perland, Emelie January 2017
Solute carriers (SLCs) constitute the largest family of membrane-bound transporter proteins in humans, and they mediate the transport of nutrients, ions, drugs and waste across cellular membranes via facilitative diffusion, co-transport or exchange. Several SLCs are associated with diseases, and their location in membranes and specific substrate transport make them excellent drug targets. However, as 30% of the 430 identified SLCs are still orphans, there are still numerous opportunities to explain diseases and discover potential drug targets. Among the novel proteins are 29 atypical SLCs of major facilitator superfamily (MFS) type. These share evolutionary history with the remaining SLCs, but are orphans regarding expression, structure and/or function. They are not classified into any of the existing 52 SLC families. The overall aim of this thesis was to study the atypical SLCs with a focus on their phylogenetic clustering, evolutionary conservation, structure and protein expression in mouse brains, and on whether and how their gene expression was affected by changes in food intake. In Papers I-III, the focus was on specific proteins: MFSD5 and MFSD11 (Paper I), MFSD1 and MFSD3 (Paper II), and MFSD4A and MFSD9 (Paper III). They all shared neuronal expression, and their transcription levels were altered in several brain areas after mice were subjected to food deprivation or a high-fat diet. In Paper IV, the 29 atypical SLCs of MFS type were examined. They were divided into 15 families, based on phylogenetic analyses and sequence identities, to facilitate functional studies. Their sequence relationships with other SLCs were also established. Some of the proteins were found to be well conserved, with orthologues down to nematodes and insects, whereas others first emerged in vertebrates. The atypical SLCs of MFS type were predicted to have the common MFS structure, composed of 12 transmembrane segments. With single-cell RNA sequencing and in situ proximity ligation assay, co-expression of atypical SLCs was analysed to gain a comprehensive understanding of how membrane-bound transporters interact. In conclusion, the atypical SLCs of MFS type are suggested to be novel SLC transporters, involved in maintaining nutrient homeostasis through substrate transport.
29

Novel methods for constructing, combining and comparing co-expression networks: Towards uncovering the molecular basis of human cognition

Morselli Gysi, Deisy 11 April 2019
Network analyses, such as gene co-expression networks, are an important approach for the systems-level study of biological data. For example, understanding patterns of co-regulation in mental disorders can contribute to the development of new therapies and treatments. In a gene regulatory process, a particular TF or ncRNA can up- or down-regulate other genes; it is therefore important to explicitly consider both positive and negative interactions. Although a variety of software and libraries exists for constructing and investigating such networks, none considers the sign of the interaction. The represented networks must also be highly accurate: the interactions found have to be relevant, not the result of chance or background noise. Another issue that arises when building co-expression networks is their reproducibility. When independent networks are constructed for the same phenotype from different expression datasets, the resulting networks can be remarkably distinct due to biological or technical noise in the data. Moreover, most of the time the interest is not only in characterising a network but in comparing its features to those of others. A series of questions arises when trying to understand phenotypes using co-expression networks: i) how to construct highly accurate networks; ii) how to combine multiple networks derived from different platforms; iii) how to compare multiple networks. To answer these questions, i) I improved the wTO method to construct highly accurate networks, in which each interaction now receives a probability. This method proved to be much more efficient at finding correct interactions than other well-known methods; ii) I developed a method that combines multiple networks into one by building a consensus network (CN). This method enables correction for background noise; iii) I developed a completely novel method for the comparison of multiple co-expression networks, CoDiNA. This method identifies genes specific to at least one network. After genes have been associated with phenotypes, it is natural to ask whether those genes are enriched for a particular disorder. I also present a tool, RichR, that enables enrichment analysis and background correction. I applied the proposed methods in two important studies. In the first, the aim was to understand the neurogenesis process and how certain genes would affect it. The combination of the methods presented here pointed to one particular TF, ZN787, as playing an important role in this process. Moreover, the application of this toolset to networks derived from brain samples of individuals with cognitive disorders identified genes and network connections that are specific to certain disorders, and also found an overlap between neurodegenerative disorders and brain development and between evolutionary changes and psychological disorders. CoDiNA also indicated that some genes involved in those disorders are not exclusively human-specific.
30

The Subcellular Localization and Protein-protein Interactions of Barley Mixed-Linkage-(1->3),(1->4)-β-D-Glucan Synthase CSLF6 and CSLH1

Zhou, Yadi January 2018
No description available.
