  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Contributions to Gene Set Analysis of Correlated, Paired-Sample Transcriptome Data to Enable Precision Medicine

Schissler, Alfred Grant January 2017
This dissertation serves as a unifying document for three related articles developed during my dissertation research. The projects involve the development of single-subject transcriptome (i.e. gene expression data) methodology for precision medicine and related applications. Traditional statistical approaches are largely unavailable in this setting due to prohibitive sample size and lack of independent replication. This leads one to rely on informatic devices including knowledgebase integration (e.g., gene set annotations) and external data sources (e.g., estimation of inter-gene correlation). Common statistical themes include multivariate statistics (such as Mahalanobis distance and copulas) and large-scale significance testing. Briefly, the first work describes the development of clinically relevant single-subject metrics of gene set (pathway) differential expression, N-of-1-pathways Mahalanobis distance (MD) scores. Next, the second article describes a method which overcomes a major shortcoming of the MD framework by accounting for inter-gene correlation. Lastly, the statistics developed in the previous works are re-purposed to analyze single-cell RNA-sequencing data derived from rare cells. Importantly, these works represent an interdisciplinary effort and show that creative solutions for pressing issues become possible at the intersection of statistics, biology, medicine, and computer science.
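The Mahalanobis distance mentioned above can be illustrated with a short sketch. This is a generic two-dimensional version (for example, one gene's baseline and case expression treated as a point), not the dissertation's N-of-1-pathways scoring; the function name `mahalanobis_2d` and the example numbers are assumptions chosen for illustration.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-D point from `mean` under covariance `cov`.

    `cov` is a 2x2 matrix given as ((a, b), (c, d)). Hypothetical helper,
    written out by hand so that no external library is needed.
    """
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]];
    # the quadratic form below is diff^T * inverse(cov) * diff.
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(quad)


# With an identity covariance the result reduces to Euclidean distance.
euclid_like = mahalanobis_2d((3.0, 4.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))

# A larger variance on the first axis shrinks distance along that axis.
scaled = mahalanobis_2d((2.0, 0.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 1.0)))
```

The point of scaling by the covariance is that a deviation counts for less along directions in which the data naturally vary more.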
2

Gene-pair based statistical methods for testing gene set enrichment in microarray gene expression studies

Zhao, Kaiqiong 16 September 2016
Gene set enrichment analysis aims to discover sets of genes, such as biological pathways or protein complexes, which may show moderate but coordinated differentiation across experimental conditions. Existing gene set enrichment approaches use a single-gene statistic as the measure of differentiation for individual genes. These approaches do not use any inter-gene correlations, although genes in a pathway are known to interact with each other often. Motivated by the need to take gene dependence into account, we propose a novel gene set enrichment algorithm in which gene-gene correlation is addressed via a gene-pair representation strategy. Relying on an appropriately defined gene-pair statistic, the gene set statistic is formulated using a competitive null hypothesis. Extensive simulation studies show that the proposed approach correctly controls the type I error (false positive rate) and retains good statistical power for detecting true differential expression. The new method is also applied to several gene expression datasets. / October 2016
3

Comparing Performance of Gene Set Test Methods Using Biologically Relevant Simulated Data

Lambert, Richard M. 01 December 2018
Today we know that there are many genetically driven diseases and health conditions. These problems often manifest only when a set of genes is either active or inactive. Recent technology allows us to measure the activity level of genes in cells, which we call gene expression. It is of great interest to society to be able to statistically compare the gene expression of a large number of genes between two or more groups. For example, we may want to compare the gene expression of a group of cancer patients with that of a group of non-cancer patients to better understand the genetic causes of that particular cancer. Understanding these genetic causes could potentially lead to improved treatment options. Initially, gene expression was tested for statistical difference at the level of individual genes. In more recent years, it has been determined that grouping genes by biological process into gene sets and comparing groups at the gene set level probably makes more sense biologically. A number of gene set test methods have since been developed. It is critically important that we know whether these gene set test methods are accurate. In this research, we compare the accuracy of a group of popular gene set test methods across a range of biologically realistic scenarios. To measure accuracy, we need to know whether each gene set is differentially expressed or not. Since this is not possible in real gene expression data, we use simulated data. We develop a simulation framework that generates gene expression data representative of actual gene expression data and use it to test each gene set method over a range of biologically relevant scenarios. We then compare the power and false discovery rate of each method across these scenarios.
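Because simulated data come with known truth, power and the false discovery proportion can be computed directly from the sets a method flags. The following is a minimal sketch of that bookkeeping, not the thesis's simulation framework; the function name `power_and_fdp` and the gene set labels are hypothetical.

```python
def power_and_fdp(truly_de, called_de):
    """Compare a method's calls against simulated ground truth.

    truly_de:  set of gene set IDs simulated as differentially expressed.
    called_de: set of gene set IDs the method flagged as significant.
    Returns (power, false discovery proportion) for this one simulated run.
    """
    true_pos = len(truly_de & called_de)
    false_pos = len(called_de - truly_de)
    # Power: fraction of truly differentially expressed sets that were detected.
    power = true_pos / len(truly_de) if truly_de else float("nan")
    # False discovery proportion: fraction of calls that are wrong (0 if no calls).
    fdp = false_pos / len(called_de) if called_de else 0.0
    return power, fdp


# Hypothetical run: four truly active sets, three calls, one of them wrong.
truth = {"set_a", "set_b", "set_c", "set_d"}
calls = {"set_a", "set_b", "set_x"}
power, fdp = power_and_fdp(truth, calls)
```

Averaging the false discovery proportion over many simulated replicates gives an estimate of the false discovery rate for a given scenario.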
4

Knowledge Based Gene Set analysis (KB-GSA) : A novel method for gene expression analysis

Jadhav, Trishul January 2010
Microarray technology allows measurement of the expression levels of thousands of genes simultaneously. Several gene set analysis (GSA) methods are widely used for extracting useful information from microarrays, for example identifying differentially expressed pathways associated with a particular biological process or disease phenotype. Though GSA methods like Gene Set Enrichment Analysis (GSEA) are widely used for pathway analysis, these methods are based solely on statistics. Such methods can be awkward to use if knowledge of specific pathways involved in particular biological processes is the aim of the study. Here we present a novel method (Knowledge Based Gene Set Analysis: KB-GSA) which integrates knowledge about user-selected pathways that are known to be involved in specific biological processes. The method generates an easy-to-understand graphical visualization of the changes in expression of the genes, complemented with some common statistics about the pathway of particular interest.
5

Knowledge Based Gene Set analysis (KB-GSA) : A novel method for gene expression analysis

Jadhav, Trishul January 2010
Microarray technology allows measurement of the expression levels of thousands of genes simultaneously. Several gene set analysis (GSA) methods are widely used for extracting useful information from microarrays, for example identifying differentially expressed pathways associated with a particular biological process or disease phenotype. Though GSA methods like Gene Set Enrichment Analysis (GSEA) are widely used for pathway analysis, these methods are based solely on statistics. Such methods can be awkward to use if knowledge of specific pathways involved in particular biological processes is the aim of the study. Here we present a novel method (Knowledge Based Gene Set Analysis: KB-GSA) which integrates knowledge about user-selected pathways that are known to be involved in specific biological processes. The method generates an easy-to-understand graphical visualization of the changes in expression of the genes, complemented with some common statistics about the pathway of particular interest.
6

Identificação de cascatas gênicas com base na modulação transcricional de células sanguíneas mononucleares periféricas de pacientes com diabetes mellitus do tipo 1 / Identification of gene cascades based on the transcriptional modulation of peripheral blood mononuclear cells from type 1 diabetes mellitus patients.

Arns, Thais Cristine 15 March 2013
O diabetes mellitus do tipo 1 (DM1) é uma doença autoimune crônica, durante a qual as células beta pancreáticas, responsáveis pela secreção de insulina, são seletivamente destruídas. O desenvolvimento desta doença é uma consequência da predisposição genética combinada a fatores ambientais largamente desconhecidos e eventos estocásticos. Neste trabalho foi proposta a comparação da expressão gênica transcricional em grande escala (transcriptoma) entre amostras de pacientes de DM1 e controles, obtidas a partir de células mononucleares do sangue periférico (PBMCs). As alterações resultantes na expressão gênica causada pela doença podem ser amostradas em PBMCs, uma vez que as células imunes efetoras estão presumivelmente em equilíbrio com a população celular circulante. A fim de identificar alterações na expressão gênica, foram utilizados métodos analíticos como a tecnologia de microarrays e o cálculo do coeficiente de correlação de Pearson, sendo possível observar aumento ou diminuição na expressão gênica e também a magnitude desta mudança. Além disso, foi realizada análise de grupos gênicos (gene sets ou GSA), método baseado na significância de conjuntos gênicos pré-definidos, ao invés de genes individuais. Este procedimento é mais adequado para análise de uma doença poligênica, tal como o DM1. A análise de GSA possibilitou a seleção de genes envolvidos, por exemplo, nas seguintes vias: cascata de I-kappaB kinase/NF-kappaB, regulação da via de sinalização do receptor de TGF-β, regulação da cascata de JAK-STAT e via de sinalização mediada por citocinas e quimiocinas, das quais podem ser identificados marcadores transcricionais. A análise imparcial do transcriptoma de PBMCs permitiu a identificação de gene sets e genes associados ao DM1, seu perfil de expressão preferencial em tipos celulares do sistema imune e seus padrões de modulação.
/ Type 1 diabetes mellitus (T1DM) is a chronic autoimmune disease in which the pancreatic beta cells responsible for insulin secretion are selectively destroyed. The development of this disease is a result of genetic predisposition combined with largely unknown environmental factors and stochastic events. This work proposes comparing large-scale transcriptional gene expression (the transcriptome) between samples from T1DM patients and healthy controls, obtained from peripheral blood mononuclear cells (PBMCs). The changes in gene expression caused by the disease can be sampled in PBMCs, as immune effector cells are presumably in equilibrium with the circulating cell population. To identify changes in gene expression, we used analytical methods such as microarray technology and calculation of the Pearson correlation coefficient, which made it possible to observe increases or decreases in gene expression as well as the magnitude of the change. Furthermore, we performed gene set analysis (GSA), a method based on the significance of predefined gene sets instead of individual genes. This procedure is more suitable for analyzing a polygenic disease such as T1DM. GSA enabled the selection of genes involved, for example, in the following pathways: I-kappaB kinase/NF-kappaB cascade, regulation of the TGF-β receptor signaling pathway, regulation of the JAK-STAT cascade, and the cytokine- and chemokine-mediated signaling pathway, from which transcriptional markers can be identified. An unbiased transcriptome analysis of PBMCs allowed the identification of gene sets and genes associated with T1DM, their preferential expression profile in cell types of the immune system, and their modulation patterns.
7

Identificação de cascatas gênicas com base na modulação transcricional de células sanguíneas mononucleares periféricas de pacientes com diabetes mellitus do tipo 1 / Identification of gene cascades based on the transcriptional modulation of peripheral blood mononuclear cells from type 1 diabetes mellitus patients.

Thais Cristine Arns 15 March 2013
O diabetes mellitus do tipo 1 (DM1) é uma doença autoimune crônica, durante a qual as células beta pancreáticas, responsáveis pela secreção de insulina, são seletivamente destruídas. O desenvolvimento desta doença é uma consequência da predisposição genética combinada a fatores ambientais largamente desconhecidos e eventos estocásticos. Neste trabalho foi proposta a comparação da expressão gênica transcricional em grande escala (transcriptoma) entre amostras de pacientes de DM1 e controles, obtidas a partir de células mononucleares do sangue periférico (PBMCs). As alterações resultantes na expressão gênica causada pela doença podem ser amostradas em PBMCs, uma vez que as células imunes efetoras estão presumivelmente em equilíbrio com a população celular circulante. A fim de identificar alterações na expressão gênica, foram utilizados métodos analíticos como a tecnologia de microarrays e o cálculo do coeficiente de correlação de Pearson, sendo possível observar aumento ou diminuição na expressão gênica e também a magnitude desta mudança. Além disso, foi realizada análise de grupos gênicos (gene sets ou GSA), método baseado na significância de conjuntos gênicos pré-definidos, ao invés de genes individuais. Este procedimento é mais adequado para análise de uma doença poligênica, tal como o DM1. A análise de GSA possibilitou a seleção de genes envolvidos, por exemplo, nas seguintes vias: cascata de I-kappaB kinase/NF-kappaB, regulação da via de sinalização do receptor de TGF-β, regulação da cascata de JAK-STAT e via de sinalização mediada por citocinas e quimiocinas, das quais podem ser identificados marcadores transcricionais. A análise imparcial do transcriptoma de PBMCs permitiu a identificação de gene sets e genes associados ao DM1, seu perfil de expressão preferencial em tipos celulares do sistema imune e seus padrões de modulação.
/ Type 1 diabetes mellitus (T1DM) is a chronic autoimmune disease in which the pancreatic beta cells responsible for insulin secretion are selectively destroyed. The development of this disease is a result of genetic predisposition combined with largely unknown environmental factors and stochastic events. This work proposes comparing large-scale transcriptional gene expression (the transcriptome) between samples from T1DM patients and healthy controls, obtained from peripheral blood mononuclear cells (PBMCs). The changes in gene expression caused by the disease can be sampled in PBMCs, as immune effector cells are presumably in equilibrium with the circulating cell population. To identify changes in gene expression, we used analytical methods such as microarray technology and calculation of the Pearson correlation coefficient, which made it possible to observe increases or decreases in gene expression as well as the magnitude of the change. Furthermore, we performed gene set analysis (GSA), a method based on the significance of predefined gene sets instead of individual genes. This procedure is more suitable for analyzing a polygenic disease such as T1DM. GSA enabled the selection of genes involved, for example, in the following pathways: I-kappaB kinase/NF-kappaB cascade, regulation of the TGF-β receptor signaling pathway, regulation of the JAK-STAT cascade, and the cytokine- and chemokine-mediated signaling pathway, from which transcriptional markers can be identified. An unbiased transcriptome analysis of PBMCs allowed the identification of gene sets and genes associated with T1DM, their preferential expression profile in cell types of the immune system, and their modulation patterns.
8

Bayesian pathway analysis in epigenetics

Wright, Alan January 2013
A typical gene expression data set consists of measurements of a large number of gene expressions on a relatively small number of subjects, classified according to two or more outcomes, for example cancer or non-cancer. The identification of associations between gene expressions and outcome is a huge multiple testing problem. Early approaches to this problem involved the application of thousands of univariate tests with corrections for multiplicity. Over the past decade, numerous studies have demonstrated that analyzing gene expression data structured into predefined gene sets can produce benefits in terms of statistical power and robustness when compared to alternative approaches. This thesis presents the results of research on gene set analysis. In particular, it examines the properties of some existing methods for the analysis of gene sets. It introduces novel Bayesian methods for gene set analysis. A distinguishing feature of these methods is that the model is specified conditionally on the expression data, whereas other methods of gene set analysis and individual gene analysis (IGA) generally make inferences conditionally on the outcome. Computer simulation is used to compare three common established methods for gene set analysis. In this simulation study a new procedure for the simulation of gene expression data is introduced. The simulation studies are used to identify situations in which the established methods perform poorly. The Bayesian approaches developed in this thesis apply reversible jump Markov chain Monte Carlo (RJMCMC) techniques to model gene expression effects on phenotype. The reversible jump step in the modelling procedure allows posterior probabilities of gene set activity to be produced. These mixture models reverse the generally accepted conditionality and model outcome given gene expression, which is a more intuitive assumption when modelling the pathway to phenotype.
It is demonstrated that the two models proposed may be superior to the established methods studied. There is considerable scope for further development of this line of research, which is appealing in terms of the use of mixture model priors that reflect the belief that a relatively small number of genes, restricted to a small number of gene sets, are associated with the outcome.
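The "corrections for multiplicity" applied to thousands of univariate tests can take several forms, and the abstract does not say which one was used. As one common example, chosen here as an assumption, the Benjamini-Hochberg step-up adjustment can be sketched as follows; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    The i-th smallest p-value is scaled by m / i, and a running minimum
    taken from the largest rank downward keeps the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values from four univariate tests.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

Genes whose adjusted value falls below a chosen threshold (for example 0.05) would be reported; a Bonferroni correction, which multiplies every p-value by the number of tests, is a stricter alternative.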
9

Network-based strategies for discovering functional associations of uncharacterized genes and gene sets

Wang, Peggy I. 12 November 2013
High-throughput technology is changing the face of research biology, generating an ever-growing number of large-scale data sets. With experiments utilizing next-generation gene sequencing, mass spectrometry, and various other global surveys of proteins, translating the plethora of data into biology has become a daunting task. In response, functional networks have been developed as a means of integrating the data into models of proteomic organization. In these networks, proteins are linked if there is evidence that they operate together in the same function, facilitating predictions about the functions, phenotypes, and disease associations of uncharacterized genes. In this body of work, we explore different applications of this so-called "guilt-by-association" concept to predict loss-of-function phenotypes and diseases associated with genes in yeast, worm, and human. We also scrutinize certain limitations associated with the functional networks, predictive methods, and measures of performance used in our studies. Importantly, the predictive method and performance measure, if not chosen appropriately for the biological objective at hand, can greatly distort the results and interpretation of a study. These findings are incorporated in the development of RIDDLE, a method for characterizing whole sets of genes. This machine learning-based method provides a measure of network distance, and thus functional association, between two sets of genes. RIDDLE may be applied to a wide range of problems, as we demonstrate with several biological examples, including linking microRNA-450a to ocular development and disease. In the last decade, functional networks have proven to be a useful strategy for interpreting large-scale proteomic and genomic data sets. With the continued growth of genome coverage in networks and the innovation of predictive methods, we will surely advance towards our ultimate goal of understanding the genetic changes that underlie disease.
10

Analyzing Gene Expression Data in Terms of Gene Sets: Gene Set Enrichment Analysis

Li, Wei 01 December 2009
DNA microarray biotechnology simultaneously monitors the expression of thousands of genes and aims to identify genes that are differentially expressed under different conditions. From the statistical point of view, this can be restated as identifying genes strongly associated with the response or covariate of interest. Gene Set Enrichment Analysis (GSEA) is one method that focuses the analysis on functionally related gene sets instead of single genes. It helps biologists interpret DNA microarray data using their previous biological knowledge of the genes in a gene set. GSEA has been shown to efficiently identify gene sets containing known disease-related genes in real experiments. Here we evaluate the statistical power of this method by simulation studies. The results show that the power of GSEA is good enough to identify the gene sets highly associated with the response or covariate of interest.
