
Comparing Performance of Gene Set Test Methods Using Biologically Relevant Simulated Data

Today we know that there are many genetically driven diseases and health conditions. These problems often manifest only when a set of genes is either active or inactive. Recent technology allows us to measure the activity level of genes in cells, which we call gene expression. It is of great interest to society to be able to statistically compare the expression of a large number of genes between two or more groups. For example, we may want to compare the gene expression of a group of cancer patients with that of a group of non-cancer patients to better understand the genetic causes of that particular cancer. Understanding these genetic causes could potentially lead to improved treatment options.
Initially, gene expression was tested for statistical differences one gene at a time. In more recent years, it has been determined that grouping genes by biological process into gene sets, and comparing groups at the gene set level, probably makes more sense biologically. A number of gene set test methods have since been developed. It is critically important that we know whether these gene set test methods are accurate.
In this research, we compare the accuracy of a group of popular gene set test methods across a range of biologically realistic scenarios. To measure accuracy, we need to know whether each gene set is truly differentially expressed. Because this ground truth is not available for real gene expression data, we use simulated data. We develop a simulation framework that generates gene expression data representative of actual gene expression data and use it to test each gene set method over a range of biologically relevant scenarios. We then compare the power and false discovery rate of each method across these scenarios.
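As a rough illustration of how power and false discovery rate can be measured once the truth is known, the following minimal Python sketch scores one simulated replicate. It is not the thesis's framework: the function name `power_and_fdr`, its inputs, and the example values are hypothetical, and the second value it returns is the observed false discovery proportion for a single replicate, which would be averaged over many replicates to estimate the false discovery rate.

```python
import math


def power_and_fdr(truly_de, called_de):
    """Score one simulated replicate (hypothetical helper, not from the thesis).

    truly_de  -- list of bools: was each gene set simulated as differentially expressed?
    called_de -- list of bools: did the gene set test method call it significant?
    Returns (power, false discovery proportion) for this replicate.
    """
    if len(truly_de) != len(called_de):
        raise ValueError("truth and calls must cover the same gene sets")

    # True positives: simulated as differentially expressed and called significant.
    true_pos = sum(1 for t, c in zip(truly_de, called_de) if t and c)
    # False positives: called significant although simulated as unchanged.
    false_pos = sum(1 for t, c in zip(truly_de, called_de) if c and not t)

    n_true = sum(1 for t in truly_de if t)
    n_called = true_pos + false_pos

    # Power is undefined when no gene set is truly differentially expressed.
    power = true_pos / n_true if n_true else math.nan
    # By convention the false discovery proportion is 0 when nothing is called.
    fdp = false_pos / n_called if n_called else 0.0
    return power, fdp


# Made-up example: six gene sets, three of them truly differentially expressed.
truth = [True, True, True, False, False, False]
calls = [True, True, False, True, False, False]
power, fdp = power_and_fdr(truth, calls)
print(f"power={power:.3f}, false discovery proportion={fdp:.3f}")
```

In this made-up example the method recovers two of the three truly changed gene sets and makes one false call among its three discoveries.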

Identifier: oai:union.ndltd.org:UTAHS/oai:digitalcommons.usu.edu:etd-8495
Date: 01 December 2018
Creators: Lambert, Richard M.
Publisher: DigitalCommons@USU
Source Sets: Utah State University
Detected Language: English
Type: text
Format: application/pdf
Source: All Graduate Theses and Dissertations
Rights: Copyright for this work is held by the author. Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user. For more information contact digitalcommons@usu.edu.
