APPLY DATA CLUSTERING TO GENE EXPRESSION DATA

Data clustering plays an important role in the effective analysis of gene expression. Although DNA microarray technology facilitates expression monitoring, several challenges arise when dealing with gene expression datasets, among them the enormous number of genes, the dimensionality of the data, and the change of data over time. Clustering can identify groups of genes that are biologically interlinked. This project aims to clarify the steps needed to apply clustering analysis to the genes in a published dataset. The methodology includes the selection of the dataset representation, the gene datasets, the similarity matrix, the clustering algorithm, and the analysis tool. The R language is used as the analysis tool, with a focus on the kmeans and hclust functions and the fpc and heatmap3 packages. Different clustering algorithms are applied to the Spellman dataset to illustrate how genes are grouped into clusters, which helps in understanding genetic behavior.
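
The steps named in the abstract (data representation, similarity matrix, clustering algorithm) can be illustrated with a minimal sketch. This is not the project's code: the project uses R, while the sketch below uses plain Python for illustration. It assumes correlation distance (1 - Pearson r) as the similarity measure and average-linkage agglomerative clustering as the algorithm. The gene names and expression values are hypothetical toy data, not taken from the Spellman dataset.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def distance_matrix(profiles):
    """Correlation distance (1 - r) for every pair of genes."""
    dist = {}
    for g, h in combinations(list(profiles), 2):
        d = 1.0 - pearson(profiles[g], profiles[h])
        dist[(g, h)] = dist[(h, g)] = d
    return dist


def average_linkage(profiles, k):
    """Merge the two closest clusters until only k clusters remain."""
    dist = distance_matrix(profiles)
    clusters = [[g] for g in profiles]

    def linkage(pair):
        # Average pairwise distance between the members of two clusters.
        a, b = pair
        return mean(dist[(g, h)] for g in a for h in b)

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: each gene maps to its expression over four time points.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.2, 6.1, 3.8, 2.0],
}

clusters = average_linkage(toy, k=2)
print(clusters)
```

With this toy input, the two rising profiles and the two falling profiles end up in separate clusters. Correlation distance groups genes by the shape of their profile rather than by absolute expression level, which is one common choice for time-course data. Other distance measures and linkage rules would be equally valid starting points.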

Identifier	oai:union.ndltd.org:csusb.edu/oai:scholarworks.lib.csusb.edu:etd-1293
Date	01 December 2015
Creators	Abualhamayl, Abdullah Jameel, Mr.
Publisher	CSUSB ScholarWorks
Source Sets	California State University San Bernardino
Detected Language	English
Type	text
Format	application/pdf
Source	Electronic Theses, Projects, and Dissertations
