Clustering techniques are important for gene expression data analysis. However, efficient computational algorithms for clustering time-series data are still lacking. This work documents two improvements to an existing profile-based greedy algorithm for short time-series data. The first is a scaling method applied when pre-processing the raw data, which handles some extreme cases; the second is a modified strategy that generates better clusters. Simulation data and real microarray data were used to evaluate these improvements; with them, the algorithm efficiently generates more accurate clusters. A new feature-based algorithm was also developed, in which five features (steady-state value, overshoot, rise time, settling time, and peak time) are generated by a second-order control system and used for clustering. This feature-based approach is much faster and more accurate than the existing profile-based algorithm for long time-series data.
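To make the five step-response features named in the abstract concrete, the sketch below computes them from a sampled profile. It is an illustration under stated assumptions, not the thesis's implementation: the function name `step_features`, the 2% settling band, the 10%-90% rise-time convention, and the use of the last few samples as the steady-state estimate are all choices made here, and the profile is assumed to start near zero and settle at a positive value. The test signal is the textbook underdamped second-order step response with damping ratio 0.5 and natural frequency 1.

```python
import numpy as np

def step_features(t, y, band=0.02, tail=5):
    """Return five step-response features of a sampled profile y(t).

    Assumptions (illustrative, not from the thesis): y starts near 0,
    settles at a positive value, and the last `tail` samples are settled.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Steady-state value: mean of the last few samples.
    y_ss = y[-tail:].mean()

    # Peak time and percent overshoot relative to the steady state.
    k_peak = int(np.argmax(y))
    peak_time = t[k_peak]
    overshoot = max(0.0, (y[k_peak] - y_ss) / abs(y_ss) * 100.0)

    # Rise time: first crossing of 10% to first crossing of 90% of y_ss.
    t10 = t[np.argmax(y >= 0.1 * y_ss)]
    t90 = t[np.argmax(y >= 0.9 * y_ss)]
    rise_time = t90 - t10

    # Settling time: first sample after the last excursion outside the band.
    outside = np.nonzero(np.abs(y - y_ss) > band * abs(y_ss))[0]
    if outside.size == 0:
        settling_time = t[0]
    else:
        settling_time = t[min(outside[-1] + 1, len(t) - 1)]

    return {
        "steady_state": y_ss,
        "overshoot": overshoot,
        "rise_time": rise_time,
        "settling_time": settling_time,
        "peak_time": peak_time,
    }

# Test profile: unit-step response of an underdamped second-order system.
zeta, wn = 0.5, 1.0
wd = wn * np.sqrt(1.0 - zeta**2)
t = np.linspace(0.0, 30.0, 3001)
y = 1.0 - np.exp(-zeta * wn * t) / np.sqrt(1.0 - zeta**2) * np.sin(wd * t + np.arccos(zeta))

feats = step_features(t, y)
```

Each profile then reduces to a five-element vector, so the cost of comparing two profiles no longer grows with the number of time points; this is one plausible reason a feature-based approach would scale better on long series.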
Identifier | oai:union.ndltd.org:unt.edu/info:ark/67531/metadc177269 |
Date | 08 1900 |
Creators | Zhang, Guilin |
Contributors | Dong, Qunfeng, Wan, Yan, Gao, Xiang |
Publisher | University of North Texas |
Source Sets | University of North Texas |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |
Format | Text |
Rights | Public, Zhang, Guilin, Copyright, Copyright is held by the author, unless otherwise noted. All rights Reserved. |