1 |
GAGS : A Novel Microarray Gene Selection Algorithm for Gene Expression ClassificationWu, Kuo-yi 30 July 2010 (has links)
In this thesis, we have proposed a novel microarray gene selection algorithm consisting of five processes for solving gene expression classification problem. A normalization process is first used to remove the differences among different scales of genes. Second, an efficient gene ranking process is proposed to filter out the unrelated genes. Then, the genetic algorithm is adopted to find the informative gene subsets for each class. For each class, these informative gene subsets are adopted to classify the testing dataset separately. Finally, the separated classification results are fused to one final classification result.
In the first experiment, 4 microarray datasets are used to verify the performance of the proposed algorithm. The experiment is conducted using the leave-one-out-cross-validation (LOOCV) resampling method. We compared the proposed algorithm with twenty one existing methods. The proposed algorithm obtains three wins in four datasets, and the accuracies of three datasets all reach 100%. In the second experiment, 9 microarray datasets are used to verify the proposed algorithm. The experiment is conducted using 50% VS 50% resampling method. Our proposed algorithm obtains eight wins among nine datasets for all competing methods.
|
2 |
A Neuro-Fuzzy Approach for Functional Genomics Data Interpretation and AnalysisNeagu, Daniel, Palade, V. January 2003 (has links)
No
|
3 |
Geometric algorithms for component analysis with a view to gene expression data analysisJournée, Michel 04 June 2009 (has links)
The research reported in this thesis addresses the problem of component analysis, which aims at reducing large data to lower dimensions, to reveal the essential structure of the data. This problem is encountered in almost all areas of science - from physics and biology to finance, economics and psychometrics - where large data sets need to be analyzed.
Several paradigms for component analysis are considered, e.g., principal component analysis, independent component analysis and sparse principal component analysis, which are naturally formulated as an optimization problem subject to constraints that endow the problem with a well-characterized matrix manifold structure. Component analysis is so cast in the realm of optimization on matrix manifolds. Algorithms for component analysis are subsequently derived that take advantage of the geometrical structure of the problem.
When formalizing component analysis into an optimization framework, three main classes of problems are encountered, for which methods are proposed. We first consider the problem of optimizing a smooth function on the set of n-by-p real matrices with orthonormal columns. Then, a method is proposed to maximize a convex function on a compact manifold, which generalizes to this context the well-known power method that computes the dominant eigenvector of a matrix. Finally, we address the issue of solving problems defined in terms of large positive semidefinite matrices in a numerically efficient manner by using low-rank approximations of such matrices.
The efficiency of the proposed algorithms for component analysis is evaluated on the analysis of gene expression data related to breast cancer, which encode the expression levels of thousands of genes gained from experiments on hundreds of cancerous cells. Such data provide a snapshot of the biological processes that occur in tumor cells and offer huge opportunities for an improved understanding of cancer. Thanks to an original framework to evaluate the biological significance of a set of components, well-known but also novel knowledge is inferred about the biological processes that underlie breast cancer.
Hence, to summarize the thesis in one sentence: We adopt a geometric point of view to propose optimization algorithms performing component analysis, which, applied on large gene expression data, enable to reveal novel biological knowledge.
|
Page generated in 0.1094 seconds