Optimization Techniques for Protein-Protein Co-Regulation and Interaction Prediction

The availability of large gene expression microarray data sets poses many challenges for biological data mining. Many different clustering methods have been proposed and widely used to analyze gene expression data. The underlying concept makes it possible to identify sets of genes that share similar expression patterns across subsets of samples, and its usefulness has been demonstrated for different organisms and data sets. Several biclustering methods currently exist, each using a different technique; however, it is not clear how to compare the resulting biclusters with respect to biological relevance, and no guidelines are yet available for choosing among the existing biclustering techniques. In this work, we propose two new Mean Squared Residue (MSR) based biclustering methods. The first method is a dual biclustering algorithm that finds a set of biclusters using a greedy approach. The second method combines the dual biclustering algorithm with quadratic programming: the dual biclustering algorithm reduces the size of the matrix so that the quadratic program can find an optimal bicluster reasonably fast. We also describe the comparison method and explain how we handle overlap between biclusters and how we treat missing data.
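For readers unfamiliar with the score named in the abstract, the following is a minimal sketch of the Mean Squared Residue as it is commonly defined in the biclustering literature (the Cheng and Church formulation), together with a simple greedy row-deletion loop. It is not the dual biclustering algorithm from the dissertation, whose details are not given in this record. The function names, the `expression` matrix, and the `delta` threshold are all hypothetical and chosen only for illustration.

```python
def mean_squared_residue(matrix, rows, cols):
    """MSR of the submatrix picked out by `rows` and `cols`.

    Residue of a cell: a_ij - (row mean) - (column mean) + (submatrix mean).
    The MSR is the mean of the squared residues; 0 means a perfectly
    additive (coherent) pattern.
    """
    n_r, n_c = len(rows), len(cols)
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_r for j in cols}
    total_mean = sum(row_mean[i] for i in rows) / n_r
    residue_sq = 0.0
    for i in rows:
        for j in cols:
            r = matrix[i][j] - row_mean[i] - col_mean[j] + total_mean
            residue_sq += r * r
    return residue_sq / (n_r * n_c)


def greedy_row_deletion(matrix, rows, cols, delta):
    """Illustrative greedy step (an assumption, not the dissertation's method):
    repeatedly drop the row whose removal lowers the MSR the most,
    until the MSR is at most `delta` or only two rows remain."""
    rows = list(rows)
    while len(rows) > 2 and mean_squared_residue(matrix, rows, cols) > delta:
        worst = min(
            rows,
            key=lambda r: mean_squared_residue(
                matrix, [x for x in rows if x != r], cols
            ),
        )
        rows.remove(worst)
    return rows


# Hypothetical toy data: rows are genes, columns are samples.
# Rows 0-2 follow an additive pattern; row 3 does not.
expression = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
    [9.0, 0.0, 4.0],
]

all_cols = [0, 1, 2]
kept_rows = greedy_row_deletion(expression, [0, 1, 2, 3], all_cols, delta=1e-9)
print(kept_rows)
```

On this toy matrix the loop discards the incoherent row and keeps the three rows that differ only by a constant shift, which is the kind of pattern a low MSR rewards.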

Mean Squared Residue

Biclustering

Computer Sciences

Identifier	oai:union.ndltd.org:GEORGIA/oai:digitalarchive.gsu.edu:cs_diss-1042
Date	01 December 2009
Creators	Gremalschi, Stefan
Publisher	Digital Archive @ GSU
Source Sets	Georgia State University
Detected Language	English
Type	text
Format	application/pdf
Source	Computer Science Dissertations
