Unravelling biological processes using graph theoretical algorithms and probabilistic models

This thesis develops computational methods that provide insight into the behaviour of biomolecular processes. The methods extract a simplified representation, or model, from samples characterising the profiles of different biomolecular functional units. This simplified representation helps us better understand the relations between the functional units or between the samples. The proposed methods integrate graph theoretical algorithms and probabilistic models. Firstly, we were interested in finding proteins that have a similar role in the transcription cycle. We performed a clustering analysis on an experimental dataset using a graph partitioning algorithm and found groups of proteins associated with different stages of the transcription cycle. Furthermore, we estimated a network model describing the relations between the clusters and identified proteins that are representative of a cluster or of the relation between two clusters. Secondly, we proposed a computational framework that unravels the structure of a biological process from high-dimensional samples characterising different stages of the process. The framework integrates a feature selection procedure and a feature extraction algorithm in order to extract a low-dimensional projection of the high-dimensional samples. We analysed two microarray datasets characterising different cell types of the blood system and found that the extracted representations capture the structure of the hematopoietic stem cell differentiation process. Furthermore, we showed that the low-dimensional projections can be used as a basis for the analysis of gene expression patterns. Finally, we introduced the geometric hidden Markov model (GHMM), a probabilistic model for multivariate time series data. The GHMM assumes that the time series lie on a noisy low-dimensional manifold and infers a dynamical model that reflects the low-dimensional geometry. We analysed multivariate time series data generated with a stochastic model of a biomolecular circuit and showed that the estimated GHMM captures the oscillatory behaviour of the circuit.

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:705768
Date January 2014
Creators Vangelov, Borislav
Contributors Barahona, Mauricio
Publisher Imperial College London
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
Source http://hdl.handle.net/10044/1/44521
