From cancer gene expression to protein interaction: Interaction prediction, network reasoning and applications in pancreatic cancer
Daw Elbait, Gihan Elsir Ahmed, 10 July 2009
Microarray technologies enable scientists to identify co-expressed genes at large scale. However, gene expression analysis alone does not reveal functional relationships between co-expressed genes. There is a demand for effective approaches to analysing gene expression data that enable biological discoveries leading to the identification of markers or therapeutic targets for many diseases.
In cancer research, a number of gene expression screens have been carried out to identify genes differentially expressed in cancerous tissue such as Pancreatic Ductal Adenocarcinoma (PDAC). PDAC carries a very poor prognosis: it eludes early detection and is characterised by its aggressiveness and resistance to currently available therapies. To identify molecular markers and suitable targets, there is a research effort to map differentially expressed genes to protein interactions in order to gain an understanding at the systems level. Such interaction networks have a complex, interconnected structure, and understanding them is not a trivial task.
Several formal approaches use simulation to support the investigation of such networks. These approaches suffer from missing knowledge about biological systems. Reasoning, on the other hand, has the advantage of being able to deal with incomplete and partial knowledge of the network.
The initial approach was an algorithm that takes a network-centric view of pancreatic cancer, re-constructing networks from known interactions and predicting novel protein interactions from structural templates. This method was applied to a data set of co-expressed PDAC genes. To this end, structural domains for the gene products are identified using threading, a 3D structure prediction technique. Next, the Protein Structure Interaction Database (SCOPPI), a database that classifies and annotates domain interactions derived from all known protein structures, is used to find templates of structurally interacting domains. Moreover, a network of related biological pathways for the PDAC data was constructed.
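The template-matching step can be illustrated with a minimal sketch. It assumes each gene product has already been assigned one or more domain families (for example by threading) and that a set of family pairs known to interact structurally is available. The protein names, family labels and the function `predict_interactions` are hypothetical placeholders, not actual SCOPPI entries or the thesis implementation.

```python
from itertools import combinations

# Hypothetical toy input: domain families assigned to each gene product.
protein_domains = {
    "P1": {"famA"},
    "P2": {"famB"},
    "P3": {"famC"},
}

# Hypothetical templates: unordered pairs of domain families that are
# observed to interact in known structures.
template_pairs = {frozenset({"famA", "famB"})}


def predict_interactions(protein_domains, template_pairs):
    """Return protein pairs that share at least one interacting domain template."""
    predicted = set()
    for p, q in combinations(sorted(protein_domains), 2):
        for dp in protein_domains[p]:
            for dq in protein_domains[q]:
                if frozenset({dp, dq}) in template_pairs:
                    predicted.add((p, q))
    return predicted


print(predict_interactions(protein_domains, template_pairs))
```

In this toy setting only P1 and P2 are predicted to interact, because they are the only pair whose domain families match a template.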
In order to reason over molecular networks that are affected by dysregulation of gene expression, BioRevise was implemented. It is a belief revision system where the inhibition behaviour of reactions is modelled using extended logic programming. The system computes a minimal set of enzymes whose malfunction explains the abnormal expression levels of observed metabolites or enzymes.
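The idea of computing a minimal set of malfunctioning enzymes can be sketched as follows. This is a brute-force search over a toy pathway, not the belief revision or extended logic programming machinery that BioRevise uses. The pathway, the enzyme and metabolite names, and the functions `reachable` and `minimal_explanations` are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy pathway: (enzyme, substrates, products).
REACTIONS = [
    ("E1", {"A"}, {"B"}),
    ("E2", {"B"}, {"C"}),
    ("E3", {"A"}, {"D"}),
]


def reachable(reactions, sources, inhibited):
    """Metabolites that can be produced when the given enzymes are inhibited."""
    produced = set(sources)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, products in reactions:
            if enzyme in inhibited:
                continue  # an inhibited enzyme catalyses nothing
            if substrates <= produced and not products <= produced:
                produced |= products
                changed = True
    return produced


def minimal_explanations(reactions, sources, observed_present, observed_absent):
    """Smallest enzyme sets whose inhibition is consistent with the observations."""
    enzymes = sorted({enzyme for enzyme, _, _ in reactions})
    for size in range(len(enzymes) + 1):
        hits = []
        for subset in combinations(enzymes, size):
            produced = reachable(reactions, sources, set(subset))
            if observed_present <= produced and not (observed_absent & produced):
                hits.append(set(subset))
        if hits:
            return hits  # stop at the first (smallest) size that explains the data
    return []


print(minimal_explanations(REACTIONS, {"A"}, {"B", "D"}, {"C"}))
```

Here the observation that B and D are present while C is missing is explained by inhibiting E2 alone. The search returns minimum-cardinality explanations only; a logic-based system can express richer notions of minimality and can handle partial knowledge that this enumeration ignores.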
As a result of this research, two complementary approaches for the analysis of pancreatic cancer gene expression data are presented. Using the first approach, the pathways found to be largely affected in pancreatic cancer are signal transduction, actin cytoskeleton regulation, cell growth and cell communication. The analysis indicates that alteration of the calcium pathway plays an important role in pancreas-specific tumorigenesis. Furthermore, the structural prediction method reveals ~700 potential protein-protein interactions from the PDAC microarray data, among them 81 novel interactions, such as the serine/threonine kinase CDC2L1 interacting with the cyclin-dependent kinase inhibitor CDKN3, and the tissue factor pathway inhibitor 2 (TFPI2) interacting with the transmembrane protease serine 4 (TMPRSS4). These resulting genes were further investigated and some were found to be potential therapeutic markers for PDAC. Since TMPRSS4 is involved in metastasis formation, it is hypothesised that the upregulation of TMPRSS4 and the downregulation of its predicted inhibitor TFPI2 play an important role in this process. The predicted protein-protein interaction network inspired the analysis of the data from two other perspectives: it highlighted the importance of the co-expression of KLK6 and KLK10 as prognostic factors for survival in PDAC, and it led to the construction of a PDAC-specific apoptosis pathway to study the effects of multiple gene silencing in order to reactivate apoptosis in PDAC.
Using the second approach, the behaviour of biological interaction networks was modelled using a computational logic formalism, which enables reasoning over the networks and explains the abnormal behaviour of their components. The usability of the BioRevise system is demonstrated through two examples: a metabolic disorder and a deficiency in a pancreatic cancer associated pathway. The system successfully identified the inhibition of the enzyme glucose-6-phosphatase as responsible for glycogen storage disease type I, which according to the literature is the main cause of this disease. Furthermore, BioRevise was used to model reaction inhibition in the glycolysis pathway, which is known to be affected by pancreatic cancer.