Clustering approaches for extracting structural determinants of enzyme active sites

The study of enzyme binding sites is essential but demanding, because the amino acids lining these sites are not rigid. At the same time, minimizing side effects and ensuring the specificity of new ligands remain major challenges in structure-based drug design. Using glycogen phosphorylase, a validated target for the development of new antidiabetic agents, as a case study, this project examines the side-chain conformations of amino acids that play a key role in the catalytic site of the enzyme. Specifically, different rotamers of each amino acid were collected to build a dataset of conformations of the catalytic site. The rotamers were first filtered by their probability of occurrence, and all rotamers that create steric clashes were then rejected. The remaining conformations were clustered based on their similarity. Three clustering algorithms and several numbers of clusters were tested, with silhouette scores used to evaluate the clustering. Similarity was measured with the Euclidean metric, which, because the coordinates correspond between conformations, was very similar to the cRMSD metric. Two-level clustering was applied to the dataset for more in-depth observations. According to the clustering results, specific amino acids with major geometrical variations in their rotamers play the most important role in the separation of the clusters. Additionally, all rotamers of an amino acid can be grouped based on their structure, which was confirmed visually using the Chimera software. The ultimate aim of this study is to examine whether clustering the conformations produces clusters whose members are geometrically similar to one another, in order to identify both near neighbors, i.e. conformations that are quite similar in structure but do not play a determining role in function, and conformations that are quite diverse and could be further exploited.
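
The relationship between the Euclidean metric and cRMSD, and the use of silhouette scores to compare clusterings, can be illustrated with a minimal sketch. It is not the thesis code. It assumes that cRMSD is computed without superposition (atoms are matched by index in a shared coordinate frame), in which case the Euclidean distance between flattened coordinate vectors equals the square root of the atom count times the cRMSD. The toy coordinates, the helper names (`flatten`, `crmsd`, `silhouette`, `jitter`), and the two hand-written labelings are all hypothetical and stand in for real rotamer data and real clustering output.

```python
import math
import random


def flatten(conformation):
    """Turn a list of (x, y, z) atom coordinates into one flat vector."""
    return [c for atom in conformation for c in atom]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def crmsd(conf_a, conf_b):
    """Coordinate RMSD without superposition; atoms correspond by index."""
    sq = sum((a - b) ** 2
             for atom_a, atom_b in zip(conf_a, conf_b)
             for a, b in zip(atom_a, atom_b))
    return math.sqrt(sq / len(conf_a))


def silhouette(points, labels):
    """Mean silhouette coefficient; expects at least two distinct labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: silhouette is taken as 0
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def jitter(base, scale=0.1):
    """Hypothetical helper: perturb every coordinate by a small random offset."""
    return [tuple(c + random.uniform(-scale, scale) for c in atom) for atom in base]


random.seed(0)
# Two made-up side-chain geometries of four atoms each (not real rotamers).
base_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (3.5, 1.4, 0.0)]
base_b = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, -1.4, 0.0), (2.0, -1.4, 1.5)]
conformations = ([jitter(base_a) for _ in range(5)]
                 + [jitter(base_b) for _ in range(5)])
vectors = [flatten(c) for c in conformations]

# Euclidean distance on flattened vectors is sqrt(n_atoms) times the cRMSD.
n_atoms = len(base_a)
d = euclidean(vectors[0], vectors[7])
r = crmsd(conformations[0], conformations[7])
ratio_ok = math.isclose(d, math.sqrt(n_atoms) * r)

# Compare a labeling that follows the geometry with one that mixes the groups.
score_good = silhouette(vectors, [0] * 5 + [1] * 5)
score_mixed = silhouette(vectors, [0, 1] * 5)
print(ratio_ok, score_good > score_mixed)
```

Under these assumptions the two metrics differ only by a constant factor, so they rank pairs of conformations identically. In practice the candidate labelings would come from the clustering algorithms under test rather than being written by hand, and the labeling with the higher mean silhouette score would be preferred.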

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-426221
Date: January 2020
Creators: Stamatelou, Ismini - Christina
Publisher: Uppsala universitet, Institutionen för biologisk grundutbildning
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
