
Machine Learning Approaches Towards Protein Structure and Function Prediction

<div>
<div>
<div>
<p>Proteins drive almost all biological processes in the cell. The function of a protein
depends on its three-dimensional structure, and elucidating the structure and function of
proteins is key to understanding how a biological system operates. In this research, we developed
computational methods that use machine learning techniques to predict the structure and function
of proteins. Protein 3D structure prediction has advanced significantly in recent years, largely due
to deep learning approaches that predict inter-residue contacts and, more recently, distances using
multiple sequence alignments (MSAs). The performance of these models depends on the number
of protein sequences similar to the query protein. In some cases similar sequences are few,
but dissimilar sequences with local similarities are more numerous and can be helpful. We developed a
novel deep learning-based approach, AttentiveDist, which improves over the previous state
of the art. We added an attention mechanism in which dissimilar sequences are also used (increasing
the number of sequences) and the model itself determines which information from such sequences it
should attend to. We showed that the improvement in distance prediction was successfully
transferred to achieve better protein tertiary structure modeling. We also showed that structure
prediction from a predicted distance map can be further enhanced by using predicted inter-residue
side-chain center distances and main-chain hydrogen bonds. Protein function prediction is another
avenue we explored, where the goal is to predict the function that a protein performs. The crux of
the approach is to predict the function of a protein based on the functions of similar sequences. Here,
we developed a method that uses dissimilar sequences to extract additional information and
improve performance over previous approaches. We used phylogenetic analysis to determine
whether a dissimilar sequence is nonetheless close to the query sequence and can thus provide functional
information. Our method ranked highly in the worldwide protein function prediction competition CAFA3 (2016-2019). Further, we extended the method with a neural network to predict protein
toxicity that can be used as a safety check for human-designed protein sequences.</p></div></div></div>
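
The attention idea described in the abstract, where the model decides how much to rely on information derived from each set of sequences, can be illustrated with a minimal sketch. This is not the AttentiveDist implementation: the function names `softmax` and `attend`, the use of one feature vector per alignment, the scaled dot-product scoring, and the `query` vector standing in for a learned parameter are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, query):
    """Combine feature vectors from several alignments using attention.

    features: list of K vectors of length D, one per alignment (hypothetical)
    query:    vector of length D, standing in for a learned parameter
    Returns the K attention weights and the weighted-sum vector of length D.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # Scaled dot-product score between each feature vector and the query.
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) / scale
              for f in features]
    weights = softmax(scores)
    # Weighted sum of the feature vectors, one output value per dimension.
    combined = [sum(w * f[d] for w, f in zip(weights, features))
                for d in range(dim)]
    return weights, combined

# Toy input: three alignments, two-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, combined = attend(features, query)
print(weights, combined)
```

In this toy input the first and third vectors align equally well with the query, so they receive equal weight, and the second receives less. In a trained network the scores would come from learned layers rather than a fixed vector.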

  1. 10.25394/pgs.14744343.v1
Identifier: oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/14744343
Date: 04 August 2021
Creators: Aashish Jain (10933737)
Source Sets: Purdue University
Detected Language: English
Type: Text, Thesis
Rights: CC BY 4.0
Relation: https://figshare.com/articles/thesis/Machine_Learning_Approaches_Towards_Protein_Structure_and_Function_Prediction/14744343
