Advances in Integrative Modeling for Proteins: Protein Loop Structure Prediction and NMR Chemical Shift Prediction

This thesis encompasses two studies on the application of computational techniques, including deep learning and physics-based methods, in the exploration of protein structure and dynamics.

Chapter 1 introduces the background. Chapter 2 describes the development of a deep learning method for protein loop structure modeling and refinement. The method is both fast and accurate: it integrates a protein language model, a graph neural network, and attention-based modules to predict all-atom protein loop structures from sequence. Its accuracy was validated on the CASP14 and CAMEO benchmark datasets, where it performed comparably to or better than the state-of-the-art method, AlphaFold2.

The model’s robustness on loop structures outside the training set was confirmed by testing on datasets from which high-identity templates and training set homologs had been removed. Moreover, it had significantly lower computational costs than existing methods. Real-world applications of the method included predicting antibody complementarity-determining region (CDR) loop structures and refining loop structures in inexact side-chain environments. The method achieved sub-angstrom or near-angstrom accuracy for most CDR loops and notably improved the quality of many suboptimal loop predictions in inexact environments, marking an advancement in protein loop structure prediction and its practical applications.

Chapter 3 presents a collaborative study that employs nuclear magnetic resonance (NMR) experiments, molecular dynamics (MD), and hybrid quantum mechanics/molecular mechanics (QM/MM) calculations to investigate protein conformational dynamics across varying temperatures. NMR chemical shifts provide a sensitive probe of protein structure and dynamics. Predicting shifts, and therefore interpreting them, remains a significant challenge, particularly for the frequently measured amide ¹⁵N sites.

We demonstrate that protein ¹⁵N chemical shift prediction from QM/MM calculations can be improved if conformational variation is included via MD sampling, focusing on the antibiotic target E. coli dihydrofolate reductase (DHFR). Variations of up to 25 ppm in predicted ¹⁵N chemical shifts are observed over the trajectory. For solution shifts, averaging over fluctuations on the low picosecond timescale yields a better prediction than a single optimal conformation. For low-temperature solid-state measurements, the histogram of predicted shifts for locally minimized snapshots with specific solvent arrangements sampled from the trajectory explains the heterogeneous linewidths; in other words, the conformations and associated solvent are ‘frozen out’ at low temperatures and result in inhomogeneously broadened NMR peaks. We identified conformational degrees of freedom that contribute to chemical shift variation. Backbone torsion angles show high-amplitude fluctuations during the trajectory on the low picosecond timescale.
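
The two uses of snapshot-level predictions described above (a trajectory average for solution shifts, a histogram for frozen-out solid-state line shapes) can be sketched as follows. This is a minimal illustration, not the thesis workflow: the function names, the 1 ppm bin width, and the demonstration values are assumptions, and the demonstration shifts are synthetic random numbers rather than QM/MM results.

```python
import numpy as np

def solution_shift(snapshot_shifts):
    """Fast-averaging estimate: mean of per-snapshot predicted shifts (ppm)."""
    return float(np.mean(np.asarray(snapshot_shifts, dtype=float)))

def frozen_shift_histogram(snapshot_shifts, bin_width_ppm=1.0):
    """Histogram of per-snapshot shifts (ppm).

    Each snapshot is treated as a separately 'frozen' conformer, so the
    spread of the histogram stands in for inhomogeneous broadening.
    """
    shifts = np.asarray(snapshot_shifts, dtype=float)
    lo = np.floor(shifts.min())
    hi = np.ceil(shifts.max())
    if hi == lo:
        # All values fall on one integer: widen to a single bin.
        hi = lo + bin_width_ppm
    edges = np.arange(lo, hi + bin_width_ppm, bin_width_ppm)
    counts, edges = np.histogram(shifts, bins=edges)
    return counts, edges

# Synthetic, made-up shifts standing in for per-snapshot predictions.
rng = np.random.default_rng(0)
demo_shifts = 120.0 + 5.0 * rng.standard_normal(200)

mean_ppm = solution_shift(demo_shifts)
counts, edges = frozen_shift_histogram(demo_shifts)
print(f"averaged shift: {mean_ppm:.2f} ppm over {len(demo_shifts)} snapshots")
print(f"histogram spans {edges[0]:.0f} to {edges[-1]:.0f} ppm")
```

In this sketch the same set of per-snapshot values feeds both quantities; only the reduction differs, which mirrors the distinction drawn between room-temperature and low-temperature spectra.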

For a number of residues, including I60, ψ varies by up to 60° within a conformational basin during the MD simulations, despite the fact that I60 (and other sites studied) are in a secondary structure element and remain well folded during the trajectory. Fluctuations in ψ appear to be compensated by other degrees of freedom in the protein, including φ of the succeeding residue, resulting in “rocking” of the amide plane with changes in hydrogen bonding interactions. Good agreement for both room-temperature and low-temperature NMR spectra provides strong support for the specific approach to conformational averaging of computed chemical shifts.
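
For readers unfamiliar with the backbone torsions mentioned above, the following sketch computes a dihedral angle from four atomic positions. By the standard definition, ψ of residue i uses the atoms N(i), Cα(i), C(i), N(i+1), and φ uses C(i-1), N(i), Cα(i), C(i). The function name and the example coordinates are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees, in (-180, 180], defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1          # bond pointing back from the central bond
    b1 = p2 - p1          # central bond (rotation axis)
    b2 = p3 - p2          # bond pointing forward from the central bond
    b1 = b1 / np.linalg.norm(b1)
    # Project the outer bonds onto the plane perpendicular to the axis.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

# Hypothetical coordinates (angstroms) chosen to give a planar trans geometry.
trans = dihedral_deg([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, -1, 0])
print(f"|dihedral| = {abs(trans):.1f} degrees")
```

Applied per frame to the four atoms that define ψ, a function like this would give the time series whose spread within a basin is described above.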

Identifier oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/61m9-we15
Date January 2024
Creators Zhang, Lichirui
Source Sets Columbia University
Language English
Detected Language English
Type Theses
