Dynamic Domains in the Spike Protein of SARS-CoV-2 – Innovita Research

For many people, COVID-19 has highlighted the role of science in fighting disease around the world. From a young scientist's perspective, the thrill of working out something that no one has solved before, and of producing answers that society urgently needs, is an unmatched experience.

Creative rendition of SARS-CoV-2 virus particles
Credits: National Institute of Allergy and Infectious Diseases, NIH

“Researching biomedical systems can be very complex and is always exciting. But knowing that your results may directly impact a major world issue makes this work even more thrilling and immensely gratifying,” said Genevieve Kunkel, a graduate student at the Tarakanova Lab at the University of Connecticut (UConn).

“I'm interested in using traditional, fundamental principles from physics and engineering to understand diseases.”

In summer 2020, Kunkel and her colleague Mohammed Madani, another graduate student in the lab, started gathering information on the SARS-CoV-2 virus under the direction of Professor Anna Tarakanova. Like everyone worldwide, they wanted to understand the virus as new variants emerged – Alpha, Delta, Omicron.

The graduate students are co-authors of a recent paper in the Biophysical Journal. The other co-authors are Simon J. White, Paulo H. Verardi, and Anna Tarakanova. The research required a lot of supercomputing power, which the team obtained through the NSF-funded Extreme Science and Engineering Discovery Environment (XSEDE).

“We were trying to resolve the protein's dynamics in a fast, targeted way – gaining a better understanding of spike protein mutations that can be applied to more aggressive variants,” Kunkel said. “That's how we arrived at the process of ‘normal mode analysis,’ which is the method we used to resolve protein dynamics, allowing us to identify dynamic domains – regions key to spike protein function. We also looked at resolving thermal stability and protein longevity. If we learn how to control these factors, we can provide insights for future vaccine design.”

“To evaluate the thermal stability of spike protein mutations in a fast and accurate manner, we also built a machine learning-based tool using the XSEDE-allocated Stampede2 supercomputer to train our model,” Madani said. “Access to the supercomputer was critical for our work. These types of simulations would not be possible on our local machines.”

Why it's important

The findings in this study allow the researchers to make recommendations for designing future SARS-CoV-2 spike protein variants as effective immunogens that trigger neutralizing antibodies to hinder virus activity. The integrated computational approach they used can be applied to optimize vaccine design and predict antibody responses to SARS-CoV-2 variants.

The researchers studied key regions associated with specific dynamic mechanisms, such as the movement of the receptor-binding domain (RBD). The RBD is the part of the spike protein that docks to receptors on human cells, allowing the virus to enter those cells and initiate infection.

“The mechanisms we saw from the combined research of the project offered some insights into what types of mutations may be able to stabilize or destabilize certain regions of the spike protein to alter RBD motions so antibodies can recognize it,” Kunkel said. “This is important for identifying disease mechanisms in later variants or vaccine design.”

In addition, the methods used in this study – a combination of normal mode analysis (an approach to extract the most biologically relevant motions experienced by the molecules) and dynamic domain analysis – can examine large numbers of different variants at one time.

“This was key to the research,” Kunkel said. “It's useful for the continual development of treatments because it allows researchers to quickly resolve and compare the different movements of many different spike protein variants, which is more essential now that we're dealing with many variants worldwide.”
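
To make the idea of normal mode analysis concrete, here is a minimal Python sketch of one common coarse-grained form of it, the anisotropic (elastic) network model. It is an illustration under stated assumptions, not the method or code used in the study: the coordinates are random toy points rather than a spike protein structure, and the function names, the 15 Å cutoff and the uniform spring constant are all hypothetical choices.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3N x 3N Hessian of an anisotropic network model.

    Every pair of points closer than `cutoff` is joined by a spring of
    stiffness `gamma` (both values are illustrative assumptions).
    """
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue  # no spring between distant points
            block = -gamma * np.outer(d, d) / dist2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hessian[si, sj] = block
            hessian[sj, si] = block
            # Diagonal blocks are minus the sum of the off-diagonal blocks.
            hessian[si, si] -= block
            hessian[sj, sj] -= block
    return hessian

def normal_modes(coords, n_modes=3, cutoff=15.0):
    """Return the n_modes lowest-frequency non-trivial modes."""
    eigvals, eigvecs = np.linalg.eigh(anm_hessian(coords, cutoff))
    # The six lowest modes are rigid-body translations and rotations
    # (eigenvalue ~ 0), so they are skipped.
    keep = slice(6, 6 + n_modes)
    modes = eigvecs[:, keep].T.reshape(n_modes, -1, 3)
    return eigvals[keep], modes

# Toy input: 30 random points standing in for residue positions (in Å).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 20.0, size=(30, 3))

eigvals, modes = normal_modes(coords)
# Displacement magnitude of each point in the slowest mode.
mobility = np.linalg.norm(modes[0], axis=1)
```

The lowest-frequency non-trivial modes describe the largest collective motions. Groups of residues that move together in those modes are natural candidates for dynamic domains, although identifying them would need a separate clustering step that this sketch omits.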

How supercomputing helped

For this study, the researchers used XSEDE allocations on TACC's Stampede2 supercomputer and the center's Ranch data storage system.

“When you're looking at 10-20 proteins or more, it's better to employ a supercomputer to accelerate the simulations,” Kunkel said. “Another component of the work, the thermal stability predictor, was developed using Stampede2 – this is a machine-learning predictor. We needed many core hours of computational power to train the model.”

The Ranch storage system was used to archive each protein they studied.

Anna Tarakanova, their professor, said: “I started using XSEDE about ten years ago as a graduate student at MIT. I've been using XSEDE continuously, first on my own, then with my students. Most of the work I do wouldn't be possible without XSEDE. It's been a hugely useful resource.”

Some of the key results from the study are as follows:

  • By comparing the dynamic signatures of different spike proteins, the researchers unearthed key differences between spike protein variants (both naturally occurring and engineered proteins used in vaccine design research). In identifying mutational effects on key functional regions, they began to understand how this research could be used to help create customized spike proteins for future immunogen design.
  • The researchers developed a comprehensive antigen map of the human body's immune response, linking how dynamic signatures of different spike protein variants coincide with key antibody binding regions. This will help scientists understand how effective a protein variant may be at neutralizing the virus. Up until this body of research, there were no resources with comprehensive antigenic binding information parsed out based on protein dynamics.

Tarakanova concluded: “The computational methods we used are transferable and can be applied more broadly – not just to SARS-CoV-2, but to any other types of viruses that may arise.”

Source: TACC