Using Machine Learning to Develop Personalized Vaccines for Cancer

Yale researchers have developed a machine learning model, called Immunostruct, that can help scientists create more personalized vaccines, including cancer vaccines. They described the tool in Nature Machine Intelligence along with findings from applying it to cancer and immunology data.

Blood samples – illustrative photo. Image credit: Karolina Kaboompics via Pexels, free license

When a potential threat, such as a virus or tumor, arises in our body, our immune cells recognize peptides—essentially short proteins—on the surface of the invader and mount a defensive response. This small region that the immune system interacts with is known as an epitope.

Epitope-based vaccines are an emerging technology. They contain specific peptides chosen to trigger immune responses that precisely target particular diseases. Ongoing studies suggest that these vaccines are a promising immunotherapy for a range of cancers, including melanomas, breast cancers, and glioblastomas. Researchers are also investigating whether these vaccines could more effectively combat new variants of infectious diseases.

To develop these vaccines, scientists can use models that help them predict which peptides are most likely to trigger a strong immune response to a particular antigen. A limitation of many of these models, the researchers say, is that they treat peptides as one-dimensional sequences of amino acids, not as the three-dimensional, active structures that they are.

Now, Yale researchers have created a model that also incorporates structural and biochemical properties of peptides. In the new study, they show that the multimodal model is more effective at identifying peptide candidates than its predecessors.

“Cancer is extremely heterogeneous—which often makes it very hard to treat effectively,” says Kevin B. Givechian, PhD, an MD-PhD student at Yale and co-first author on the study. “We have built a deep-learning model that integrates more information than had previously been combined to help us improve the identification of vaccine targets that stimulate people’s immune system against their own tumor. Doing so would enable a more effective and less toxic method of treatment.”

The Immunostruct model is available open source via GitHub.

More complete peptide data boosts model’s performance

Previous prediction models encode amino acid sequences as text, explains Chen Liu, a PhD candidate in computer science and co-first author. “But then they disregarded all this rich information in the 3D space,” he says. “We tried to encode structural and biochemical properties to bring in more information that was previously overlooked.”

In the study, the researchers jointly trained Immunostruct on amino acid information, structure data, and biochemical properties. They found that each of these components synergistically improved the model’s performance.
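
As a rough illustration of what combining these three kinds of information can look like, here is a minimal Python sketch. It is not the Immunostruct code or architecture. The function names, the choice of descriptors, and the made-up coordinates are all assumptions for illustration. The hydropathy values are the standard Kyte-Doolittle scale, and SIINFEKL is a commonly used model peptide.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# Kyte-Doolittle hydropathy scale (a standard published scale).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_features(peptide):
    """One-hot encode each residue: the 'one-dimensional' view of a peptide."""
    vec = []
    for aa in peptide:
        vec.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return vec


def biochemical_features(peptide):
    """Two toy descriptors: mean hydropathy and a crude net charge."""
    mean_h = sum(HYDROPATHY[aa] for aa in peptide) / len(peptide)
    charge = sum(aa in "KR" for aa in peptide) - sum(aa in "DE" for aa in peptide)
    return [mean_h, float(charge)]


def structure_features(coords):
    """One toy 3D descriptor: distance between the first and last residue."""
    return [math.dist(coords[0], coords[-1])]


def fuse(peptide, coords):
    """Hypothetical fusion step: concatenate the three feature groups."""
    return (
        sequence_features(peptide)
        + biochemical_features(peptide)
        + structure_features(coords)
    )


peptide = "SIINFEKL"
# Made-up straight-line coordinates, one point per residue (not real structure data).
coords = [(3.8 * i, 0.0, 0.0) for i in range(len(peptide))]
features = fuse(peptide, coords)
print(len(features))  # 160 sequence values + 2 biochemical + 1 structural = 163
```

A real multimodal model would learn a representation for each input type rather than concatenate hand-picked numbers, but the sketch shows how a sequence-only encoding leaves out information that the other two groups supply.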

“Integrating all of this information is useful and important for understanding immunogenicity, or the ability to provoke an immune response,” says co-senior author Smita Krishnaswamy, PhD, associate professor of genetics at Yale School of Medicine (YSM) and of computer science at Yale Engineering and member of Yale Cancer Center.

“The model could help scientists tailor patient-specific therapeutics by finding the right epitope for the patients more easily and accurately,” says co-senior author Akiko Iwasaki, PhD, Sterling Professor of Immunobiology at YSM and member of Yale Cancer Center. While chemotherapy, for example, can quickly kill rapidly dividing cancer cells, it also harms healthy cells. The ability to identify epitopes associated with a patient’s unique pathology could lead to therapeutics that directly target their disease.

The researchers recently licensed their model to Latent-Alpha, a spinout company from Yale. “We wanted to get our model into people’s hands and to be used for vaccine design,” says Krishnaswamy.

Source: Yale University