UF researchers use AI to predict new coronavirus variants

Graphic illustration using an image of the SARS-CoV-2 virus in the background as captured with a transmission electron microscope. (Original TEM image by NIAID-RML, manipulated by Alexa Sauvagere)

A world awash in waves of novel coronavirus variants is left to react to each new emergence. Will this one be as deadly as delta? As transmissible as omicron?

Imagine if we could get ahead of the curve and predict the next one. That’s the goal of a new $3.7 million National Institutes of Health grant led by Marco Salemi, Ph.D., and Mattia Prosperi, Ph.D., University of Florida professors in the College of Medicine and College of Public Health and Health Professions.

“The coronavirus is a moving target and we have always been one step behind,” said Salemi, a professor of experimental pathology in the UF department of pathology, immunology and laboratory medicine and the Stephany W. Holloway Chair in AIDS Research. “Every time the epidemic seems to be coming under control, another variant emerges that is more virulent — not necessarily causing more severe disease, but certainly more transmissible — and it spreads again.”

The grant-funded project will use artificial intelligence, or AI, and machine learning to build an algorithm that spots new variants of concern.

Salemi is an expert in the molecular evolution of viruses, while Prosperi has expertise in applying AI to public health issues. Prosperi is a professor in the UF department of epidemiology. Simone Marini, Ph.D., an assistant professor in epidemiology, will oversee the algorithm’s development.

“We have been in an arms race with this virus,” said Marini. “But AI can help us to detect these genetic anomalies faster before they threaten public health.”

The team will train the algorithm on publicly available data from global repositories, where researchers upload genetic sequences of SARS-CoV-2. They will design it to detect anomalies in new variants that signal a potential threat to public health.

“We would know right away if a new variant can potentially take over the population rapidly,” said Salemi, a faculty member of the UF Emerging Pathogens Institute. “We would finally have an advantage over the virus and we could rapidly do prevention measures.”

In earlier work, the three researchers developed techniques to improve how datasets of SARS-CoV-2 genomic sequences are created to reduce possible biases. Prosperi and Marini, along with others, also have a study under review describing an algorithm that correctly identified 11 out of 11 tested variants of concern 10 weeks before they were officially labeled as such by the Centers for Disease Control and Prevention. The crux of the new work lies in training the AI algorithm to detect the variants that will matter to human health, apart from all the background variants that exist due to natural evolutionary processes but that aren’t a public health threat.

“Imagine each variant that is sequenced and stored as a dot in a cloud of dots,” said Marini. “If it’s within the known area of the cloud, we are cool. But if this dot falls away from the known area, that is our variant, that is our concern. AI is learning the rules to put the dots in the right spots.”
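
Marini’s “cloud of dots” picture can be loosely illustrated with a few lines of Python. The sketch below is not the team’s algorithm, which the article does not describe in detail. It assumes, purely for illustration, that each sequence becomes a dot by counting its two-letter fragments, and that a dot is “away from the known area” when it sits farther from the cloud’s center than a cutoff set from the known dots. The sequences, the function names and the cutoff rule are all made up for this example.

```python
import math
from collections import Counter
from itertools import product

K = 2  # fragment (k-mer) length; an arbitrary illustrative choice
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_vector(seq):
    """Turn a nucleotide string into a 'dot': a vector of k-mer frequencies."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[k] for k in KMERS) or 1  # avoid dividing by zero
    return [counts[k] / total for k in KMERS]


def fit_cloud(known_seqs, n_std=3.0):
    """Return the center of the known cloud and a distance cutoff.

    The cutoff (mean distance plus n_std standard deviations) is an
    assumption made for this sketch, not a rule taken from the article.
    """
    dots = [kmer_vector(s) for s in known_seqs]
    center = [sum(col) / len(dots) for col in zip(*dots)]
    dists = [math.dist(d, center) for d in dots]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return center, mean + n_std * std


def falls_outside(seq, center, cutoff):
    """True if the sequence's dot lies farther from the center than the cutoff."""
    return math.dist(kmer_vector(seq), center) > cutoff


# Toy strings invented for the example; these are not real viral sequences.
KNOWN = ["ACGTACGTACGT", "ACGTACGTACGA", "ACGTACGAACGT", "ACGTTCGTACGT"]
center, cutoff = fit_cloud(KNOWN)

print(falls_outside("ACGTACGTACGT", center, cutoff))  # a dot inside the cloud
print(falls_outside("GGGGGGGGGGGG", center, cutoff))  # a dot far from the cloud
```

A real system would work with full genomes and a learned model rather than a single distance cutoff, and, as the article notes, the harder problem is telling worrying outliers apart from harmless background variation.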

In theory, the work could help produce a tool that would raise a red flag when a potential new variant of concern is uploaded to public databases. Scientists could then test the variant in a lab to see whether it attacks cells quickly or can resist antibodies.

AI can also be used to predict short- and long-term disease outcomes and the emergence of potentially drug-resistant mutations, according to Prosperi. The team will also explore whether AI can predict how the virus will evolve.

The newly funded project builds on the past few years of work by Salemi and collaborators to understand the genetic variability of the novel coronavirus in Florida. With $250,000 in seed money from the UF Office of Research and UF Clinical and Translational Science Institute in fall 2019, Salemi’s lab began sequencing samples of the virus obtained from Alachua County residents.  

The team, known as the Florida Coronavirus Genomic Epidemiology Network, tracked waves of alpha, beta, delta, gamma and omicron variants moving through the state’s population. Samples are prepared for analysis in Salemi’s lab at the UF EPI and then brought next door to the UF Interdisciplinary Center for Biotechnology Research, where the sequencing is performed.

Soon after they began conducting genomic surveillance, Salemi’s team was awarded a grant from the AIDS Healthcare Foundation in Miami to sequence samples obtained from people coinfected with HIV in the Miami-Dade area. Work from both grants laid the foundation for the newly funded one.

“There has been this hypothesis in the scientific community that some of the most aggressive variants of concern, like the delta or the omicron variant, might have emerged in people with HIV,” said Salemi. “There have been reports of people infected for months, and one very recent case of a patient infected for more than 400 days.”

These extended infections are distinct from “long COVID,” the term for symptoms that linger long after the body clears the virus. When some people with HIV are also infected with SARS-CoV-2, the coronavirus replicates in their bodies for an extended period. This allows the virus to mutate, which helps it become ever better at evading the immune system. 

In the new work, the research team will sequence SARS-CoV-2 samples taken from the same HIV patients over time to yield a deeper understanding of how the virus changes. The research is also expected to improve care practices for people with HIV who also have COVID-19.

The team includes collaborators at the University of Miami, Drs. Maria Luisa Alcaide and Deborah Jones, and machine learning researchers and engineers at the University of Pavia, Italy.

By: DeLene Beeland