AI assists mathematicians in recognizing new COVID-19 variants

Significant SARS-CoV-2 lineages identified via scalable ML techniques.


Researchers at The Universities of Manchester and Oxford have developed an AI framework that can detect and track new and concerning COVID-19 variants. The approach could also help with other infections in the future.

The framework combines dimension reduction techniques with CLASSIX, a new explainable clustering algorithm developed by mathematicians at The University of Manchester. Together, these help identify groups of viral genomes that could pose a risk in the future.

Their study, published in the journal PNAS this week, could support traditional methods of tracking viral evolution, such as phylogenetic analysis (building viral family trees), which currently requires extensive manual work.

Roberto Cahuantzi, a researcher at The University of Manchester and the first and corresponding author of the paper, said, “Scientists are now intensifying efforts to pinpoint these worrying new variants, such as alpha, delta, and omicron, at the earliest stages of their emergence. If we can find a way to do this quickly and efficiently, it will enable us to be more proactive in our response, such as tailored vaccine development. It may even enable us to eliminate the variants before they become established.”

SARS-CoV-2, the virus that causes COVID-19, evolves rapidly because it mutates and spreads quickly. Identifying problematic new strains takes considerable work. The GISAID database currently holds around 16 million SARS-CoV-2 sequences, and tracking how they evolve demands substantial computing power and human labor.

The new method automates much of the work of analyzing genetic data. On a standard laptop, the researchers processed millions of sequences in just one to two days, which is feasible because the method needs far fewer computing resources than existing approaches.

According to The University of Manchester, the approach helps manage the vast amount of genetic data generated during the pandemic. It does not replace human experts but speeds up their work, freeing them for other essential tasks. The method breaks viral genetic sequences into short “words”, counts them, and uses machine learning to group together sequences with similar word patterns.
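To make the “words” idea concrete, here is a minimal Python sketch of one way to turn a sequence into word counts and compare two sequences by their word patterns. It is an illustration only, not the researchers' code: the function names, the word length of three letters, the cosine similarity measure, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_counts(sequence, k=3):
    """Count every overlapping k-letter 'word' in a nucleotide sequence.

    The default k=3 is an illustrative assumption.
    """
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count profiles (1.0 = same pattern)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up toy sequences, not real SARS-CoV-2 data.
reference = kmer_counts("ATGGCTAGCTAGGATCCA")
near_copy = kmer_counts("ATGGCTAGCTAGGATCCT")   # last letter differs
unrelated = kmer_counts("TTTTTTCCCCCCGGGGGG")

print("reference vs near copy:", round(cosine_similarity(reference, near_copy), 3))
print("reference vs unrelated:", round(cosine_similarity(reference, unrelated), 3))
```

In this toy case the near copy scores much closer to the reference than the unrelated sequence does, which is the kind of signal a clustering step can then use to group sequences.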

Stefan Güttel, professor of mathematics at The University of Manchester, said that the team's clustering algorithm, CLASSIX, is less computationally demanding than more conventional techniques and that it offers concise explanations of the clusters it finds.
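The article does not describe how CLASSIX works internally. As a rough picture of why a distance-based grouping can be both cheap and easy to explain, the Python sketch below visits points in sorted order and assigns each one to the first existing group centre within a fixed radius. This is a simplified illustration, not the CLASSIX algorithm or its API: the function name, the radius parameter, and the toy points are hypothetical.

```python
from math import dist


def greedy_radius_groups(points, radius):
    """Group points by assigning each to the first centre within `radius`.

    Points are visited in order of their first coordinate. A point that is
    not close to any existing centre starts a new group of its own.
    """
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    labels = [None] * len(points)
    centres = []  # indices of the points that started each group
    for i in order:
        for label, centre in enumerate(centres):
            if dist(points[i], points[centre]) <= radius:
                labels[i] = label
                break
        else:
            # No centre is near enough, so this point founds a new group.
            labels[i] = len(centres)
            centres.append(i)
    return labels


# Hypothetical 2-D points standing in for reduced word-count profiles.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
print(greedy_radius_groups(toy_points, radius=0.5))
```

Each assignment comes with a simple reason (the point lies within the radius of a named centre), which is the flavour of explanation that a clustering method can report alongside its results.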

Roberto Cahuantzi added that the research shows machine learning can be used to spot important new variants early on. Unlike traditional methods, the approach does not need to build family trees. While family trees remain the best way to understand where a virus comes from, machine learning can handle far more data with much less computing power.

Using AI, and the CLASSIX algorithm in particular, is a promising way to find new COVID-19 variants quickly. It helps scientists examine genetic data faster and more easily, which lets health workers monitor the pandemic and respond to changes sooner.

Journal reference:

  1. Roberto Cahuantzi, Katrina A. Lythgoe et al., Unsupervised identification of significant lineages of SARS-CoV-2 through scalable machine learning methods. PNAS. DOI: 10.1073/pnas.2317284121.