The new variants and the next phase of the pandemic

Feb 11, 2021

By Juan Cambeiro, Analyst at Metaculus

On 24 January 2020, just over a year ago, Metaculus user @traviswfisher wrote an astoundingly prescient comment on a question asking whether the world population would increase every year between 2016 and 2025: “The Wuhan Coronavirus is looking like a pandemic event that could be serious enough to threaten this outcome.” Since then, the dedicated and insightful community of forecasters on Metaculus has made thousands of predictions on hundreds of questions on COVID-19, on everything from SARS-CoV-2 vaccine research and development, to when the first round of stay-at-home orders would end, to the current epidemiological trajectory of COVID-19.

While the questions asked about COVID have covered a wide variety of topics, the two perhaps most relevant to where we now find ourselves epidemiologically and virologically in the COVID-19 pandemic touch on the following broad subjects: (1) whether we could quickly and comprehensively eliminate SARS-CoV-2 on a global scale and (2) whether multiple safe and effective vaccines based on different platforms would be developed and rolled out. The answer to the former seems to be no (a median of another ~150 million confirmed cases are expected to occur before 2022), while the answer to the latter has been a resounding yes (multiple vaccine candidates have demonstrated high efficacy against disease and ten vaccines have been authorized or approved globally).

We are now entering a new phase of the pandemic where we have the knowledge that, while elimination of SARS-CoV-2 on a global scale is highly desirable, it is exceedingly unlikely — and, at the same time, we also now know it is only a matter of time before we can vaccinate everyone who wants to be inoculated with highly efficacious vaccines. However, the emergence of novel variants threatens to prolong the misery wrought by the pandemic in both the near future and the long term. The intent of this post is to pull together much of the important information on the new variants, forecast this next variant-laden phase of the pandemic, and suggest actionable steps that can be taken in response.

The new variants and the risks they pose

To some degree, all viruses undergo antigenic evolution. This means that antigens (the parts of viruses that elicit human immune responses) change over multiple replication cycles due to mutations that occur during replication. RNA viruses like influenza are particularly likely to accumulate genetic changes rapidly, though SARS-CoV-2 was until recently thought to have a slower mutation rate than influenza due to its proofreading ability. Unfortunately, recent data indicate that SARS-CoV-2 is currently evolving at a pace even quicker than the fastest evolving seasonal influenza virus. Indeed, some new variants of concern have emerged with so many simultaneous mutations that these may constitute rare evolutionary jumps to “new fitness peaks”: the theoretical maximal capacity of a virus (a serotype, clade, or variant) to become dominant.

The new variants (that we know of) and our best forecasts about their impact

There are currently three variants (that we know of) that we have reason to be worried about. These variants of concern (VOCs) are B.1.1.7, B.1.351, and P.1. The New York Times has an excellent explanatory piece with lots of nice visuals of these VOCs and the mutations they possess. Another useful resource is the U.S. CDC’s “New Variants” page, which is frequently updated with the latest information. All three VOCs can be tracked via their shared S:N501 (N501Y) mutation, which likely increases the ability of SARS-CoV-2 to bind to ACE2 receptors on human cells.

The New York Times

Taken together, these VOCs have informed thousands of Metaculus forecasts on a wide variety of questions that suggest we have a long journey still ahead of us. Some key informative forecasts are:

94% chance that a variant which is >30% more transmissible will infect >10 million people before mid-2021

69% chance that a variant which is >50% more transmissible will infect >10 million people before mid-2021

54% probability that a new variant will significantly (>=50% difference in the attack rate in seropositive subjects) evade the immunity of previously infected people before 2022

96% chance Moderna, Pfizer/BioNTech, or Oxford/AstraZeneca will need to start producing an updated vaccine before 2023

70% chance the U.S. CDC will recommend before 2023 that previously vaccinated people be revaccinated to protect against new variant(s)

40% probability that the U.S. CDC will begin tracking at least one additional VOC as of 7 March 2021

median of 45% for the percentage of sequenced cases in the U.S. that are B.1.1.7, B.1.351, or P.1 in the week of 1 March 2021

median of 79% for the percentage of sequenced cases in the U.S. that are B.1.1.7, B.1.351, or P.1 in the week of 29 March 2021

Got all of that? Great, now let’s take a closer look at each of the VOCs (that we know about).


B.1.1.7

B.1.1.7 is the VOC we currently have the most information on, and hence it is also the one we can most confidently worry about. Indeed, it is the VOC most discussed in the comments sections of Metaculus questions on increased transmissibility of variants. It was first detected in the United Kingdom in December 2020 and likely emerged in the late summer or early fall of 2020. It is distinguished by 17 mutations, including 8 in the spike protein (the target of all antibodies elicited by current vaccines). Such a high number of mutations relative to preexisting variants is unusual, and for this very reason the spread of B.1.1.7 in the UK has been monitored very closely.

The New York Times

First, the good news: as of yet there is little indication that B.1.1.7 can substantially evade neutralization (blocking) of its spike protein by antibodies elicited by either prior infection or current vaccines. The studies that have confirmed this use neutralization assays — experiments that determine the extent to which antibodies induced by prior infection and/or current vaccines can neutralize the antigen in question (the spike protein of SARS-CoV-2). The interim analysis of Novavax’s phase III trial in the UK indicates that its vaccine is highly efficacious against B.1.1.7, with only a small drop-off in efficacy. However, there has recently been a worrying development it’s worth keeping an eye on: the emergence of a handful of B.1.1.7 samples in the UK that have the E484K mutation. This mutation is worrying because it is linked to immune escape — though it is too soon to say what will come of this given the small number of such samples that have been found.

Now, the bad news: We can state with a very high degree of confidence that B.1.1.7 is significantly more transmissible than preexisting variants. This is likely because of increased infectiousness of individuals via greater viral shedding. Two recent pre-prints have affirmed previous estimates from late 2020 that B.1.1.7 is somewhere on the order of 30–60% more transmissible. It’s not just epidemiological data from the UK that shows this — these high transmissibility estimates have been confirmed in the U.S., Switzerland, and Denmark. The experience of Denmark is particularly instructive since it arguably has the best surveillance and sequencing program in the world and because it does a lot of random sequencing — so we can be sure their estimates of B.1.1.7’s increased transmissibility are not based on sequencing being biased toward S gene dropouts (though this is a concern elsewhere — more on this later).

Spread of B.1.1.7 in the UK, as detected via the S gene dropout proxy

OK, so what does ~50% higher transmissibility translate to? Perhaps 50% more cases — and thus, 50% more deaths — having occurred by the end of the pandemic if this VOC were to become predominant everywhere in the near future? Unfortunately, the reality is worse than this: because infectious disease growth is exponential, increased transmissibility of 50% translates to a much greater degree of overall transmission than a 50% increase (unless, of course, the spread could be outrun by a rapidly-increasing percentage of the population having been vaccinated combined with nonpharmaceutical interventions to slow the spread of the VOC). The experience of the UK is instructive:

Daily COVID-19 cases in the UK, from Public Health England

B.1.1.7 took off in December even with a series of “circuit-breaker” stay-at-home orders in place, and case counts in the UK reached record highs in early January. Cases are now only falling thanks to an ongoing national stay-at-home order that has been in place since 5 January (as well as what might now be the early effects of an excellent vaccination campaign). This strict national stay-at-home order has apparently been the only way to substantially bring down daily cases in the time since B.1.1.7 became predominant in December. But at the very least, the experience of the UK shows that even a B.1.1.7-driven pandemic can be controlled with strict nonpharmaceutical interventions to buy time for the pace of vaccinations to ramp up.
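The earlier point about exponential growth can be made concrete with a toy calculation. The sketch below is only an illustration: it assumes a fixed reproduction number per transmission generation, no growing immunity, and no change in behavior, and every number in it (the baseline R of 1.1, the 1,000 starting cases, the ten generations) is a hypothetical value chosen for the example rather than an estimate for any country.

```python
def cumulative_cases(r_eff, generations, initial_cases=1000.0):
    """Total infections over a number of transmission generations,
    assuming every case infects r_eff others (no immunity, no interventions)."""
    total = 0.0
    current = initial_cases
    for _ in range(generations):
        total += current
        current *= r_eff
    return total

# Hypothetical values, chosen only for illustration.
BASELINE_R = 1.1
VARIANT_R = BASELINE_R * 1.5  # a variant that is 50% more transmissible
GENERATIONS = 10

baseline_total = cumulative_cases(BASELINE_R, GENERATIONS)
variant_total = cumulative_cases(VARIANT_R, GENERATIONS)
ratio = variant_total / baseline_total

print(f"baseline: {baseline_total:,.0f} cases")
print(f"variant:  {variant_total:,.0f} cases")
print(f"ratio:    {ratio:.1f}x")
```

Under these assumptions the variant scenario ends up with more than ten times as many cumulative cases, not 50% more. The gap widens with every additional generation, which is why buying time with interventions matters so much.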

B.1.1.7 is now taking off elsewhere in Europe and the U.S., and it is only a matter of time before the same occurs in the rest of the world. A useful predictor of where it is likely to be an issue, or probably already is one, is October flight data between the UK and the rest of the world:

Air traffic from the UK by destination, October 2020. From cov-lineages.

A more obvious way to know when B.1.1.7 is an issue in a country is by tracking its frequency as a proportion of sequenced cases:

Frequency of B.1.1.7 in sequences produced since first new variant reported per country. From cov-lineages.

But this approach is not likely to provide timely information, given the current issues with genomic sequencing (more on this later).


B.1.351

B.1.351 is a variant first identified in South Africa in December 2020; the first detected sample is from October 2020. Like B.1.1.7, B.1.351 is characterized by multiple mutations: 21 in all, including 9 in the spike. Like B.1.1.7, it possesses the key N501Y mutation as well as short sequence deletions. It also has an E484K mutation that has been increasingly linked to immune escape, though it is as yet unclear if this is the primary reason B.1.351 is able to evade immunity to some degree. Concerningly, this variant seems to have been the driver of a recent surge in new infections and reinfections in South Africa.

The New York Times

Unfortunately, experiments using neutralization assays have shown that B.1.351 is substantially resistant to neutralization by antibodies elicited either by previous infections or current vaccines. At the moment, this is thought to be the main reason B.1.351 is able to spread so readily in populations that already have a high level of population immunity, as is the case in South Africa. But for now, higher transmissibility in and of itself has not been established for B.1.351. Rather, B.1.351 is in effect more transmissible in populations with a high percentage of previously infected people because it can circumvent the positive effects of partial herd immunity to some degree.

In addition to experiments that use neutralization assays, we have some real-world data from the Johnson & Johnson, Novavax, and Oxford/AstraZeneca trials. It would be advisable to put greater weight on the data from Johnson & Johnson, given that it is based on 468 symptomatic cases, while the Novavax data is based on only 62 and the Oxford/AstraZeneca data on 42.

The data from Johnson & Johnson’s trial indicates that its single-dose vaccine has a median efficacy of 57% in South Africa, where B.1.351 is dominant, as compared to a median efficacy of 72% in the U.S., where only a few cases of B.1.351 have been detected. This drop-off is worrying, but not catastrophic; moreover, hospitalizations and deaths were prevented to a similar extent in both locations. The interim data from Novavax is more worrying though: its vaccine demonstrated a median of 89% efficacy in the UK but just 49% in South Africa, though again note that this is based on a limited number of cases. Recent data from the Oxford/AstraZeneca trial is more worrying still, though it relies on a very small number of cases and so is statistically underpowered. Here’s a handy table with this information (caveat: not all cases outside of South Africa are non-B.1.351 and not all cases in South Africa are B.1.351):

Juan Cambeiro, Metaculus
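To see why the number of cases behind an efficacy estimate matters so much, here is a rough sketch of how the uncertainty shrinks as cases accumulate. It is not how any of the trials computed their intervals. It assumes equal-sized vaccine and placebo arms with equal follow-up, uses a Wilson score interval on the share of cases that fall in the vaccine arm, and applies the same hypothetical 60% point efficacy to all three case counts so that only the sample size differs.

```python
import math

def wilson_interval(p, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion p observed in n trials."""
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

def share_to_efficacy(vaccine_share):
    """Efficacy implied by the share of cases in the vaccine arm (equal-sized arms assumed)."""
    return 1 - vaccine_share / (1 - vaccine_share)

def efficacy_interval(point_efficacy, total_cases):
    """Rough efficacy interval for a given point estimate and total case count."""
    vaccine_share = (1 - point_efficacy) / (2 - point_efficacy)
    low_share, high_share = wilson_interval(vaccine_share, total_cases)
    # A larger share of cases among the vaccinated means lower efficacy, so the bounds swap.
    return share_to_efficacy(high_share), share_to_efficacy(low_share)

ASSUMED_EFFICACY = 0.60  # hypothetical, identical for all three rows
intervals = {n: efficacy_interval(ASSUMED_EFFICACY, n) for n in (468, 62, 42)}

for n, (low, high) in intervals.items():
    print(f"{n:>3} cases: roughly {low:.0%} to {high:.0%}")
```

With a few dozen cases the plausible range spans a large part of the scale, while with several hundred it tightens considerably. That is the reason to weight the larger dataset more heavily.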

As with B.1.1.7, perhaps the best predictor of where B.1.351 is likely to already be circulating to a significant extent is flight data:

Air traffic from South Africa by destination, October 2020. From cov-lineages.

And, of course, a more obvious way to know when B.1.351 is an issue in a country is by tracking its frequency (one must keep in mind all the limitations of this approach — again, more on this later):

Frequency of B.1.351 in sequences produced since first new variant reported per country. From cov-lineages.


P.1

P.1 is the VOC that we have the least information on. It was first identified in Manaus, Brazil in December 2020. It has 17 distinct mutations, including 10 in the spike protein. Like both B.1.1.7 and B.1.351, it has the N501Y mutation. Moreover, like B.1.351 it has the E484K mutation that has been linked to immune escape. However, little else is currently known about this variant other than the fact that it has been able to circulate widely in Brazilian populations in which a high percentage of people had likely already been infected. In this way, the concern about P.1 is similar to the concern about B.1.351: their innate ability to transmit more readily has not been established, but it does appear as if they can escape preexisting immunity and immunity elicited by current vaccines.

The New York Times

Here is flight data from Manaus, Brazil for October 2020:

Air traffic from Manaus, Brazil by destination, October 2020. From cov-lineages.

And the frequency of P.1 sequences by country, with all the limitations this entails:

Frequency of P.1 in sequences produced since first new variant reported per country. From cov-lineages.

Other variants that have yet to arise and/or be detected

The three VOCs we’ve discussed have demonstrated a remarkable degree of convergent evolution, which means that multiple mutations keep developing independently and appear to confer either transmissibility or immune-evasion advantages. In other words, there is selective pressure on SARS-CoV-2 lineages that results in these mutations becoming more commonplace.

Key mutations shared between B.1.1.7, B.1.351, and P.1. From Kristian Andersen

Thus it is not a matter of if, but rather of when other VOCs with greater transmissibility or immune escape will emerge. Indeed, it is probable that such variants are already here and are simply circulating undetected. To address this critical issue, we need much better genomic sequencing and data sharing.

Tracking the new variants, forecasting their spread, and predicting the emergence of new ones

Tracking new variants

The best way to track new variants is to do extensive genomic sequencing (determining the entire genome of SARS-CoV-2 samples) and to share this data in an accessible and timely manner.

Unfortunately, most of the world only sequences a small percentage of their cases:

The Washington Post

Moreover, much of the world does not do a great job at sharing their sequencing data. Below is a table of the top 22 countries when it comes to sequence data sharing, as per data from biostatistician Art Poon’s index:

Juan Cambeiro, Metaculus

Clearly, both the amount of sequencing and the extent of data sharing have thus far fallen short. The vast majority of countries with ongoing outbreaks sequence less than 10% of their confirmed cases, while the mean delay between sample collection and upload to GISAID, a global repository of sequence data, is between one and three months. Combined, these shortcomings make tracking the spread of the novel VOCs very difficult: the limited amount of sequencing leaves us blind to the spread of novel variants and the emergence of new ones, while the long delay in upload to GISAID makes real-time tracking and forecasting of the variants hard.
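A back-of-the-envelope calculation shows how limited sequencing leaves us blind. The sketch below assumes that sequenced samples are drawn at random from all cases and that each sample is independently a variant case with probability equal to the variant's prevalence. Both are simplifications, and the prevalence values are hypothetical.

```python
import math

def detection_probability(prevalence, n_sequenced):
    """Chance of seeing at least one variant case among n randomly sequenced samples."""
    return 1 - (1 - prevalence) ** n_sequenced

def samples_needed(prevalence, confidence=0.95):
    """Smallest number of random samples that finds at least one variant case
    with the requested probability."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))

# Hypothetical prevalences of a new variant among current cases.
for prevalence in (0.10, 0.01, 0.001):
    n = samples_needed(prevalence)
    print(f"prevalence {prevalence:.1%}: about {n:,} random sequences for a 95% chance of detection")
```

The rarer the variant, the more sequences are needed just to notice it once, let alone to estimate how quickly it is growing. Any delay between collection and upload comes on top of that.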

The experiences of two countries, Denmark and the U.S., are illuminating examples of the importance of sequencing and data sharing.

Denmark: Setting aside locations that have largely controlled the pandemic (Australia, New Zealand, Taiwan), Denmark does the most sequencing relative to its share of cases. It also shares this data in a timely and publicly accessible manner. As such, it is able to track the frequency of different variants over time, which informs the country’s policy response and enables it to understand how effective its nonpharmaceutical interventions are so that it can adjust (or stay on course) accordingly. Moreover, the fact that its sequencing program is random means its data is not biased toward certain sample types and is thus very reliable.

Frequency of B.1.1.7 as a proportion of sequenced cases as of 3 February, State Serum Institute

U.S.: While Denmark ranks first both in how much sequencing it does and how quickly and reliably it shares sequencing data, the U.S. ranks 43rd for the former and 21st for the latter. Moreover, the U.S. does not yet have a large random sequencing program (though a new program by the U.S. CDC, Helix and Illumina is now starting to do this). This means that U.S. sampling is almost certainly biased toward S gene dropout samples that are detectable via routine PCR testing, since S gene dropout is sometimes an indication that a given sample may be B.1.1.7. This lack of random sequencing and timely data sharing means the U.S. cannot yet reliably track the frequency of new variants over time. Instead, for the time being the U.S. has to rely mostly on raw case counts, which are likely both a huge underestimate of the actual number of new variant cases and a poor representation of how widespread the VOCs are.

Number of sequenced cases of B.1.1.7 as of 9 February, U.S. CDC
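The bias introduced by non-random sequencing can be illustrated with a small calculation. This is a simplified sketch, not a description of how any lab actually selects samples: it assumes that samples from the variant are some fixed factor more likely to be picked for sequencing (for instance because they show S gene dropout), and both the true share and the oversampling factors are hypothetical.

```python
def observed_share(true_share, oversampling_factor):
    """Share of the variant among sequenced samples when variant samples are
    oversampling_factor times more likely to be selected for sequencing."""
    weighted_variant = true_share * oversampling_factor
    return weighted_variant / (weighted_variant + (1 - true_share))

TRUE_SHARE = 0.05  # hypothetical true share of B.1.1.7 among all cases

for factor in (1, 2, 5, 10):
    share = observed_share(TRUE_SHARE, factor)
    print(f"oversampling x{factor:>2}: variant looks like {share:.1%} of sequenced cases")
```

With a factor of 1 (random sequencing, as in Denmark) the sequenced share matches the true share. With any preference for suspicious samples, the share among sequenced cases overstates the true frequency, so it cannot be read as a frequency estimate without correction.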

Forecasting the spread of variants and the emergence of new ones

Fortunately, the identification of new VOCs has prompted many countries to begin ramping up their sequencing and data sharing programs. Ideally, countries should follow the lead of Denmark. As sequencing increases and timely data sharing improves, open-access data repositories like GISAID become even more valuable, as do websites like NextStrain and CoVariants that pull data from GISAID to track variants among newly sequenced cases over time.

As sequencing ramps up and data sharing improves, better real-time forecasts can be made that can update frequently as new sequencing data comes in. This, in turn, will enable timely flagging of the emergence of new variants and/or of their rapid spread — which can inform crucial public health decision-making.

At the moment, most forecasts that predict future frequencies of novel variants rely on data that is several weeks old. While still very useful, this multiple-week lag is an obviously severe limitation of such forecasts. Nonetheless, one thing is clear from these forecasts: B.1.1.7, and hence S:N501Y, is likely to become predominant in much of continental Europe and the U.S. either later this month or in March.

Projected increase in frequency of B.1.1.7 in Denmark
Projected increase in frequency of B.1.1.7 in Switzerland
Projected increase in frequency of B.1.1.7 in the U.S.
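Projections like these are often built on some form of logistic growth in the variant's share of cases. The sketch below is a minimal version of that idea, not the method used by any of the groups whose projections are shown. It assumes the log-odds of the variant's share rise linearly over time, and the starting share and daily growth advantage are hypothetical values, not fitted estimates.

```python
import math

def variant_frequency(initial_frequency, daily_advantage, days):
    """Variant share after `days`, assuming its log-odds grow linearly in time."""
    start_log_odds = math.log(initial_frequency / (1 - initial_frequency))
    log_odds = start_log_odds + daily_advantage * days
    return 1 / (1 + math.exp(-log_odds))

def days_until_share(initial_frequency, daily_advantage, target_share=0.5):
    """Days until the variant reaches target_share under the same assumption."""
    start = math.log(initial_frequency / (1 - initial_frequency))
    goal = math.log(target_share / (1 - target_share))
    return (goal - start) / daily_advantage

# Hypothetical inputs: the variant is 2% of cases today and its log-odds rise by 0.07 per day.
INITIAL_FREQUENCY = 0.02
DAILY_ADVANTAGE = 0.07

takeover_day = days_until_share(INITIAL_FREQUENCY, DAILY_ADVANTAGE)
print(f"variant reaches 50% of cases after about {takeover_day:.0f} days")
for day in (0, 14, 28, 42, 56):
    share = variant_frequency(INITIAL_FREQUENCY, DAILY_ADVANTAGE, day)
    print(f"day {day:>2}: {share:.1%}")
```

Because the inputs come from sequencing data that is already weeks old, the starting share fed into a model like this is stale by the time the forecast is made. This is the lag problem described above.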

Important steps to take for this new phase of the pandemic

So, you’ve now read all about the bad news regarding the VOCs, their epidemiological trajectories, and the probable emergence of more variants. What are some key actionable steps that can be taken in response? Here are some:

Testing, testing->sequencing pipeline, sequencing, data sharing

Testing: in order to sequence as many SARS-CoV-2 samples as possible, one must test as much as possible so as to capture as many actual infections as possible. Rapid antigen testing, followed by confirmatory PCR testing if the rapid test is positive, seems like the best and most efficient way to rapidly scale up testing infrastructure.

Testing->sequencing pipeline: a surprisingly common impediment to doing sufficient sequencing seems to be establishing high-volume pipelines from labs that process diagnostic test samples to labs that can sequence those samples. It seems like central, coordinated efforts and a small amount of resources would go a long way toward fixing this.

Sequencing: If possible, sequencing efforts should be as large as possible and randomly collect samples so as to mitigate any potential biases. Large random sequencing programs will necessitate considerable financial support and should probably be centralized national efforts.

Data sharing: once a sample is sequenced, timely upload to GISAID would help enable real-time tracking of novel variants, and thus, also accurate forecasting of their trajectories. Data sharing efforts should be supported, as should teams behind websites like NextStrain and CoVariants.


Forecasting

By epidemiologists: epidemiological groups have the requisite domain knowledge and experience to predict epidemiological trajectories of variants. Their efforts should be supported.

By virologists: virologists can help predict which mutations will enable SARS-CoV-2 to evade therapeutics and vaccines, and they can even help forecast the emergence of new mutations before they occur. Their efforts should be supported.

By forecasting communities: forecasting communities like Metaculus have established track records and can continue to help by providing up-to-date estimations of key metrics.

Collaborative prediction-making by subject-matter experts and forecasters: collaborative forecasting efforts like the Consensus Forecasting to Improve Public Health project by Metaculus, Professor Thomas McAndrew and the Computational Uncertainty Lab at Lehigh University involve both experts in infectious disease as well as the Metaculus forecasting community. This combined and collaborative approach holds much promise since it enables both subject-matter experts and trained forecasters to learn from one another so as to make the best possible forecasts.

Use of broad nonpharmaceutical interventions

Effective and low-cost measures: Universal masking (use of high-quality tightly-fitted masks can reduce viral exposure by 95%) would on its own go a long way toward lowering rates of transmission. However, this general intervention would not prevent more transmissible variants from increasing in frequency; rather, it would lower the overall number of cases.

Stay-at-home orders: The experience of the UK shows that, in a worst-case scenario, the use of strict nonpharmaceutical interventions like nationwide stay-at-home orders can bring outbreaks under control, even ones driven by novel variants. However, again: broad interventions would not prevent more transmissible variants from increasing in frequency; they would instead lower the overall number of cases.
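The claim that broad interventions lower case numbers without stopping a variant's rise in frequency follows from simple arithmetic, sketched below. The model is deliberately crude: two strains grow independently with fixed reproduction numbers, and a broad intervention multiplies both by the same factor. All numbers are hypothetical.

```python
def simulate(baseline_r, variant_r, generations, baseline_cases=1000.0, variant_cases=10.0):
    """Grow two strains independently; return total cases in the final generation
    and the variant's share of those cases."""
    for _ in range(generations):
        baseline_cases *= baseline_r
        variant_cases *= variant_r
    total = baseline_cases + variant_cases
    return total, variant_cases / total

# Hypothetical reproduction numbers; the variant is 50% more transmissible.
BASELINE_R = 1.2
VARIANT_R = 1.8
INTERVENTION_FACTOR = 0.6  # a broad measure that cuts all transmission by 40%
GENERATIONS = 8

no_npi_total, no_npi_share = simulate(BASELINE_R, VARIANT_R, GENERATIONS)
npi_total, npi_share = simulate(
    BASELINE_R * INTERVENTION_FACTOR, VARIANT_R * INTERVENTION_FACTOR, GENERATIONS
)

print(f"without intervention: {no_npi_total:,.0f} cases, variant share {no_npi_share:.1%}")
print(f"with intervention:    {npi_total:,.0f} cases, variant share {npi_share:.1%}")
```

The intervention shrinks the case count dramatically, but the variant's share ends up the same in both runs, because scaling both reproduction numbers by the same factor leaves their ratio untouched. Changing the share requires measures that hit the variant harder than everything else.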

Use of targeted nonpharmaceutical interventions

Bidirectional contact tracing: focusing contact tracing efforts on cases of new variants would specifically help prevent the increase in frequency of these variants. Such contact tracing would ideally be both backwards and forwards and would involve the use of rapid testing as well as support for individuals to isolate/quarantine.

Surging resources to communities with the greatest number/frequency of VOCs: for instance, surging vaccines to communities where the variants are taking off most rapidly should be considered.


Vaccines

Massively scale up vaccine manufacturing: There are ongoing supply issues with vaccines, and recent projections indicate that many countries in the Global South may not be able to achieve widespread vaccination coverage until 2023. It is not too late for a large scale-up of manufacturing to have a substantial impact on accelerating vaccine availability.

Accelerate vaccine distribution and administration: Israel is a case in point for how important and valuable it is to get shots in arms as fast as possible. It has led the way in vaccinating its population at a rapid pace, with the already-visible effect of cases and hospitalizations beginning to fall steeply in vaccinated groups. This occurred even in the context of widespread transmission of B.1.1.7.

Support COVAX: COVAX is a global facility and pooled procurement mechanism intended to equitably share vaccines across low- and middle-income countries. Given that variants will continue to frequently emerge unless we control the pandemic everywhere, meeting the ~$2 billion shortfall COVAX is likely to experience this year seems like an excellent investment.

Consider only administering one mRNA vaccine dose to previously infected people: Three recent pre-prints suggest that a single dose of an mRNA vaccine is very likely sufficient to induce a high level of immunity in previously infected people. Adopting this approach would free up millions of vaccine doses to vaccinate more people sooner.

Continuously update vaccines and develop boosters against new variants: Metaculus forecasts indicate that this will very likely need to occur in the near future. Fortunately, Moderna and other companies are already planning to do this. The regulatory process for authorizing/approving such boosters should be streamlined.

Develop a “variant-proof” vaccine: The development of a vaccine that can induce broadly neutralizing antibodies (bnAbs) for the Sarbecovirus lineage of coronaviruses would not only protect vaccinated individuals against variants of SARS-CoV-2, but could also help prevent future coronavirus pandemics.

There is still time to shape the global trajectory of this pandemic. Let’s get things right for this new variant-laden phase of it.

Follow Juan Cambeiro on Twitter @juan_cambeiro and Metaculus @metaculus