Predicting the pandemic: COVID-19 modeling rights and wrongs


Updated April 28, 2021

In the spring of 2020, shortly after the COVID-19 pandemic began surging, Xcenda's Ken O'Day, PhD, MPA, Senior Director of Global Health Economics, developed an infectious disease model and white paper* to examine the potential impact of both non-pharmaceutical and pharmaceutical interventions on mitigating the effects of the pandemic. One year later, Mike Eaddy, PharmD, PhD, Vice President of Scientific Consulting at Xcenda, asked Dr. O'Day for an update in the interview below. 
*The full text of the white paper can be found here.

Mike Eaddy: As a health economist, how did you come to develop a COVID-19 model and white paper?
Ken O'Day: The first COVID cases and deaths in the United States (US) were in my home state of Washington, and after seeing what happened in Wuhan, China, I was very concerned and wanted to better understand what could happen in the US. Being a health economist who does a lot of modeling work, I was also interested in learning more about dynamic transmission models. These models use differential equations to capture disease transmission by moving patients between various compartments. They are similar to the Markov models we use to model chronic disease progression, with the primary difference being the increased volatility due to the possibility that interaction between infected and susceptible people can lead to exponential growth. After immersing myself in the literature and methods, I built an Excel-based susceptible-exposed-infected-recovered (SEIR) model. At that time in March 2020, nothing was really available in terms of COVID models until the Imperial College model was released on March 16, followed by the Institute for Health Metrics and Evaluation (IHME) model at the end of March.
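The compartmental mechanics Dr. O'Day describes can be sketched in a few lines of code. The sketch below uses simple Euler integration of the SEIR equations; all parameter values (transmission rate `beta`, incubation rate `sigma`, recovery rate `gamma`, population size) are illustrative assumptions for this example, not the inputs used in Xcenda's model:

```python
# Minimal SEIR sketch via Euler integration.
# Parameters are illustrative assumptions, not Xcenda's actual model inputs.
def seir(beta=0.5, sigma=1 / 5.2, gamma=1 / 10,
         n=1_000_000, e0=10, days=200, dt=0.1):
    s, e, i, r = n - e0, float(e0), 0.0, 0.0
    history = []
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / n   # transmission: S -> E
        new_infectious = sigma * e       # incubation ends: E -> I
        new_recovered = gamma * i        # recovery/removal: I -> R
        s -= new_exposed * dt
        e += (new_exposed - new_infectious) * dt
        i += (new_infectious - new_recovered) * dt
        r += new_recovered * dt
        history.append((s, e, i, r))
    return history

traj = seir()
peak_infectious = max(i for _, _, i, _ in traj)
```

The exponential-growth behavior he mentions falls out of the `beta * s * i / n` term: while susceptibles are plentiful, each new infectious person generates more than one successor, unlike the fixed transition probabilities of a chronic-disease Markov model.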

Eaddy: How does Xcenda’s COVID-19 model differ from other COVID-19 models?
O'Day: Most COVID models have been developed by academic, industry, and research groups. Many have teams of epidemiologists and infectious disease modeling experts employing a variety of sophisticated methods including SEIR modeling, regression modeling, Bayesian methods, and machine learning. They also have the resources to provide regular forecasts over the duration of the pandemic, to conduct probabilistic analyses, and to develop forecasts across a wide variety of geographic regions (eg, states, counties). Xcenda’s model was intended as a one-off endeavor to make estimates for the US for the rest of 2020 as opposed to the customary projections of only a few weeks at a time. However, the goal was primarily educational—to help people see how the pandemic might progress, to better understand the potential impact of various pharmaceutical and non-pharmaceutical interventions available to policymakers, and to present a COVID model in a transparent and accessible way. More specifically, our team looked at a variety of possible scenarios to try to show what might happen given various assumptions about levels of disease transmission, changes in health system capacity, timing and degree of social distancing, easing of social distancing measures, and potential drug interventions. So, more of a 30,000-foot view to help people see the potential overall impact than a focus on short-term predictions.

Eaddy: What was the biggest challenge developing your COVID-19 model?
O'Day: Getting good quality data was definitely the biggest challenge. Early in the pandemic, there was very little COVID data out there, and much of that data was of questionable quality. In March/April 2020, we had the number of cases and deaths from the Johns Hopkins Coronavirus Resource Center, but even these had significant limitations at the time as testing was very limited in the US, meaning cases were being underestimated and not all COVID deaths were being captured in the counts. In fact, the day after we published our white paper, New York State adjusted their death count upward by 3,000 deaths, rendering our projections underestimates. Good estimates for other critical parameters were even harder to come by—including things like the reproduction rate, infection fatality rate, and hospitalization rates. Now, while there are still challenges with quality, the data have gotten a lot better, and the bigger challenge is coping with the overwhelming amount of COVID data available.

Eaddy: Now that we are a year into the pandemic, looking back, how did your COVID-19 model predictions work out?
O'Day: Our model was on point in predicting the early course of the pandemic in terms of the number of US deaths—a grim source of validation, unfortunately. We finalized our analysis on April 15, 2020, and released our white paper on April 21. Our base case estimates tracked much closer to the actual deaths than those from the frequently cited IHME model (Figure 1). It wasn’t until after the second surge, following the July 4th holiday and the easing of social distancing measures, that our estimates substantially diverged from the actual figures. Given the volatility of infectious disease transmission, it is customary for epidemiologic models to only offer predictions for a few weeks out at a time. The IHME forecast was for a few months, while we made our forecasts for the entire year of 2020. Our base case analysis, which assumed social distancing measures remained in place and estimated about 140K deaths by the end of 2020, was significantly off, as we nearly reached 350K deaths due to the summer and fall/winter surges.

Figure 1. Cumulative COVID-19 deaths and daily cases (US, 2020)


Hospitalizations are more challenging to predict. Our model estimated a peak of 102K hospitalizations and 27K intensive care unit (ICU) patients during the April 2020 surge. The COVID Tracking Project reported about 60K were hospitalized during the April peak, so our predictions were overestimates. For purposes of comparison, in December 2020 there were a total of 132K hospitalizations and 30K ICU patients at the peak.

Eaddy: How did COVID-19 models in general perform over the past year? 
O'Day: It has been mixed. The IHME model consistently underestimated the severity of the pandemic early on, which may be why the previous administration relied on it and why its estimates were frequently reported in the news media. The problem with the original IHME model was that instead of modeling actual dynamic disease transmission, it attempted to fit the observed Wuhan infection and mortality curves to a US environment, an approach fraught with significant limitations. They have since changed over to a SEIR model with improved results. A recent study by the COVID-19 Forecast Hub looked at the performance of 23 models and had some interesting findings. Overall, the top performing model was the COVID-19 Forecast Hub ensemble model—a model that averages the estimates of all the individual models. This reduces some of the bias inherent in individual models and improves overall accuracy, similar to the way a meta-analysis of studies can provide a better estimate of an overall treatment effect. A simple baseline model which predicted incidence of deaths based on the number of reported deaths in the most recent week was the median performer—not unlike an index fund of COVID models. Not surprisingly, shorter-term projections were more accurate than longer-term projections, with 1-week predictions having about half the error of 4-week predictions. Only 2 of the models made predictions that were roughly consistent with their 95% prediction intervals; the probabilistic predictions of the rest underestimated uncertainty, particularly during the November/December surge. Models performed worse during periods of change (eg, increasing deaths and peaks), and simpler models with fewer data inputs were some of the most accurate.
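The ensemble idea Dr. O'Day describes is mechanically simple: combine each model's forecast for the same target week. The numbers below are hypothetical, not Forecast Hub data, and serve only to show the averaging step:

```python
# Toy ensemble forecast: average individual model predictions per week.
# All forecast values are hypothetical, for illustration only.
forecasts = {
    "model_a": [1200, 1350, 1500, 1700],  # weekly death forecasts
    "model_b": [1000, 1100, 1250, 1400],
    "model_c": [1400, 1600, 1850, 2100],
}

weeks = len(next(iter(forecasts.values())))
ensemble = [
    sum(model[w] for model in forecasts.values()) / len(forecasts)
    for w in range(weeks)
]
```

Because individual models err in different directions, their errors partially cancel in the average, which is the same intuition behind pooling studies in a meta-analysis.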

Eaddy: What lessons did you learn from this modeling experience? 
O'Day: I think it’s helped me to become a better modeler. It was a humbling experience that teaches you that things don’t always work out as you (or others) expect. In health economics, we don’t often get an opportunity to validate our model projections against reality. Modeling COVID allows you to test your model forecasts against cold, hard reality, as the COVID-19 Forecast Hub study shows. I think this study also highlights the potential benefits of simpler models—health economic models, in my opinion, are too often needlessly complex—and that we should also be more cautious about interpreting results from probabilistic sensitivity analyses, which likely underestimate uncertainty. I learned a lot about epidemiology and how accounting for human behavior can be especially difficult in infectious disease models. Fortunately, in developing health economic models, we are often modeling diseases and health care systems where there are more available data and predictability than in a pandemic where you have a new unknown disease and the unpredictability of human behavior. Nate Silver, an American statistician who is well-known for his election predictions, said that developing a COVID forecast model is much more challenging than forecasting elections, but as we know those can be quite difficult too.

Eaddy: What are some of the important issues facing us now that COVID-19 models can help us with? 
O'Day: Far and away, the most important thing in the near term that models can help with is to better understand the potential impact of the new variants. People don’t really appreciate how even small changes in infectiousness can have a huge impact on the severity of the pandemic due to the potential for exponential growth. As we better understand the impact of various social distancing measures, models may also be useful to identify strategies to minimize COVID impact as people begin to resume more of their pre-pandemic activities. However, the world won’t be the same place for quite some time as COVID has become endemic. Models will therefore be useful to forecast the potential for outbreaks based on vaccination effectiveness and coverage, spread of new mutations, and waning immunity over time—similar to the way we now look at influenza. Models can also be used to forecast and inform whether we will achieve herd immunity. Everyone has basically assumed that we will get there, but unfortunately that’s far from certain.
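The point about small changes in infectiousness is easy to see with a generation-by-generation tally. The reproduction values and seed below are hypothetical, chosen only to illustrate how growth compounds:

```python
# Why small changes in transmissibility matter: cumulative infections
# grow roughly geometrically with the reproduction number R.
# R values and seed are hypothetical, for illustration only.
def total_infections(r, generations, seed=100):
    total, cases = 0.0, float(seed)
    for _ in range(generations):
        total += cases
        cases *= r  # each generation infects R times the previous one
    return total

base = total_infections(1.1, 10)
variant = total_infections(1.5, 10)  # a more transmissible variant
```

After just ten generations, the more transmissible scenario produces several times as many cumulative infections, even though the per-contact change looks modest.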

Eaddy: Modelers are in the business of making forecasts and sometimes things don’t work out as you expect; what are some of the things that surprised you about the COVID-19 pandemic? 
O'Day: A very positive surprise was the quick vaccine development timeline, the virtual across-the-board success of the vaccines, and their high rates of effectiveness, although the variants may pose a significant challenge to vaccines. However, the lack of progress on effective treatments has been a bit of a disappointing surprise. While there are definitely some positive treatment advances (eg, dexamethasone), at this point in the pandemic I would have expected more progress to have been made. But perhaps the biggest surprise was the politicization of social distancing, mask usage, and vaccination, and how much that has undermined efforts to control the pandemic, making things much worse in the US. Usually, in natural disasters, people come together and put politics aside; but I think the length of the pandemic, the impact on the economy, and a highly charged political environment led to this unfortunate state of affairs. Related to this, I was also surprised by the poor quality of the early pandemic response in the US. Coordination and planning were lacking, and testing was a complete disaster at first and is still not being fully and appropriately utilized. Finally, while we noted the possibility of COVID mutations in our white paper, I was not expecting the development of multiple significant variants that we have seen, as the consensus view early in the pandemic seemed to be that coronaviruses do not mutate very quickly.

Eaddy: Conversely, what are some of the things that worked out as you expected? 
O'Day: I’m not surprised by the duration and extent of the pandemic—we figured that this was going to be a long-term challenge that would change life as we know it for years, even though in Spring 2020 some were saying this would be over by the summer or the end of the year. I’m not surprised by the pandemic fatigue that has made it challenging to maintain social distancing measures—we all want to go back to life as normal. We repeatedly emphasized in our white paper that social distancing was the principal lever available to policymakers and I think the way the pandemic has progressed has demonstrated this to be mostly correct as government policies and human behavior have been the primary determinants of the severity of the pandemic. The surges and waves that we’ve witnessed across various geographic regions can be largely explained by differences in policies and behavior—mask wearing, travel reductions, isolation and quarantine, limiting indoor gatherings, etc. But there is definitely a lot we still have to learn about this disease and the effectiveness of different interventions before we can finally bring the pandemic under control. Accurate modeling can hopefully continue to play an important role here.



The article should be referenced as follows: 

Predicting the pandemic: COVID-19 modeling rights and wrongs. HTA Quarterly. Summer 2021.



  • Cramer EY, et al. Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the US.
  • Thompson MG, Burgess JL, Naleway AL, et al. Interim estimates of vaccine effectiveness of BNT162b2 and mRNA-1273 COVID-19 vaccines in preventing SARS-CoV-2 infection among health care personnel, first responders, and other essential and frontline workers — eight U.S. locations, December 2020–March 2021. MMWR Morb Mortal Wkly Rep. 2021;70:495-500.