We account for under-reported figures and show how much larger the first wave was than the second

In March and April, the Government of Ireland wasn’t able to put widespread testing for COVID-19 in place. We didn’t hit 100,000 tests per week until mid-April, and it was September before we reached one million tests per week. We can assume that this led to under-reporting of cases, and that some people who had a COVID-19 infection (particularly those who were asymptomatic) were not diagnosed or reported.

But how many cases of COVID-19 were not detected in the early months of the pandemic? In this article we use methods developed by the London School of Hygiene & Tropical Medicine to estimate the real size of the initial wave of COVID-19.

You can read more about how we did it below, but this plot shows our best estimate of the number of cases from March up until early November:

The figure suggests that the first wave had far more cases than the second.

To infer the level of case under-reporting, we estimated what is known as the case under-reporting ratio. This is the ratio of reported symptomatic cases to the true number of symptomatic infected individuals, so a lower ratio means that more cases were missed. We calculated this from the confirmed number of deaths, the confirmed number of cases, and the case fatality rate of the virus (initially assumed here to be 1.4%, with a confidence interval of 1.2-1.5%).

We start with the estimated case fatality rate and the number of reported deaths. Now, if a person dies of COVID-19 on the 15th May, for example, we can guess that they contracted the virus about 2-3 weeks before that date. If there were 500 cases reported on the 1st May and the case fatality rate was 1%, we would expect around 5 deaths on the 15th May. If there were actually 20 deaths, it means we have underestimated the number of cases by a factor of 4: only a quarter of the true cases were reported, which gives an under-reporting ratio of 25%.
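
The arithmetic in this example can be written out as a short Python sketch. The function names are ours and purely illustrative, and the fixed two-week gap between a reported case and a death is the simplification used in the example above, not the full method.

```python
def expected_deaths(reported_cases: int, cfr: float) -> float:
    """Deaths we would expect if every case had been reported."""
    return reported_cases * cfr


def reporting_ratio(reported_cases: int, observed_deaths: int, cfr: float) -> float:
    """Share of true cases that were reported, capped at 1.0 (100%)."""
    if observed_deaths <= 0:
        raise ValueError("need at least one observed death")
    return min(1.0, expected_deaths(reported_cases, cfr) / observed_deaths)


# Numbers taken from the worked example in the text.
cases_on_1_may = 500
deaths_on_15_may = 20
cfr = 0.01  # case fatality rate of 1%

expected = expected_deaths(cases_on_1_may, cfr)
ratio = reporting_ratio(cases_on_1_may, deaths_on_15_may, cfr)
implied_true_cases = cases_on_1_may / ratio

print(f"expected deaths: {expected:.0f}")
print(f"share of cases reported: {ratio:.0%}")
print(f"implied true cases: {implied_true_cases:.0f}")
```

With these inputs the implied true number of cases is four times the reported figure, matching the factor of 4 above.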

In other words, if the observed number of deaths is much higher than the number of expected deaths then, assuming that the case fatality rate is a good approximation of the truth, we can infer that the reported number of cases is an underestimate.

Clever maths now takes over, and we use a statistical smoothing model known as a Gaussian Process to calculate the expected under-reporting ratio over time. This has the advantage of giving us an uncertainty estimate on the number of cases, as you can see in the blue uncertainty ribbon in the plots: the wider the ribbon, the larger the uncertainty. In Figure 2 we show the results for the estimated percentage of symptomatic cases actually being reported from mid-March to the start of November.
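
To give a feel for how a Gaussian Process turns noisy estimates into a smooth curve with an uncertainty ribbon, here is a minimal, dependency-free Python sketch. It is not the model used for the figures: it is plain Gaussian Process regression with a squared-exponential kernel, the weekly ratios are made-up illustrative numbers, and the kernel settings (`length`, `var`, `noise`) are arbitrary assumptions. A real analysis would likely model the ratio on a transformed scale so that it stays between 0 and 1.

```python
import math


def rbf(a, b, length, var):
    """Squared-exponential kernel: nearby time points are strongly correlated."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A (A must be positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i, row in enumerate(L):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x


def solve_upper(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_smooth(times, values, query, length=3.0, var=0.05, noise=0.01):
    """Return (posterior mean, posterior std) of the latent curve at each query time."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]  # centre the data around a constant mean
    K = [[rbf(a, b, length, var) + (noise if i == j else 0.0)
          for j, b in enumerate(times)] for i, a in enumerate(times)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, y))  # alpha = K^{-1} y
    out = []
    for q in query:
        k_star = [rbf(q, t, length, var) for t in times]
        mu = mean + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        variance = max(0.0, var - sum(x * x for x in v))
        out.append((mu, math.sqrt(variance)))
    return out


# Hypothetical weekly estimates of the share of cases reported (illustrative only).
weeks = [0, 1, 2, 3, 4, 5, 6, 7]
ratios = [0.10, 0.14, 0.12, 0.20, 0.27, 0.25, 0.33, 0.38]

band = gp_smooth(weeks, ratios, weeks + [20])
for t, (mu, sd) in zip(weeks + [20], band):
    print(f"week {t:2d}: {mu:.2f} +/- {2 * sd:.2f}")
```

The last query point (week 20) lies far from any data, so its ribbon is much wider than in the middle of the series, which is the behaviour described above.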