Skip to main content

Estimating the wave 1 and wave 2 infection fatality rates from SARS-CoV-2 in India



There has been much discussion and debate around the underreporting of COVID-19 infections and deaths in India. In this short report we first estimate the underreporting factor for infections from publicly available data released by the Indian Council of Medical Research on reported number of cases and national seroprevalence surveys. We then use a compartmental epidemiologic model to estimate the undetected number of infections and deaths, yielding estimates of the corresponding underreporting factors. We compare the serosurvey based ad hoc estimate of the infection fatality rate (IFR) with the model-based estimate. Since the first and second waves in India are intrinsically different in nature, we carry out this exercise in two periods: the first wave (April 1, 2020–January 31, 2021) and part of the second wave (February 1, 2021–May 15, 2021). The latest national seroprevalence estimate is from January 2021, and thus only relevant to our wave 1 calculations.


Both wave 1 and wave 2 estimates qualitatively show that there is a large degree of “covert infections” in India, with model-based estimated underreporting factor for infections as 11.11 (95% credible interval (CrI) 10.71–11.47) and for deaths as 3.56 (95% CrI 3.48–3.64) for wave 1. For wave 2, underreporting factor for infections escalate to 26.77 (95% CrI 24.26–28.81) and to 5.77 (95% CrI 5.34–6.15) for deaths. If we rely on only reported deaths, the IFR estimate is 0.13% for wave 1 and 0.03% for part of wave 2. Taking underreporting of deaths into account, the IFR estimate is 0.46% for wave 1 and 0.18% for wave 2 (till May 15). Combining waves 1 and 2, as of May 15, while India reported a total of nearly 25 million cases and 270 thousand deaths, the estimated number of infections and deaths stand at 491 million (36% of the population) and 1.21 million respectively, yielding an estimated (combined) infection fatality rate of 0.25%. There is considerable variation in these estimates across Indian states. Up to date seroprevalence studies and mortality data are needed to validate these model-based estimates.


Main text

In late August 2020, India was predicted to surpass the United States in terms of reported case counts from SARS-CoV-2 infections. To the surprise of many modelers the curve turned corner in late September with the highest number (97,894) of daily new cases reported on 16 September 2020 [1]. After a steady decline for nearly five months, the curve started rising again, growing into an astronomic second wave. The highest number (414,280) of daily new cases in wave 2 was reported on May 6, 2021. As of May 15, 2021, India has reported 24.7 million cases, the second highest in the world, and nearly 270 thousand deaths, the third highest in the world. In this brief report, we reconcile estimates of the infection fatality rate (IFR) inferred from seroprevalence studies with epidemiologic model-based estimates that account for underreporting of infections and deaths in India for wave 1. We then proceed to compute, compare and combine wave 1 with wave 2 IFR estimates.


Synthesizing evidence from seroprevalence studies

We review available seroprevalence results that vary across states and specifically across rural versus urban areas. Whereas in many major metros and slum areas the seroprevalences were reported to be more than 50%, in rural areas there is a wide variation (Table 1). The latest national serosurvey (from 17 December 2020 to 8 January 2021) reports 21.4% of all Indians above age 18 have antibodies present that indicate past SARS-CoV-2 infection [2]. Since approximately 59% [3] of India’s 1.36 billion citizens are above age 18 and 10.45 million infections were reported as of 8 January 2021, this points to approximately 172.47 million infections, with an implied underreporting factor of 16.5 (172.47/10.45). In other words, only 6% of India’s COVID-19 infections are reported, while 94% remained undetected or unreported. We use this estimated number of infections to calculate the IFR. Regional studies based on crematorium data and counting obituaries in India have suggested an underreporting factor in the range of 2 to 5 for COVID-deaths; this is at best ad hoc and anecdotal in nature and no rigorous quantification of missing death numbers is currently available [4].

Table 1 Summary of results from various serological surveys conducted in India during 2020–21

Model-based estimates

Using a compartmental epidemiologic model (as explained in the Supplementary Methods) with a compartment for unascertained cases and deaths after accounting for the false negative rates of RT-PCR and rapid antigen tests used in India [5] we estimate the national and state-level IFR in India by inferring underreporting factors for cases and deaths. We assume that the estimated total infections (deaths) are comprised of reported and unreported infections (deaths). The model divides the population into ten disjoint compartments: S (Susceptible), E (Exposed), T (Tested), U (Untested), P (Tested positive), F (Tested False Negative), RR (Reported Recovered), RU (Unreported Recovered), DR (Reported Deaths) and DU (Unreported Deaths), as described in Additional file 1: Figure S1. A set of nine differential equations govern the transmission dynamics, which are approximated by means of discrete recurrence relations. For any compartment \(X\), the instantaneous rate of change at time \(t\) (given by \(\frac{{dX}}{{dt}}\)) is approximated by the difference of counts in that specific compartment on the \(\left( {t + 1} \right)\) th day and the \(\left( t \right)\) th day, i.e., say \(X\left( {t + 1} \right) - X\left( t \right)\). Parameters are estimated using Bayesian techniques by generating samples from the posterior distribution using a Metropolis–Hastings algorithm with Gaussian proposal density, with 95% credible intervals (CrI) to quantify uncertainty of the estimates. Additional file 1: Table T4 presents an overview of the parameter descriptions and settings for this model.

Comparing and combining waves 1 and 2

Due to the stress on the healthcare and reporting infrastructure, the fatality and underreporting processes were very different across the two waves. Thus, we consider two separate phases of the pandemic, with wave 1 from April 1, 2020–January 31, 2021 and wave 2 starting on February 1, 2021. This definition is artificial and is guided by the fact that the national effective reproduction number (Reff) crossed unity for the first time in 2021 on February 14 and we allow a two-week incubation period before that date. Using daily time series of case, death and recovery counts we compare fatality rates and underreporting factors associated with the two time periods using the compartmental models. Further, using observed data from the two waves and the model-based underreporting factor estimates, we compute cumulative case and death counts for the total duration of waves 1 and 2. We multiply the wave-specific cumulative counts with relevant underreporting factors and sum over both waves to get combined counts of cases and deaths. The estimated numbers of cumulative deaths and infections provide us with a combined IFR estimate for India as of May 15.


IFR estimates for wave 1 using seroprevalence surveys

The observed case fatality rate (CFR) in India is low. With 154,428 deaths and 10.76 million cases reported as of January 31, 2021 the estimated CFR for wave 1 is 1.435% (95% confidence interval 1.428–1.442%) [1]. The estimated number of infections from the January seroprevalence survey imply an approximate infection fatality rate of 0.09% (i.e. 154,428/172.47 M). The anecdotal underreporting factor for deaths (in the range of 2–5) implies an ad hoc estimate of IFR in the range of 0.19–0.45%.

Estimates from epidemiological models

For wave 1 our estimate for the national IFR1 (observed cumulative deaths/estimated cumulative total infections) is 0.129% (95% CrI 0.125–0.134%) and IFR2 (estimated total cumulative deaths/estimated total cumulative infections) is 0.461% (95% CrI 0.455–0.468%) with an underreporting factor for cases estimated at 11.11 (95% CrI 10.71–11.47) and for deaths at 3.56 (95% CrI 3.48–3.64). These model-based estimates in wave 1 are largely consistent with the estimates from the latest and third nationwide seroprevalence study.

In wave 2, using the same model we see a stark contrast with wave 1, with case and death underreporting factor estimates escalate to 26.73 (95% CrI 24.26–28.81) and 5.77 (95% CrI 5.34–6.15) respectively, leading to IFR1 estimate of 0.032% (95% CrI 0.029–0.035%) and IFR2 estimate of 0.183% (95% CrI 0.18–0.186%). This pattern is consistent with wave 2 CFR being estimated at 0.845% (95% CrI 0.840–0.849%), 59% of wave 1 estimate.

Figure 1 shows underreporting factors and estimated infections and deaths in waves 1 and 2 for India while Fig. 2 highlights state-level variations in IFR1, IFR2, CFR for waves 1 and 2 for 20 states in India with large case/death counts.

Fig. 1
figure 1

Comparison of observed and estimated case and death counts and associated underreporting factors from waves 1, 2 and both waves combined

Fig. 2
figure 2

Forest plot of wave 1 and wave 2 infection fatality rates (IFR) and case fatality rates (CFR) associated with SARS-CoV-2 in various states in India. IFR1 is based on reported deaths whereas IFR2 estimates and includes the unreported deaths

Combining waves 1 and 2

The composite CFR as of May 15 stands at 1.1%. The estimate for total (reported + unreported) cumulative case count for waves 1 and 2 combined is 491.73 (95% CrI 453.03–524.56) million, while the estimated number of total (reported + unreported) deaths is 1216.35 (95% CrI 1154.21–1272.70) thousand. This leads to a combined IFR1 estimate of 0.06% and IFR2 estimate of 0.24%. Detailed numerical estimates of underreporting factors across states for waves 1 and 2 are presented in Additional file 1: Tables T1, T2 and T3 and Additional file 1: Figures S2 and S3.


Despite accounting for underreported deaths, the large number of asymptomatic/undetected infections (more than 90% by any calculation) indicate a lower IFR in India in comparison with other Western countries. A meta-analysis across the world places the pooled mean of IFRs at 0.68% (95% CI: 0.53–0.82%) [6], while another meta-analysis places the median at 0.27% [7] (with a range of 0–1.63%). Seroprevalence surveys and epidemiologic models qualitatively agree on the estimated IFR for India for wave 1. Up to date serosurvey and excess death/mortality data are needed to validate wave 2 and combined estimates. The estimated number of total infections as of May 15 suggests roughly 36% of Indians have an active or past infection, a number that will need to be verified with synchronous sero-surveys.

The current reduction in fatality rates in wave 2 that we notice could be primarily due to two reasons, one is that we do not have the same length of follow-up period and complete data on the decay phase of wave 2 curve. The second could be the different age composition of the infected populations in the two waves; it has been reported that the younger population got infected in larger numbers in wave 2 and they have lower risk of COVID-19 mortality. A fraction of the older population (aged 65 + years) also got vaccinated during wave 2. However, this hypothesis about reduced fatality rates in wave 2 cannot be verified without more granular, age-sex stratified nationwide time-series data on case and death counts, which is currently unavailable.


We do not have a rigorous way to validate the extent of underreporting of deaths. An excess death calculation based on historical mortality data is infeasible at this point due to absence of all-cause-mortality data in the last three years from India. India has a very young population with only 6.4% in age group 65 + (compared to the US where this proportion is 16.5%) so a comparison of overall IFR between India and say the US is not fair, and only age-specific IFRs should be calculated and compared when more data become available. We do recognize that wave 2 information is appreciably incomplete, and the estimates will change as we have more complete information on deaths. For example, while our wave 2 analysis period ended on May 15, the highest daily number of deaths (4529 daily new deaths) were reported shortly after on May 18. Thus, our analysis presents an updated but incomplete picture of wave 2.

Availability of data and materials

All data are publicly available at We used reported daily case, death and recovery counts for India and its states and union territories from April 1, 2020 to May 15, 2021. The statistical package SEIR-Fansy developed by the authors is available at



Case fatality rate


Confidence interval


Credible interval


Infection fatality rate


Real time reverse transcript polymerase chain reaction




Underreporting Factor


  1. COVID-19 Tracker Updates for India for State-wise and District-wise data. Published 2020. Accessed 21 May 2020

  2. Press Trust of India. Over 21% of India’s population may have had COVID-19, shows sero survey. Published February 4, 2021.

  3. Wikipedia. Demographics of India: structure of the population. Accessed 20 Feb 2021

  4. The Wire. COVID-19: Cremation, burial records suggest Delhi’s death toll is over twice the official figure. Published May 27, 2020. Accessed 20 Feb 2021

  5. Bhaduri R, Kundu R, Purkayastha S, Kleinsasser M, Beesley LJ, Mukherjee B. Extending the Susceptible-Exposed-Infected-Removed (SEIR) model to handle the high false negative rate and symptom-based administration of COVID-19 diagnostic tests: SEIR-fansy. Epidemiology. 2020.

    Article  Google Scholar 

  6. Meyerowitz-Katz G, Merone L. A systematic review and meta-analysis of published research data on COVID-19 infection fatality rates. Int J Infect Dis. 2020;101:138–48.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Ioannidis JPA. Infection fatality rate of COVID-19 inferred from seroprevalence data. Bull World Health Organ. 2021;99(1):19-33F.

    Article  PubMed  Google Scholar 

Download references


The authors are grateful for the computational resources available to them via the advanced research computing center at the University of Michigan.


The research was supported by an internal pilot grant at the University of Michigan, awarded by the Michigan Institute of Data Science (MIDAS).

Author information

Authors and Affiliations



SP created the initial draft, conducted the literature review and collaborated on the analysis. RK and RB created the R package SEIR-fansy and implemented the epidemiologic models. DB and MK collaborated on analysing data from the second wave. DR helped create and modify the final draft and address issues raised by the review panel. BM conceived the project, planned the analysis and wrote the draft of the paper. All authors read, reviewed, edited and approved the manuscript for submission.

Corresponding author

Correspondence to Bhramar Mukherjee.

Ethics declarations

Ethics approval and consent to participate

Not applicable. The analysis is based on publicly available completely de-identified aggregate counts. The study is exempt from IRB review as no patient participation or contact is involved.

Consent for publication

Not applicable.

Competing interests

There are no conflicts of interest perceived or declared by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

: Figure S1. Schematic diagram for the SEIR-fansy model with imperfect testing and misclassification. Figure S2. Estimated first wave underreporting factors for cases and deaths associated with SARS-CoV-2 for states in India. Figure S3. Estimated first wave underreporting factors for cases and deaths associated with SARS-CoV-2 for states in India. Table T1. Summary of the different metrics for the states and the nation for wave 1, on 31st January, 2021. Table T2. Summary of the different metrics for the states and the nation for wave 2, on 15th May, 2021. Table T3. Summary of the different metrics for the states and the nation for waves 1 and 2 combined, on 15th May, 2021. Table T4. Parameter values and descriptions for the SEIRfansy model.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Purkayastha, S., Kundu, R., Bhaduri, R. et al. Estimating the wave 1 and wave 2 infection fatality rates from SARS-CoV-2 in India. BMC Res Notes 14, 262 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: