On epidemic modeling in real time: An application to the 2009 Novel A (H1N1) influenza outbreak in Canada
© Hsieh et al; licensee BioMed Central Ltd. 2010
Received: 21 September 2010
Accepted: 5 November 2010
Published: 5 November 2010
Skip to main content
We’re sorry, something doesn't seem to be working properly.
Please try refreshing the page. If that doesn't work, please contact us so we can address the problem.
© Hsieh et al; licensee BioMed Central Ltd. 2010
Received: 21 September 2010
Accepted: 5 November 2010
Published: 5 November 2010
Management of emerging infectious diseases such as the 2009 influenza pandemic A (H1N1) poses great challenges for real-time mathematical modeling of disease transmission due to limited information on disease natural history and epidemiology, stochastic variation in the course of epidemics, and changing case definitions and surveillance practices.
The Richards model and its variants are used to fit the cumulative epidemic curve for laboratory-confirmed pandemic H1N1 (pH1N1) infections in Canada, made available by the Public Health Agency of Canada (PHAC). The model is used to obtain estimates for turning points in the initial outbreak, the basic reproductive number (R0), and for expected final outbreak size in the absence of interventions. Confirmed case data were used to construct a best-fit 2-phase model with three turning points. R0 was estimated to be 1.30 (95% CI 1.12-1.47) for the first phase (April 1 to May 4) and 1.35 (95% CI 1.16-1.54) for the second phase (May 4 to June 19). Hospitalization data were also used to fit a 1-phase model with R0 = 1.35 (1.20-1.49) and a single turning point of June 11.
Application of the Richards model to Canadian pH1N1 data shows that detection of turning points is affected by the quality of data available at the time of data usage. Using a Richards model, robust estimates of R0 were obtained approximately one month after the initial outbreak in the case of 2009 A (H1N1) in Canada.
Epidemics and outbreaks caused by emerging infectious diseases continue to challenge medical and public health authorities. Outbreak and epidemic control requires swift action, but real-time identification and characterization of epidemics remains difficult . Methods are needed to inform real-time decision making through rapid characterization of disease epidemiology, prediction of short-term disease trends, and evaluation of the projected impacts of different intervention measures. Real-time mathematical modeling and epidemiological analysis are important tools for such endeavors, but the limited public availability of information on outbreak epidemiology (particularly when the outbreak creates a crisis environment), and on the characteristics of any novel pathogen, present obstacles to the creation of reliable and credible models during a public health emergency. One needs to look no further than the 2003 SARS outbreak, or ongoing concerns related to highly pathogenic avian influenza (H5N1) or bioterrorism to be reminded of the need for and difficulty of real-time modeling.
The emergence of a novel pandemic strain of influenza A (H1N1) (pH1N1) in spring 2009 highlighted these difficulties. Early models of 2009 pH1N1 transmission were subject to substantial uncertainties regarding all aspects of this outbreak, resulting in uncertainty in judging the pandemic potential of the virus and the implementation of reactive public health responses in individual countries (Fraser et al. ). Multiple introductions of a novel virus into the community early in the outbreak could further distort disease epidemiology by creating fluctuations in incidence that are misattributed to the behavior of a single chain of transmission.
We sought to address three critical issues in real time disease modeling for newly emerged 2009 pH1N1: (i) to estimate the basic reproduction number; (ii) to identify the main turning points in the epidemic curve that distinguish different phases or waves of disease; and (iii) to predict the future course of events, including the final size of the outbreak in the absence of intervention. We make use of a simple mathematical model, namely the Richards model, to illustrate the usefulness of near real-time modeling in extracting valuable information regarding the outbreak directly from publicly available epidemic curves. We also provide caveats regarding inherent limitations to modeling with incomplete epidemiological data.
The accuracy of any modeling is highly dependent on the epidemiological characteristics of the outbreak considered, and most epidemic curves exhibit multiple turning points (peaks and valleys) during the early stage of an outbreak. While these may be due to stochastic ("random") variations in disease spread, and changes in either surveillance methods or case definitions, turning points may also represent time points where epidemics transition from exponential growth processes to processes that have declining rates of growth, and thus may identify effects of disease control programs, peaks of seasonal waves of infection, or natural slowing of growth due to infection of a critical fraction of susceptible individuals. For every epidemic, there is a suitable time point after which a given phase of an outbreak can be suitably modeled, and beyond which subsequent phases may be anticipated. Detection of such "turning points" and identification of different phases or waves of an outbreak is of critical importance in designing and evaluating different intervention strategies.
Here the prime "'" denotes the rate of change with respect to time. The model parameter K is the maximum case number (or final outbreak size) over a single phase of outbreak, r is the per capita growth rate of the infected population, and a is the exponent of deviation. The solution of the Richards model can be explicitly given in terms of model parameters as , and the parameter tm is related to the turning point ti of the epidemic (or the inflection point of the cumulative case curve) by the simple formula t m = t i + (lna)/n where ln denotes the natural logarithm. Using the Richard model, we are able to directly fit empirical data from a cumulative epidemic curve to obtain estimates of epidemiological meaningful parameters, including the growth rate r.
In such a model formulation, the basic reproduction number R0 is given by the formula R0 = exp(rT) where T is the disease generation time defined as the average time interval from infection of an individual to infection of his or her contacts. It has been shown mathematically  that, given the growth rate r, the equation R0 = exp(rT) provides the upper bound of the basic reproduction number regardless of the distribution of the generation interval used, assuming there is little pre-existing immunity to the pathogen under consideration. Additional technical details regarding the Richards model can be found in [5–7].
Unlike the better-known deterministic compartmental models used to describe disease transmission dynamics, the Richards model considers only the cumulative infected population size. This population size is assumed to have saturation in growth as the outbreak progresses, and this saturation can be caused by immunity, by implementation of control measures or other factors such as environmental or social changes (e.g., children departing from schools for summer holiday). The basic premise of the Richards model is that the incidence curve of a single phase of a given epidemic consists of a single peak of high incidence, resulting in an S-shaped cumulative epidemic curve with a single turning point for the outbreak. The turning point or inflection point, defined as the time when the rate of case accumulation changes from increasing to decreasing (or vice versa) can be easily pinpointed as the point where the rate of change transitions from positive to negative; i.e., the moment at which the trajectory begins to decline. This time point has obvious epidemiologic importance, indicating either the beginning of a new epidemic phase or the peak of the current epidemic phase.
For epidemics with two or more phases, a variation of the S-shaped Richards model has been proposed . This multi-staged Richards model distinguishes between two types of turning points: the initial S curve which signifies the first turning point that ends initial exponential growth; and a second type of turning point in the epidemic curve where the growth rate of the number of cumulative cases begins to increase again, signifying the beginning of the next epidemic phase. This variant of Richards model provides a systematic method of determining whether an outbreak is single- or multi-phase in nature, and can be used to distinguish true turning points from peaks and valleys resulting from random variability in case counts. More details on application of the multi-staged Richards model to SARS can be found in [6, 7]. Readers are also referred to [8, 9] for its applications to dengue.
We fit both the single- and multi-phase Richards models to Canadian cumulative 2009 pH1N1 cumulative case data, using publicly available disease onset dates obtained from the Public Health Agency of Canada (PHAC) website [10, 11]. PHAC data represent a central repository for influenza case reports provided by each of Canada's provinces and territories. Onset dates represent best local estimates, and may be obtained differently in different jurisdictions. For example, the province of Ontario, which comprises approximately 1/3 of the population of Canada, and where most spring influenza activity was concentrated, replaces onset dates using a hierarchical schema, whereby missing onset dates may be replaced with dates of specimen collection (if known) or date of specimen receipt by the provincial laboratory system, if both dates of onset and specimen collection are missing.
Data were accessed at different time points during the course of the "spring wave (or herald wave)" of the epidemic in May-July of 2009, whenever a new dataset is made available online by the PHAC. By sequentially considering successive S-shaped segments of the epidemic curve, we estimate the maximum case number (K) and locate turning points, thus generating estimates for cumulative case numbers during each phase of the outbreak. The PHAC cumulative case data is then fitted to the cumulative case function C(t) in the Richards model with the initial time t 0 = 0 being the date when the first laboratory confirmed case was reported and the initial case number C 0 = C(0) = 1, (the case number with onset of symptoms on that day). There were some differences between sequential epidemic curves in assigned case dates. For example, data posted by PHAC on May 20 indicated an initial case date of April 13, but in the June 3 data this had been changed to April 12, perhaps due to revision of the case date as a result of additional information.
Model parameter estimates based on the explicit solution given earlier can be obtained easily and efficiently using any standard software with a least-squares approximation tool, such as SAS or Matlab.
Daily incidence data by onset date were posted by PHAC until June 26, after which date only the daily number of laboratory-confirmed hospitalized cases in Canada was posted. For the purpose of comparison, we also fit the hospitalization data to the Richards model in order to evaluate temporal changes in the number of severe (hospitalized) cases, which are assumed to be approximately proportional to the total cases number. The case and hospitalization data used in this work are provided online as Additional file 1.
Estimation results for Richards model parameters for various time periods of Canadian daily laboratory-confirmed pandemic H1N1 virus data by onset date
Time period (date posted)
Turning point ti(95% CI)
Growth rate r (95% CI)
Turning point ti
Comparison of Akaike information criterion (AIC) values between 1-phase and 2-phase models for time periods with 2-phase model fit in Table 1
Time period (date posted)
As noted above, model can be used to estimate turning points (ti) and basic reproductive numbers (R0.), if the generation time T is know. We used T = 1.91 days (95% CI: 1.30-2.71), as obtained in  by fitting an age stratified mathematical model to the first recognized 2009 influenza A (H1N1) outbreak in La Gloria, Mexico. Estimates are presented in Table 1. We also conducted sensitivity analyses with R0# calculated based on longer generation times (T = 3.6 (2.9, 4.3)) for seasonal influenza in  (see last column in Table 1). Excluding implausibly high estimates of R0 generated using initial outbreak data (April 13 to May 15), we obtain the estimates of R0 for the 2-phase model that range between 1.31 and 1.96.
Comparison of estimation results of the model parameters for the 1-phase Richards model using daily laboratory-confirmed pandemic H1N1 virus hospitalization data
Time period of data (date posted)
Estimated turning point ti
1.35 (1.20, 1.49)
1.75 (1.55, 1.95)
We used the Richards model, which permits estimation of key epidemiological parameters based on cumulative case counts, to study the initial wave of 2009 influenza A (H1N1) cases in Canada. In most model fits, April 28-29 and May 4-7 were identified as early turning points for the outbreak, with a third and final turning point around June 3-5 in models based on longer time series. Although this modeling approach was not able to detect turning points using some earlier data sets (e.g., those limited to the period from April 12 to May 27), in general the turning points identified were consistent across multiple models and time series. Perhaps the most important divergence between models occurred with the detection of an April 29 turning point in the case report time series, but not in the time series based on hospitalized cases. We believe this may be attributable to the small number of hospitalizations, relative to cases, that had occurred by that date, as well as the fact that hospitalization data only became available on April 18.
The turning point can correspond to the point at which disease control activities take effect (such that the rate of change in epidemic growth begins to decline) or can represent the point at which an epidemic begins to wane naturally (for example, due to seasonal shifts or due to the epidemic having "exhausted" the supply of susceptibles such that the reproductive number of the epidemic declines below 1). This quantity has direct policy relevance; for example, in the autumn 2009 pH1N1 wave in Canada, vaccination for pH1N1 was initiated at or after the turning point of the autumn wave due to the time taken to produce vaccine; as the epidemic was in natural decline at that point, the impact of vaccination has subsequently been called into question.
In addition to identifying turning points, the Richards model is useful for estimation of the basic reproductive number (R0) for an epidemic process, and our estimates derived using a Richards model were consistent with estimates derived using other methods. For example, our R0 agrees almost perfectly with that of Tuite et al., derived using a Markov chain Monte Carlo simulation parameterized with individual-level data from Ontario's public health surveillance system . Our estimates of R0 is smaller than that derived by Fraser et al.  using Mexican data, but such differences could relate in part to the different age distributions of these two countries , and may also reflect the fact that our estimate is obtained Canadian data at a national level, while empirical Mexican estimates were based on data from the town of La Gloria with only 1575 residents.
Most epidemic curves in the early stage of a novel disease outbreak have multiple phases or waves due to simple stochastic ("random") variation, mechanisms of disease importing, initial transmission networks and individual/community behavior changes, improvements in the performance of surveillance systems, or changes in case definitions as the outbreak response evolves. However, changes in phase (signified by the presence of turning points identified using the Richards model) may also pinpoint the timing of important changes in disease dynamics, such as effective control of the epidemic via vaccination or other control measures, depletion of disease-susceptible individuals (such that the effective reproductive number for the disease decreases to < 1), or the peak of a "seasonal" wave of infection, as occurs with both seasonal and pandemic influenza. One of the advantages of the Richards model for real-time modeling is that it offers a simple means of fitting a model to cumulative case counts, which smoothes out stochastic variations in the epidemic curve.
While there are numerous published methodologies in addition to the Richards model that can be used to estimate R0[4, 18, 19], some competing methods require more extensive and detailed data than are required to build a Richards model, which requires only cumulative case data from an epidemic curve. As we also demonstrate here, the Richards model produces fairly stable and credible estimates of reproductive numbers early in the outbreak, allowing these estimates to inform evolving disease control strategies. For example, the first estimates in Table 1, derived using early case data accessed on May 20, closely approximate our final estimates (Table 1, last row). Thus, while early estimation with the Richards model failed to correctly detect turning points or accurately estimate the final outbreak size, it was nonetheless useful for rapid estimation of R0 within a month of first case occurrence in Canada.
As with any mathematical modeling technique, the approach presented here is subject to limitations, which include data quality associated with real-time modeling (as data are often subject to ongoing cleaning, correction, and reclassification of onset dates as further data become available), reporting delays, and problems related to missing data (which may be non-random). In our current study, the hierarchical approach used by Canada's most populous province (Ontario) for replacement of missing data could have had distorting effects on measured disease epidemiology: the replacement of missing onset dates with dates of specimen collection could have resulted in the artifactual appearance of early turning points identified by our model, due to limitations in weekend staffing early in the outbreak. If, as we believe to be the case, public health laboratories did not have sufficient emergency staffing to keep up with testing on weekends such that weekend specimen log-ins declined sharply, this would have created the appearance of epidemic "fade out" on weekends. Other factors that might distort the apparent epidemiology of disease include changes in guidelines for laboratory testing of suspected cases, improved surveillance and public health alerts at later stages of the outbreak leading to increased case ascertainment or over-reporting of cases . However, the quality of the time series will tend to improve with the duration of the epidemic, both because stochastic variation is "smoothed out", and also because small variations become less important as the cumulative series becomes longer. We note that a further application of the Richards model in the context of influenza would relate to comparison of the epidemiology of the 2009 influenza A H1N1 epidemic to past Canadian epidemics, though such an endeavor is beyond the scope of the present study.
In summary, we believe that the Richards model provides an important tool for rapid epidemic modeling in the face of a public health crisis. However, predictions based on the Richards model (and all other mathematical models) should be interpreted with caution early in an epidemic, when one need to balance urgency with sound modeling. At their worst, hasty predictions are not only unhelpful, but can mislead public health officials, adversely influence public sentiments and responses, undermine the perceived credibility of future (more accurate) models, and become a hindrance to intervention and control efforts in general.
YHH is supported by grants (NSC 97-2118-M-039-004, 97-2314-B-039 -013 -MY3) and (CMU 97 323), and by travel funding provided by MITACS. DF is supported by MITACS, CIHR, and the Ontario Early Researcher Award program; JWU is supported by CRC, CIHR, GEOIDE, NSERC and MITACS. The authors are grateful to the reviewers for their insightful comments and suggestions which have significantly improved this manuscript.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.