Validation of anaemia, haemorrhage and blood disorder reporting in hospital data in New South Wales, Australia
BMC Research Notes volume 14, Article number: 167 (2021)
Hospital data are a useful resource for studying pregnancy complications, including bleeding-related conditions; however, the reliability of these data is unclear. This study aims to examine the reliability of reporting of bleeding-related conditions, including anaemia, obstetric haemorrhage and blood disorders, and procedures, such as blood transfusion and hysterectomy, in coded hospital records compared with obstetric data from two large tertiary hospitals in New South Wales.
There were 36,051 births between 2011 and 2015 included in the analysis. Anaemia and blood disorders were poorly reported in the hospital data, with sensitivity ranging from 2.5% to 24.8% (positive predictive value (PPV) 12.0–82.6%). Reporting of postpartum haemorrhage, transfusion and hysterectomy showed high sensitivity (82.8–96.0%, PPV 78.0–89.6%) while moderate consistency with the obstetric data was observed for other types of obstetric haemorrhage (sensitivity: 41.9–65.1%, PPV: 50.0–56.8%) and placental complications (sensitivity: 68.2–81.3%, PPV: 20.3–72.3%). Our findings suggest that hospital data may be a reliable source of information on postpartum haemorrhage, transfusion and hysterectomy. However, they highlight the need for caution for studies of anaemia and blood disorders, given high rates of uncoded and ‘false’ cases, and suggest that other sources of data should be sought where possible.
Hospital data are an efficient, cost-effective resource to study trends and outcomes of medical conditions and procedures [1,2,3,4,5], including during pregnancy. The range of procedures and conditions that can affect pregnant women and their babies is not always captured in obstetric data. Obstetric data are often recorded in a form that is difficult to analyse, such as free text, and are less commonly available on a population level than hospital data. Hospital data, comprising coded diagnoses and procedures following international standards, provide a useful alternative or supplement to obstetric data for the purposes of research [1,2,3,4,5,6]. However, the reliability of findings depends on the extent to which hospital data accurately identify patients with and without the condition or procedure. Discrepancies in hospital data can occur at various stages of the recording process, including initial documentation, coding, and data entry, and accuracy may change over time, influenced by changes in practice, guidelines, and focus on particular conditions.
Anaemia, obstetric haemorrhage, blood disorders such as coagulation and platelet disorders, and related procedures are associated with adverse maternal and neonatal outcomes [7,8,9,10]. Previously, nutritional anaemia, placental abruption [12, 13] and hysterectomy have been shown to have poor sensitivities, while haemolytic anaemia, transfusion, and coagulation disorders have been shown to have good sensitivities in hospital data. However, these studies were based on births in 2000 and 2002 [11, 13], and it is not known whether reporting has changed in the intervening years.
Validation studies comparing coded hospital data with medical charts can be used to assess the reliability of data sources, but are time consuming, expensive, and tend to review a small number of records and short time period. An alternative approach is to compare two independent databases [14,15,16]. Here, we compare reporting of bleeding-related conditions and procedures in pregnancy in coded hospital data extracted from the electronic medical record to obstetric data from the ObstetriX database, using ObstetriX as the reference standard.
The hospital data are coded following the International Classification of Diseases (ICD) and Australian Classification of Health Interventions by trained clinical coders, who use clinical documentation from an inpatient episode of care to assign the appropriate diagnosis and, where relevant, procedure code(s). Government policy and ICD coding standards mostly limit what is coded to conditions affecting the current admission, require substantiation by clear medical record documentation, and prohibit interpretation of results. Coded data are used to facilitate activity-based funding, healthcare management and planning, and also inform the population-level New South Wales Admitted Patient Data Collection.
ObstetriX is a clinical database specific to the pregnancy, birth and early postnatal period, collected by midwives, a subset of which forms the statewide New South Wales Perinatal Data Collection. This population-based data collection has shown high levels of accuracy for reporting of diagnoses and procedures during labour and delivery, and validation studies of similar Australian data collections such as the Victorian Perinatal Data Collection have shown high accuracy for most data items [19,20,21]. Given the different purposes and perspectives of the databases, ObstetriX may be considered an imperfect reference standard; however, most reference standards are not without error and uncertainty, particularly where self-report is relied upon [22,23,24], and large population datasets have been shown to be robust to the introduction of random errors and omissions.
The aim of this study is to determine the consistency of reporting of anaemia, haemorrhage and blood disorders during pregnancy, and related procedures including blood transfusion and hysterectomy, in hospital records compared with an obstetric database. We aimed to compare reporting between two large tertiary hospitals, and to determine whether patient or pregnancy characteristics affect reporting.
Women giving birth to singleton infants (≥ 24 weeks gestation) in two tertiary hospitals in the Sydney metropolitan area, New South Wales (NSW), Australia, between 2011 and 2015 were included. In NSW, all births in a hospital or birth centre are treated as inpatient admissions and assigned an electronic medical record and an obstetric record. Deliveries in hospitals and birth centres represent 99% of births in the state. Women who had prearranged to give birth in a different hospital to the hospital of birth were excluded, because antenatal data would have been collected at the hospital of booking and may be incomplete.
ObstetriX contains maternal health and demographic data, obstetric history and pregnancy details. Initial information is obtained on pregnancy and medical history at the face-to-face booking consultation with a midwife (an outpatient encounter, by 16 weeks gestation). The record is updated with labour, birth and postnatal information collected during the birth admission. Data are entered by midwives and recorded in checkboxes or drop-down menus, with a small amount of free text available. The majority of procedures and conditions are recorded as present, absent or unknown/missing. The midwives do not have access to the hospital codes, as coding is performed four to six weeks after discharge.
The coded hospital data contain diagnoses coded according to the International Classification of Diseases, Tenth Revision, Australian Modification (ICD-10-AM) with a small number coded using Systematized Nomenclature of Medicine—Clinical Terms (SNOMED CT) codes. Procedures are coded following the Australian Classification of Health Interventions, Eighth Edition. Coding is performed by trained clinical coders based on clinical documentation in the electronic medical record. Medical records for admissions throughout pregnancy and the birth were searched for the relevant diagnoses and procedures (codes provided in Additional file 1: Table S1).
Records were deterministically linked using patient Medical Record Numbers and checked using other personal identifiers, by personnel external to the project. Data were de-identified prior to analysis.
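As a rough illustration of this linkage step, the sketch below joins two record sets on an exact identifier match and checks a second identifier before accepting the link. The field names (`mrn`, `dob`, `codes`, `flags`), the helper functions and the sample records are all hypothetical; the linkage itself was carried out by external personnel and its details are not described here.

```python
def link_records(hospital, obstetric):
    """Deterministic linkage: exact match on MRN, confirmed by date of birth.

    Field names ("mrn", "dob") are illustrative assumptions, not the real schema.
    """
    obstetric_by_mrn = {rec["mrn"]: rec for rec in obstetric}
    linked, needs_review = [], []
    for hosp in hospital:
        obs = obstetric_by_mrn.get(hosp["mrn"])
        if obs is None:
            continue  # no obstetric record carries this MRN
        if hosp["dob"] == obs["dob"]:
            linked.append((hosp, obs))
        else:
            needs_review.append((hosp, obs))  # MRN agrees, check identifier does not
    return linked, needs_review


def deidentify(linked):
    """Drop identifiers and keep only a sequential study ID plus analysis fields."""
    return [
        {"study_id": i, "hospital_codes": hosp["codes"], "obstetric_flags": obs["flags"]}
        for i, (hosp, obs) in enumerate(linked, start=1)
    ]


# Hypothetical records, for illustration only.
hospital = [
    {"mrn": "A001", "dob": "1988-03-02", "codes": ["O72.1"]},
    {"mrn": "A002", "dob": "1990-07-15", "codes": []},
    {"mrn": "A003", "dob": "1985-01-20", "codes": ["D50.9"]},
]
obstetric = [
    {"mrn": "A001", "dob": "1988-03-02", "flags": {"pph": True}},
    {"mrn": "A002", "dob": "1991-07-15", "flags": {"pph": False}},
]

linked, needs_review = link_records(hospital, obstetric)
analysis_rows = deidentify(linked)
```

Pairs whose check identifier disagrees are set aside rather than discarded, which mirrors the idea of checking links against other personal identifiers before the data are de-identified.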
Reporting in hospital data was compared to that in obstetric data as the reference standard. Sensitivity, specificity, positive predictive values (PPV) and negative predictive values (NPV) are reported with exact confidence intervals. These measures were also calculated separately for the two hospitals. Analyses were performed in SAS 9.3.
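The four measures and their exact intervals can be sketched as follows. This is a minimal standard-library Python illustration, not the SAS code used for the analysis. The exact interval is assumed here to be the Clopper-Pearson interval, solved by bisection. The true-positive and false-negative counts (65 and 90) are the intrapartum haemorrhage figures quoted in the results; the false-positive count is a hypothetical placeholder, and the true-negative count is simply the remainder of the 36,051 births.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    if k > n // 2:
        # Sum the shorter tail: P(X <= k) = 1 - P(n - X <= n - k - 1).
        return 1.0 - binom_cdf(n - k - 1, n, 1.0 - p)
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        total += math.exp(log_choose + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes in n trials, found by bisection."""
    def solve(tail, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if (tail(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(lambda p: 1.0 - binom_cdf(x - 1, n, p), True)
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), False)
    return lower, upper


def validity(tp, fp, fn, tn):
    """Hospital data (index source) against obstetric data (reference standard)."""
    cells = {
        "sensitivity": (tp, tp + fn),  # reference cases also coded in hospital data
        "specificity": (tn, tn + fp),  # reference non-cases not coded in hospital data
        "ppv": (tp, tp + fp),          # hospital-coded cases confirmed by the reference
        "npv": (tn, tn + fn),          # hospital non-cases confirmed by the reference
    }
    return {name: (x / n, exact_ci(x, n)) for name, (x, n) in cells.items()}


# tp and fn come from the text; fp is HYPOTHETICAL; tn is the remainder of 36,051.
tp, fn, fp = 65, 90, 60
measures = validity(tp=tp, fp=fp, fn=fn, tn=36051 - tp - fn - fp)
```

With these inputs the sensitivity is 65/155, or 41.9%; the PPV and specificity depend on the placeholder false-positive count and should not be read as study results.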
Of the 38,343 women with singleton births of at least 24 weeks gestation between 2011 and 2015, 36,051 (94.0%) gave birth at their booking hospital and were included in the analysis.
Sensitivities ranged from very low (2.5%, thalassaemia) to high (96.0%, hysterectomy) (Table 1). Sensitivity and PPV for anaemia were poor (sensitivity: 4.4–16.4%, PPV: 12.0–38.1%). Sensitivity for obstetric haemorrhage ranged from 56.3 to 89.7%, with PPV ranging from 50.0 to 91.8%. For blood disorders, sensitivity ranged from 2.5 to 24.8% and PPV from 15.3 to 82.6%, and for placental complications sensitivity ranged from 68.2 to 81.3% and PPV from 20.3 to 72.3%. Specificities were high for all conditions and procedures, as were NPVs. The observed patterns were similar for the individual hospitals (Table 2).
Rates of diagnoses were similar across the two data sources, with the exception of anaemia (haemoglobin < 110 g/L) in pregnancy, B12 deficiency anaemia and coagulation disorders (Table 1). For anaemia, platelet disorders, coagulation disorders, and to a lesser extent placenta accreta, numbers were similar between the databases; however, there was a low level of overlap, with different women identified in each source.
Rates of diagnoses and procedures were similar between hospitals, with the exception of postpartum haemorrhage (PPH), antepartum haemorrhage (APH) and nutritional anaemia, which were higher at Hospital One (Additional file 1: Table 2).
Sensitivity, specificity, PPV and NPV for reporting of anaemia were similar for nulliparous and parous women (Table 3). Sensitivity and PPV for anaemia tended to increase by year of birth, and were higher for hospital medical, private obstetrician or general practitioner-focussed models of care compared to midwife-centred care. Sensitivity tended to increase, with decreases in specificity and NPV, where women received a blood transfusion compared to where they did not, except for intrapartum haemorrhage (IPH) and APH, where transfusion was associated with decreased sensitivity.
Potential misclassification was identified. Of the 58.1% of women whose IPH was not identified in the hospital data (90 of 155), 32.2% were reported with PPH (29 of 90), compared to 16.9% (11 of 65) among those whose IPH was reported in the hospital data. In comparison, among those whose PPH was not identified in the hospital data (745 of 4330), 3.0% were recorded with IPH (22 of 745), compared to 0.6% (20 of 3585) among those whose PPH was reported in the hospital data.
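The proportions in this paragraph follow directly from the quoted counts, and a short check makes the denominators explicit. The sketch below is illustrative only; the `pct` helper and the variable names are not part of the study's analysis code.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# Intrapartum haemorrhage (IPH) cases in the obstetric data, split by whether
# the hospital data also recorded them.
iph_missed, iph_found = 90, 65

share_of_iph_missed = pct(iph_missed, iph_missed + iph_found)
pph_among_missed_iph = pct(29, iph_missed)  # missed IPH cases reported with PPH
pph_among_found_iph = pct(11, iph_found)    # detected IPH cases reported with PPH

# Postpartum haemorrhage (PPH) cases missing from the hospital data that were
# recorded with IPH instead.
iph_among_missed_pph = pct(22, 745)

# How much more often PPH appears when IPH was missed than when it was detected.
relative_frequency = round(pph_among_missed_iph / pph_among_found_iph, 1)
```

On these counts, women whose IPH was missed were roughly twice as likely to be reported with PPH as women whose IPH was detected, which is the pattern interpreted as possible misclassification.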
All types of anaemia were under-ascertained in ICD-coded hospital data, with a high proportion of cases reported in the hospital data and not in obstetric data. Using combined categories for anaemia did not meaningfully improve sensitivity or PPV. Thalassaemia, platelet disorders, and coagulation disorders were similarly underreported. Reliability for APH and IPH was moderate, and using a broad category for any bleeding before birth improved sensitivity and PPV. Postpartum haemorrhage, transfusion, placental complications (including accreta, praevia, and abruption), and hysterectomy were reported with moderate to high consistency with the obstetric data. Compared with previous studies, we found better reporting of hysterectomy and PPH, but poorer reporting for APH, coagulation disorders, placenta praevia [12, 13], haemolytic anaemia, and placenta accreta. Accuracy was similar to previous studies for transfusion [13, 14], placental abruption [12, 13] and nutritional anaemia.
Consistency of reporting of anaemia improved with time and where antenatal care was provided by a hospital medical team, private obstetrician or general practitioner, compared to midwife-centred care. This may reflect the lack of an agreed definition of anaemia for pregnant women, amid discrepancies between the National Blood Authority and World Health Organization definitions. Having a transfusion affected reliability of reporting, with anaemia showing higher sensitivity for patients who received a blood transfusion; however, specificity was reduced and PPV remained low. Having a transfusion made little difference to sensitivity for PPH, although it was associated with a reduction in specificity, and transfusion was associated with reduced sensitivity for APH and IPH. Having a transfusion slightly increased sensitivity for thalassaemia, although it remained poorly reported among that group. While data on bleeding severity were not available, previous studies have shown that the likelihood that haemorrhage or transfusion are reported in hospital data increases with higher severity [13, 14].
The inconsistent reporting of anaemia and blood disorders reflects the difficulties in capturing chronic but minor conditions not affecting the current admission. Serious, persistent conditions that are more likely to affect the hospital admission, including diabetes and hypertension, have demonstrated better correlation between datasets. The differences in timing of data collection likely affected consistency, as issues may arise or resolve between early pregnancy and birth, particularly for fluctuating conditions such as iron-deficiency anaemia. Indeed, a recent study of the same study population and datasets as examined here found that 38% of women recorded with low haemoglobin (< 110 g/L) in the first 20 weeks of pregnancy had their haemoglobin level restored (110 g/L or higher) after 20 weeks gestation, with 38% not restored and 24% not recorded, likely reflecting good antenatal care and the recent emphasis on treating anaemia as a pillar of patient blood management. However, the gap in anaemia prevalence, 3.9% based on clinical diagnosis in the hospital data compared with 19.4% according to pathology results, highlights discrepancies between the language clinicians use in the medical notes and the language required for clinical coding: a result of Hb = 69 g/L cannot be interpreted as anaemia by a coder if the words ‘anaemia’ or ‘low Hb’ are not written in the notes. Additionally, a code may not be assigned for a queried diagnosis.
Regarding obstetric haemorrhage, bleeding may occur before labour (APH), during labour but before the birth (IPH) or following birth (PPH), and should be recorded as such. In practice, however, conflation of IPH and PPH is not uncommon, and it is the total blood loss rather than the exact timing of bleeding that guides the clinical response. Progressive loss during labour may not be separated into discrete time periods in the documentation or data entry, particularly when IPH occurs close to the time of birth. Hence, a composite measure may be more useful.
There was a lack of overlap in patients recorded with anaemia, platelet disorders, coagulation disorders, and placenta accreta in the two databases. For anaemia, this may be related to the timing of data collection, as discussed above, which may contribute to the high proportion of “false” cases in the hospital data. While one of the strengths of ObstetriX is that it uses systematic pre-specified fields, making omissions less likely than for hospital data, it is likely that for some conditions, obstetric data are incomplete. For transfusion, hospital data are likely more complete than obstetric data given that blood must be dispensed and recorded in patient notes. Previous studies have shown high reliability for transfusion in hospital data compared to blood pack information from transfusion laboratories. For placenta accreta, hospital data are likely more accurate, given that coding considers placental histopathology and procedures such as manual removal of placenta. Obstetric data are largely self-reported, which may contain inaccuracies. Further, data are entered into ObstetriX by midwives, with accuracy sometimes sacrificed when personnel are busy providing clinical care, and the person entering the data was generally not present for the entire episode of care. In light of our findings, we suggest that hospital data may be a more appropriate source for identifying transfusion than obstetric data, while for the remaining conditions, using both data sources to identify cases, possibly corrected with capture-recapture models, is advisable where possible for improving ascertainment. This supports recommendations made elsewhere [18, 31].
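As one illustration of the capture-recapture idea mentioned above, the sketch below applies Chapman's two-source estimator to the number of cases found in each database and in both. It is not the method used in this study, and the counts are hypothetical. The estimator assumes that the two sources identify cases independently and that linkage is complete, assumptions that may not hold for hospital and obstetric records of the same admission.

```python
def chapman_estimate(n_hospital, n_obstetric, n_both):
    """Chapman's two-source capture-recapture estimate of the total number of cases."""
    return (n_hospital + 1) * (n_obstetric + 1) / (n_both + 1) - 1


def ascertainment(n_hospital, n_obstetric, n_both):
    """Share of the estimated total captured by each source and by either source."""
    total = chapman_estimate(n_hospital, n_obstetric, n_both)
    in_either = n_hospital + n_obstetric - n_both  # union of the two sources
    return {
        "estimated_total": total,
        "hospital": n_hospital / total,
        "obstetric": n_obstetric / total,
        "either_source": in_either / total,
    }


# HYPOTHETICAL counts: 100 cases in hospital data, 80 in obstetric data, 40 in both.
example = ascertainment(n_hospital=100, n_obstetric=80, n_both=40)
```

When overlap is low relative to the size of each source, as in this example, the estimated total is well above the count in either source, and even the union of the two sources captures only part of it.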
Slight differences in the rates of diagnoses between hospitals likely relate to the different demographic compositions, with Hospital Two tending to have a younger obstetric population with a different ethnic and comorbidity mix. However, the overall consistency between the two hospitals suggests that the results hold across different socioeconomic and ethnic patient populations, locations, and facilities.
We found poor reporting of anaemia and blood disorders. Ante- and intrapartum haemorrhage and placental complications were moderately well reported, while hysterectomy, transfusion and PPH showed high consistency between datasets. Procedures were better reported than conditions, and conditions occurring around the time of birth, such as PPH, were better reported than pre-existing conditions, such as blood disorders. Caution should be exercised in the use of hospital data for studies of anaemia and blood disorders, given both the high rates of missed cases and cases unconfirmed in obstetric data, and other sources of data should be sought where possible. These findings are likely to be reasonably generalisable, given that ICD-10 is widely used [31, 32], and previous studies have shown similar accuracy between hospital data from NSW and elsewhere.
Obstetric data are an imperfect reference standard. The two data sources are collected at different times, which may have affected the results, as issues may arise or resolve between early pregnancy and birth. ObstetriX data are largely self-reported, which may be inaccurate, and in some cases may be less complete than hospital data. The implications of these limitations are discussed in relation to the findings above. Data were available from two hospitals only.
Availability of data and materials
The data that support the findings of this study are available from Northern Sydney Local Health District and Western Sydney Local Health District, but privacy and confidentiality restrictions apply to the availability of these data, which were provided following ethical approval for the current study, but are not publicly available. Data are however directly available from the Northern Sydney and Western Sydney Local Health Districts upon request and ethical approval.
ICD: International Classification of Diseases
PPV: Positive predictive value
NPV: Negative predictive value
NSW: New South Wales
Andersen TF, Madsen M, Jørgensen J, Mellemkjoer L, Olsen JH. The Danish National Hospital Register. A valuable source of data for modern health sciences. Dan Med Bull. 1999;46(3):263–8.
Cheng H-T, Wang Y-C, Lo H-C, Su L-T, Lin C-H, Sung F-C, Hsieh C-H. Trauma during pregnancy: a population-based analysis of maternal outcome. World J Surg. 2012;36(12):2767–75.
Shaheen AAM, Myers RP. The outcomes of pregnancy in patients with cirrhosis: a population-based study. Liver Int. 2010;30(2):275–83.
Nair M, Kurinczuk JJ, Knight M. Establishing a national maternal morbidity outcome indicator in England: a population-based study using routine hospital data. PLoS ONE. 2016;11(4):e0153370.
Lee Y, Roberts CL, Dobbins T, Stavrou E, Black K, Morris J, Young J. Incidence and outcomes of pregnancy-associated cancer in Australia, 1994–2008: a population-based linkage study. BJOG. 2012;119(13):1572–82.
Mehrabadi A, Liu S, Bartholomew S, Hutcheon JA, Magee LA, Kramer MS, Liston RM, Joseph KS. Hypertensive disorders of pregnancy and the recent increase in obstetric acute renal failure in Canada: population based retrospective cohort study. BMJ. 2014;349:g4731.
Benedetto C, Marozio L, Tavella AM, Salton L, Grivon S, Di Giampaolo F. Coagulation disorders in pregnancy: acquired and inherited thrombophilias. Ann N Y Acad Sci. 2010;1205(1):106–17.
Leung TY, Lao TT. Thalassaemia in pregnancy. Best Pract Res Clin Obstet Gynaecol. 2012;26(1):37–51.
Stavrou E, McCrae KR. Immune thrombocytopenia in pregnancy. Hem Oncol Clin. 2009;23(6):1299–316.
Haider BA, Olofin I, Wang M, Spiegelman D, Ezzati M, Fawzi WW. Anaemia, prenatal iron use, and risk of adverse pregnancy outcomes: systematic review and meta-analysis. BMJ. 2013;346:f3443.
Hadfield RM, Lain SJ, Cameron CA, Bell JC, Morris JM, Roberts CL. The prevalence of maternal medical conditions during pregnancy and a validation of their reporting in hospital discharge data. Aust N Z J Obstet Gynaecol. 2008;48(1):78–82.
Taylor LK, Travis S, Pym M, Olive E, Henderson-Smart DJ. How useful are hospital morbidity data for monitoring conditions occurring in the perinatal period? Aust N Z J Obstet Gynaecol. 2005;45(1):36–41.
Lain SJ, Roberts CL, Hadfield RM, Bell JC, Morris JM. How accurate is the reporting of obstetric haemorrhage in hospital discharge data? A validation study. Aust N Z J Obstet Gynaecol. 2008;48(5):481–4.
Patterson JA, Francis S, Ford JB. Assessing the accuracy of reporting of maternal red blood cell transfusion at birth reported in routinely collected hospital data. Matern Child Health J. 2016;20(9):1878–85.
Patterson JA, Roberts CL, Taylor LK, Ford JB. Reporting postpartum haemorrhage with transfusion: a comparison of NSW birth and hospital data. NSW Pub Health Bull. 2014;24(4):153–8.
Quantin C, Benzenine E, Ferdynus C, Sediki M, Auverlot B, Abrahamowicz M, Morel P, Gouyon JB, Sagot P. Advantages and limitations of using national administrative data on obstetric blood transfusions to estimate the frequency of obstetric hemorrhages. J Pub Health. 2013;35(1):147–56.
Australian Consortium for Classification Development. Australian Coding Standards for the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, Australian Modification, Tenth Edition and the Australian Classification of Health Interventions. Australia: Independent Hospital Pricing Authority; 2017.
Roberts CL, Bell JC, Ford JB, Morris JM. Monitoring the quality of maternity care: how well are labour and delivery events reported in population health data? Paediatr Perinat Epi. 2009;23(2):144–52.
Flood M, Pollock W, McDonald SJ, Davey MA. Accuracy of postpartum haemorrhage data in the 2011 Victorian Perinatal Data Collection: results of a validation study. Aust N Z J Obstet Gynaecol. 2018;58(2):210–6.
Flood MM, McDonald SJ, Pollock WE, Davey MA. Data accuracy in the Victorian Perinatal Data Collection: results of a validation study of 2011 data. Health Inf Manag J. 2017;46(3):113–26.
Davey MA, Sloan ML, Palma S, Riley M, King J. Methodological processes in validating and analysing the quality of population-based data: a case study using the Victorian Perinatal Data Collection. Health Inf Manag J. 2013;42(3):12–9.
Wacholder S, Armstrong B, Hartge P. Validation studies using an alloyed gold standard. Am J Epidemiol. 1993;137(11):1251–8.
Reitsma JB, Rutjes AW, Khan KS, Coomarasamy A, Bossuyt PM. A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. J Clin Epi. 2009;62(8):797–806.
Dietz P, Bombard J, Mulready-Ward C, Gauthier J, Sackoff J, Brozicevic P, Gambatese M, Nyland-Funke M, England L, Harrison L, Taylor A. Validation of self-reported maternal and infant health indicators in the Pregnancy Risk Assessment Monitoring System. Matern Child Health J. 2014;18(10):2489–98.
Fottrell E, Byass P, Berhane Y. Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates. BMC Med Res Methodol. 2008;8(1):13.
Centre for Epidemiology and Evidence. New South Wales Mothers and Babies 2018. Sydney: NSW Ministry of Health; 2019.
National Blood Authority. Patient Blood Management Guidelines: Module 5‐Obstetrics and Maternity. 2015.
Baldwin H, Nippita T, Rickard K, Torvaldsen S, McGee T, Patterson J. Reporting of gestational diabetes and other maternal medical conditions: Validation of routinely collected hospital data from New South Wales, Australia. Int J Pop Data Sci. 2021;6(1):1–11.
Randall DA, Patterson JA, Gallimore F, Morris JM, McGee TM, Ford JB, Obstetric Transfusion Steering Group. The association between haemoglobin levels in the first 20 weeks of pregnancy and pregnancy outcomes. PLoS ONE. 2019;14(11):e0225123.
Chao A, Tsay P, Lin SH, Shau WY, Chao DY. The applications of capture-recapture models to epidemiological data. Stat Med. 2001;20(20):3123–57.
Lain SJ, Hadfield RM, Raynes-Greenow CH, Ford JB, Mealing NM, Algert CS, Roberts CL. Quality of data in perinatal population health databases: a systematic review. Med Care. 2012;50(4):e7–20.
Kim JY, Beckwith BA. The coming wave of change: ICD-10. J Pathol Inform. 2010;1:28.
Hadfield RM, Lain SJ, Cameron CA, Bell JC, Morris JM, Roberts CL. The prevalence of maternal medical conditions during pregnancy and a validation of their reporting in hospital discharge data. Aust NZ J Obstet Gyn. 2008;48(1):78–82.
The authors thank the Northern Sydney Local Health District and Western Sydney Local Health District and the hospitals for providing access to the data. We thank the personnel who linked and checked the records. This work was supported by the Sydney Medical School Kick Start Grant Program, and the Prevention Research Support Program, funded by the New South Wales Ministry of Health.
This work was funded by the Sydney Medical School Kick Start Grant Program and the Prevention Research Support Program, funded by the New South Wales Ministry of Health.
Ethics approval and consent to participate
This study received ethics approval from the Northern Sydney Local Health District Human Research Ethics Committee (LNR/17/HAWKE32).
Consent for publication
The authors declare that they have no competing interests.
Additional file 1: Table 1. International Classification of Diseases, Tenth Revision, Australian Modification (ICD-10-AM) diagnostic and Australian Classification of Health Interventions, Eighth Edition (ACHI) procedure codes for anaemia and bleeding disorders used in this study. Table 2. Rates of diagnoses and procedures identified by hospital and data source, New South Wales, 2011–2015.
Cite this article
Baldwin, H.J., Nippita, T.A., Torvaldsen, S. et al. Validation of anaemia, haemorrhage and blood disorder reporting in hospital data in New South Wales, Australia. BMC Res Notes 14, 167 (2021). https://doi.org/10.1186/s13104-021-05584-x
- Blood disorders
- Platelet disorders
- Coagulation disorders