Intracluster correlation coefficients for sample size calculations related to cardiovascular disease prevention and management in primary care practices

Background: Few studies have comprehensively reported intracluster correlation coefficient (ICC) estimates for outcomes collected in primary care settings. Using data from a large primary care study, we aimed to: a) report ICCs for process-of-care and clinical outcome measures related to cardiovascular disease management and prevention, and b) investigate the impact of practice structure and rurality on ICC estimates.
Methods: We used baseline data from the Improved Delivery of Cardiovascular Care (IDOCC) trial to estimate ICC values. Data on 5,140 patients from 84 primary care practices across Eastern Ontario, Canada were collected through chart abstraction. ICC estimates were calculated using an ANOVA approach, for all patients and separately for patient subgroups defined by condition (i.e., coronary artery disease, diabetes, chronic kidney disease, hypertension, dyslipidemia, and smoking). We compared ICC estimates between practices in which data were collected from a single physician versus those that had multiple participating physicians, and between urban versus rural practices.
Results: ICC estimates ranged from 0 to 0.173, with a median of 0.056. The median ICC estimate for dichotomous process outcomes (0.088) was higher than that for continuous clinical outcomes (0.035). ICC estimates calculated for single physician practices were higher than those for practices with multiple physicians for both process (average 3.9 times higher) and clinical measures (average 1.9 times higher). Urban practices tended to have higher process-of-care ICC estimates than rural practices, particularly for measuring lipid profiles and estimated glomerular filtration rates.
Conclusion: To our knowledge, this is the most comprehensive summary of cardiovascular-related ICCs to be reported from Canadian primary care practices. Differences in ICC estimates based on practice structure and location highlight the importance of understanding the context in which external ICC estimates were determined prior to their use in sample size calculations. Failure to choose appropriate ICC estimates can have substantial implications for the design of a cluster randomized trial.


Background
Cluster randomized trials are increasingly being used in primary health care research [1]. In a cluster randomized trial, groups of individuals (e.g., primary care practices, hospitals, communities), rather than individual patients themselves, are randomly allocated to either an experimental or control intervention. Cluster randomization is required when interventions are necessarily delivered at the group or cluster level, such as an initiative that introduces specialist nurses into primary care practices. In cases where an intervention is delivered on an individual level, cluster randomization may be preferable due to logistical, practical, or scientific reasons [2].
It is well known that cluster randomized trials are statistically less efficient than trials using individual randomization, as individual responses within a cluster are usually positively correlated [2,3]. The degree of correlation is usually measured by the intracluster correlation coefficient (ICC). To account for intracluster correlation, sample sizes required for cluster randomized trials must be increased to reach the desired power [2]. Sample size calculation formulas for cluster randomized trials are widely available [2,4-6].
One method to account for clustering in sample size estimates, assuming constant cluster sizes, involves using the variance inflation factor (VIF) [2]. The VIF for a cluster randomized trial with a simple parallel design is a function of the cluster size ($m$) and the intracluster correlation coefficient (ICC), which is denoted by $\rho$: $\mathrm{VIF} = 1 + (m - 1)\rho$. The ICC represents the proportion of variance in a given outcome that can be explained by the variation between clusters, and is given by $\rho = \sigma_b^2 / (\sigma_b^2 + \sigma_w^2)$, where $\sigma_b^2$ is the between-cluster variance and $\sigma_w^2$ is the within-cluster variance. An alternative expression for the VIF when the cluster sizes vary is provided by Donner, Birkett and Buck [7].
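The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are our own, and the example call plugs in the average cluster size (61) and median ICC (0.056) reported for the IDOCC baseline data purely to show the arithmetic, not as a recommended design input.

```python
def icc_from_variances(between_var: float, within_var: float) -> float:
    """ICC as the share of total variance that lies between clusters."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


def variance_inflation_factor(cluster_size: float, rho: float) -> float:
    """VIF = 1 + (m - 1) * rho, assuming a constant cluster size m."""
    return 1.0 + (cluster_size - 1.0) * rho


if __name__ == "__main__":
    # Illustrative inputs only: m = 61 and rho = 0.056 are taken from the text.
    print(round(variance_inflation_factor(61, 0.056), 2))
```

Even a modest ICC inflates the required sample size considerably when clusters are large, because the ICC is multiplied by the cluster size minus one.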
To conduct sample size calculations for a planned cluster randomized trial, advance estimates of the ICC are required. Estimates are often made available from previously published reports of cluster randomized trials evaluating similar outcomes. Despite recommendations to report ICC values in published reports of cluster randomized trials, [8] it is still challenging to find applicable ICC values to aid in the design of future trials [9].
One concern in using external ICC estimates is whether they are appropriate for the planned cluster randomized trial. If an inaccurate estimate for the ICC is used, the resulting sample size estimate may be either too large or too small. Several studies have analyzed determinants of ICCs [9-12]. Choosing appropriate estimates for ICCs is of particular concern in primary care research, where practice characteristics often vary widely across various domains, including rurality, physician remuneration model, and practice structure (i.e., solo or multiple physician practices).
We recently conducted a large primary care quality improvement initiative in 84 primary care practices across Eastern Ontario, Canada [13]. Through this initiative, we have collected data from 5,140 patients who either have, or are at high risk of developing, cardiovascular disease. The main objective of this paper is to use this rich dataset to: a) report ICC values for a range of process of care and clinical outcome measures related to cardiovascular disease management and prevention, and b) investigate differences in ICC values based on: i) number of physicians per practice (i.e., solo versus multiple physician practices) and ii) urban versus rural practices. Understanding the potential impact that these practice characteristics have on ICC values is important, as a failure to take such factors into account can lead to studies that are inadequately powered.

Improved delivery of cardiovascular care (IDOCC) through outreach facilitation trial
The Improved Delivery of Cardiovascular Care (IDOCC) through Outreach Facilitation trial was designed as a stepped wedge cluster randomized controlled trial to support 84 primary care practices in improving their delivery of evidence-based cardiovascular care for patients at high risk [13]. IDOCC used trained facilitators who worked with practices for 12-24 months to incorporate elements of the chronic care model into daily practice routines to improve secondary preventive care for heart disease, stroke, peripheral vascular disease, chronic kidney disease, and diabetes. Primary care practices were located throughout the Champlain Local Health Integration Network (Ottawa and its surrounding communities) of Ontario, Canada, a culturally diverse region with a population of 1.2 million people whose chronic disease burdens and patient health outcomes are comparable to those of Ontario and the rest of Canada. Canada has a publicly funded universal health insurance system, which is often referred to as "Medicare". Detailed information about the recruitment, participants, and data collection can be found elsewhere [13].
In brief, all practices within the Champlain Local Health Integration Network were invited to participate in IDOCC. The Champlain Local Health Integration Network was systematically divided into nine geographic regions, which were grouped together into strata by their location (i.e., west, central, and east). A computer-generated randomization approach was used to assign each region within each stratum to one of the three steps of the stepped wedge design. Practices were enrolled in the trial if at least one physician from the practice agreed to participate. In total, 194 physicians in 93 practices were enlisted to participate, with nine practices dropping out prior to the initiation of the study. Data were abstracted by chart auditors from a random sample of patient medical charts from participating practices, in order to assess each practice's adherence to evidence-based guidelines for CVD care. Eligible patients for the chart audit were those over 40 years of age who met at least one of the following criteria: (1) CVD including coronary artery disease, cerebrovascular disease (documented stroke and/or transient ischemic attack), or peripheral vascular disease; (2) diabetes mellitus; (3) chronic kidney disease; and/or (4) high risk for CVD based on the presence of at least three of the following cardiovascular risk factors: age (males ≥ 45, females ≥ 55), smoker, hypertension, and dyslipidemia.
ICC estimates presented in this paper were calculated from baseline data collected from 5,140 eligible patients (average cluster size of 61 patients/practice; range: 18 to 66). The mean cluster sizes were similar across the nine regions, ranging from 60 to 65. In the IDOCC study, patient identifiers were available to uniquely link patients to specific practices, but not to specific physicians within practices. In group practices, it is not uncommon for a patient to be seen by multiple physicians in cases where their primary family physician is unavailable. Also, many group practices have a nurse on staff who performs certain clinical measures (e.g., blood pressure, waistline, weight, etc.) for all patients treated in a given practice. As such, inferences were to be made with respect to the practice, rather than individual physicians. For the purpose of presenting ICC estimates, we therefore calculated ICCs within practices. A similar approach was used in several other studies that presented ICC estimates in primary care settings [9,14,15].
Chart auditors collected data related to recommendations from the Champlain Primary Care Cardiovascular Disease Prevention and Management Guideline [16]. Data were collected across four domains: 1) cardiovascular disease/risk factor screening, 2) drug prescriptions related to CVD, 3) referral to external programs (e.g., referral to smoking cessation program), and 4) clinical test results (e.g., blood pressure readings, lipid profiles, etc.). Process of care data assessed whether recommended care manoeuvres were performed, discussed, or recommended and were recorded as dichotomous indicators, while clinical outcome data were continuous. The Ottawa Hospital Research Ethics Board approved this study (2007292-01H).

Data analysis
ICC estimates were calculated using an ANOVA approach in which the nine geographic regions within the Champlain Local Health Integration Network were treated as fixed strata, corresponding to the study design [17]. ICC estimates were calculated for all patients and separately for patient subgroups defined by condition (i.e., coronary artery disease, diabetes, chronic kidney disease, hypertension, dyslipidemia, and smoking). We compared ICC estimates between practices in which data were collected from a single physician versus those that had multiple participating physicians and between urban versus rural practices. Negative ICC estimates were attributed to sampling error and set to zero [18]. It should be noted that although there are situations that can give rise to true negative ICCs (e.g., when there is competition between clusters), ICCs in the context of cluster randomized trials are generally expected to be positive. ICC estimates may, however, be negative simply due to chance, particularly when ICC values are close to 0. SAS 9.3 was used for all analyses.

Results
Table 1 provides a breakdown of the practice and patient profiles. The 84 participating practices were diverse, varying in practice team structure, physician remuneration approach, and rurality. Some practices had only one physician participating in IDOCC, and were thus included in the solo physician group for the purposes of this analysis (i.e., 50 solo physician practices, 34 multiple physician practices). Patients from such practices had the consenting physician as their main provider, whereas patients from group practices with multiple consenting physicians had different main providers. It is possible that patients in a group practice may be seen by multiple physicians within the same practice, but this likely represents only a small percentage of visits. ICC estimates obtained from all 84 primary care practices for process of care and clinical outcomes are presented in Tables 2 and 3, respectively.
ICC estimates ranged from 0 to 0.173, with a median of 0.056 (Q1 to Q3, 0.025 to 0.094). The median ICC value for dichotomous process outcomes (0.088) was higher than that for continuous clinical outcomes (0.035). The largest ICC estimates were for process of care measures looking at waistline measurement (0.173), ACR screening for patients with diabetes (0.167), and two blood pressure readings for patients with chronic kidney disease (0.157).
In general, ICC values were fairly similar across different patient conditions. This may not be surprising, as there is substantial overlap between some patient condition subgroups (Table 1). For process of care measures (Table 2), medication prescribing across cardiovascular-related conditions (i.e., CAD, diabetes, hypertension, and dyslipidemia) tended to have low ICCs (0.01 to 0.04), while ICCs for measuring blood pressure at least twice a year were above 0.1 across all applicable conditions. For clinical outcomes (Table 3), all ICC estimates were below 0.1, with the exception of diastolic blood pressure for patients with coronary artery disease (0.12) and chronic kidney disease (0.12).
Tables 4 and 5 compare ICC values between practices in which data were collected from a single physician (50 practices) versus multiple physicians (34 practices) for process of care and clinical outcomes, respectively. In general, ICC estimates for process of care measures arising from single physician practices were higher than those arising from multiple physician practices. ICC estimates from single physician practices were on average 3.9 times higher (median: 2.7 times higher) than those from multiple physician practices, with a maximum difference of 11 times for two blood pressure measures per year for patients with diabetes. A similar trend was seen for clinical outcomes, as ICC estimates for single physician practices were on average 1.9 times higher (median: 1.8 times higher) than those for multiple physician practices. The largest differences in ICC estimates among the clinical outcomes were for systolic and diastolic blood pressure measures.
Table 6 compares ICC values between urban (N = 70) and rural (N = 14) practices for process of care outcomes. In general, urban practices had higher ICC estimates for process of care indicators, particularly for lipid profile and eGFR measures. In terms of clinical outcomes, ICC estimates were similar between urban and rural practices (results not shown).
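For readers who wish to compute comparable estimates from their own data, the ANOVA approach described under Data analysis can be sketched as follows. This is a simplified illustration rather than the analysis code used for this paper (which was run in SAS): it implements the ordinary one-way ANOVA estimator and, as an explicit simplification, omits the stratification by geographic region. The function name and the input layout (one list of outcome values per practice) are our own assumptions. Negative estimates are set to zero, as described in the text.

```python
from typing import Sequence


def anova_icc(clusters: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA estimate of the ICC, truncated at zero.

    `clusters` holds one sequence of outcome values per cluster
    (0/1 values for dichotomous process indicators).
    Simplification: no stratification by region is applied here.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or min(sizes) < 1 or n_total <= k:
        raise ValueError("need >= 2 non-empty clusters and more observations than clusters")

    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]

    # Between- and within-cluster sums of squares.
    ssb = sum(m * (mean - grand_mean) ** 2 for m, mean in zip(sizes, means))
    ssw = sum((y - mean) ** 2 for c, mean in zip(clusters, means) for y in c)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)

    # Adjusted mean cluster size, which accounts for unequal cluster sizes.
    m0 = (n_total - sum(m * m for m in sizes) / n_total) / (k - 1)

    denom = msb + (m0 - 1) * msw
    if denom <= 0:
        return 0.0
    # Negative estimates are attributed to sampling error and set to zero.
    return max(0.0, (msb - msw) / denom)
```

With two practices in which every patient has the same indicator value within a practice but the practices differ, the estimate is 1; when every practice has an identical mix of values, the raw estimate is negative and is reported as 0.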

Sample calculation
We now present a hypothetical example to illustrate the potential impact of using inaccurate ICC estimates on the design of a cluster randomized trial. Consider a planned cluster randomized trial aimed at improving the delivery of evidence-based care for high risk patients with hypertension. The primary outcome for the trial is patient systolic blood pressure. The target population consists of solo physician primary care practices. The study is being planned to detect a 5 unit mean difference in systolic blood pressure between intervention and control practices using a two-sided test at the 5% level of significance with 90% power, assuming an average cluster size of 20 patients per practice. If the estimated ICC and SD for multiple physician practices were used from

Discussion
To the best of our knowledge, this is the first article to present ICC values for a range of cardiovascular-related outcomes collected from primary care practices in Canada. The ICC values derived from the IDOCC study are in the range of previously reported estimates that have been obtained from primary care settings [1,14,19-21]. In general, ICC values for process of care variables were higher than those for clinical outcomes, which is consistent with findings from other studies [10,22]. Intuitively, this is not surprising as clinical outcomes within a given cluster have a greater potential for variability, as each patient will have different levels of compliance and responses to a given treatment.
ICC estimates related to medication prescriptions/recommendations tended to have smaller values than ICCs for other process of care indicators, reflecting little variability amongst practices in prescribing medications for high risk patients. As has been shown in other primary care studies, there is clear evidence and physician agreement on the importance of prescribing medications to high risk patients with cardiovascular disease, diabetes, hypertension and/or dyslipidemia [23-26]. On the other hand, the largest ICC values amongst the process of care indicators were for waistline measurement, two blood pressure screenings per year, and ACR measurement, reflecting relatively high between-practice variability for these measures. Unlike medication prescribing for high risk patients, there is likely less agreement amongst physicians regarding the need for and/or appropriateness of performing these manoeuvres. For example, although the importance of waistline measurement in predicting all-cause and cardiovascular-related mortality has been widely publicized [27], previous studies have demonstrated that some physicians do not perform this screening test due to a lack of time, extra workload and financial implications, while others feel uncomfortable measuring waists or are concerned that patients might be embarrassed [28].
Overall, ICC levels for clinical outcomes were relatively low (<0.1) with the exception of values for diastolic blood pressure. This finding is in line with findings presented by Parker et al. [19], who found diastolic blood pressure measurements to have the largest ICC value from a group of clinical outcome markers collected from primary care practices in Rhode Island and Southeastern Massachusetts. The high ICC value likely reflects the hypertension management style of individual physicians [19].
Table footnote: * whether the specified quality of care indicator was discussed, recommended, or performed during a one-year timeframe. eGFR: estimated glomerular filtration rate; ACR: albumin-to-creatinine ratio; HbA1c: hemoglobin A1c.
The differences in ICC values between single versus multiple physician practices and those between urban versus rural practices highlight the importance of understanding the context in which ICC estimates are derived before using them in sample size calculations. It can be challenging to find an ICC estimate that was derived from a population and setting that matches the planned study. Researchers may have no choice but to use whatever ICC estimates are available for the outcome of interest, with the assumption that differences in context may only moderately impact sample size estimates. Using an ICC estimate that is too small can result in a substantially underestimated sample size and prevent drawing any definitive conclusions about the results of an intervention. Therefore, it is important to closely examine published ICC estimates to determine whether they are relevant to the planned trial. At the same time, when reporting an ICC estimate, researchers should clearly describe characteristics of the practices included in the trial. Our estimates pertain to the primary care setting; Campbell et al. [10] found that ICCs were significantly higher for secondary care outcomes compared with primary care outcomes.
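To make the sensitivity of the design to the assumed ICC concrete, the sketch below computes the number of practices per arm for a trial comparing mean systolic blood pressure, using the design parameters of the hypothetical example in the Sample calculation section (5 unit difference, two-sided 5% significance level, 90% power, 20 patients per practice). The standard deviation (15) and the two ICC values (0.02 and 0.08) are hypothetical placeholders chosen by us for illustration and are not estimates from this study. The calculation uses the usual normal-approximation formula for two means multiplied by the VIF; the function name is our own, and small-sample corrections are not applied.

```python
import math
from statistics import NormalDist


def clusters_per_arm(delta: float, sd: float, icc: float, cluster_size: int,
                     alpha: float = 0.05, power: float = 0.90) -> int:
    """Clusters per arm for a two-arm comparison of means (normal approximation)."""
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    # Patients per arm under individual randomization.
    n_individual = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    # Inflate for clustering, assuming constant cluster size.
    vif = 1 + (cluster_size - 1) * icc
    return math.ceil(n_individual * vif / cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: SD = 15 and both ICC values are placeholders.
    for rho in (0.02, 0.08):
        print(rho, clusters_per_arm(delta=5, sd=15, icc=rho, cluster_size=20))
```

Under these placeholder inputs, planning with the smaller ICC when the larger one actually applies would leave the trial with far fewer practices than required, which is the scenario the paragraph above cautions against.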
There were several limitations in this study. Since practices in this analysis consented to take part in the IDOCC study, there is a potential selection bias. Practices that opted to participate in IDOCC are likely more highly motivated and higher performing than provincial averages. As such, it is possible that participating practices have inherent similarities that could decrease between-practice variance and result in ICC values that are too small. However, because participation in most primary care studies is voluntary, the estimates presented in this paper are likely representative of the practices that would enrol in similar studies.
In addition, the data for this study were collected from practices across Eastern Ontario only. As such, the results may not be generalizable to primary care studies conducted in jurisdictions whose healthcare systems differ substantially from Ontario's. However, ICC estimates found in this study were in line with values presented in primary care studies conducted in different countries [14,19].
We have used a simple one-way ANOVA to calculate ICC estimates. This method is commonly used to calculate ICCs for any type of outcome but the resulting value must be interpreted with care if the assumptions of analysis of variance are violated. Eldridge et al. [29] review several alternative definitions of ICCs in cluster randomized trials. Finally, large studies are required to estimate ICCs with a reasonable degree of accuracy. Although our main estimates are based on a relatively large sample size, we have not provided confidence intervals around our estimates. Readers using these ICCs for sample size calculation may therefore need to consider the degree of uncertainty associated with these estimates; in particular, the number of rural practices (N = 14) examined in this study was relatively small. Ukoumunne [30] and Zou and Donner [31] review confidence interval methods for ICCs in CRTs.

Conclusions
To the best of our knowledge, this article presents the most comprehensive summary of ICC values for cardiovascular-related outcomes collected from Canadian primary care practices. The ICC estimates presented in this study cover a wide range of conditions and risk factors and can be used to aid in the design of future cluster randomized trials in primary care settings. Furthermore, we observed substantial differences between ICC estimates obtained from single physician practices and those obtained from multiple physician practices; this demonstrates the importance of understanding the context in which ICC values are determined before using them in sample size calculations. Failure to take these differences into account can have substantial implications for the design of a cluster randomized trial.