The need
The real examples in Tables 1 and 2 and Tables S1–S3 [Additional file 2] show the dangers of neglecting interactions. In all these examples, the effects of one or both variants were completely masked by the interacting factor. For instance, in the meta-analysis of four BACE1 studies (Table 2 and Figure 3), the effect of the BACE1 exon 5 GG genotype was hidden in the absence of APOE4 (pooled OR = 0.8; 95% CI: 0.6–1.1; p = 0.17; random effects model [17]), but revealed in its presence (pooled OR = 1.9; 95% CI: 1.3–2.9; p = 0.0015). Tables S1–S3 [Additional file 2] give further examples of such masking.
There is a common view that interactions, e.g. between genes (epistasis), should only be examined between risk factors that have already shown a significant main effect. But in many cases, such as most of the above, the association would be missed by the traditional single-factor approach [1–3]. Indeed, this was so in most of the examples of significant epistasis uncovered in our recent survey of sporadic AD [18]. In the 36 such examples, 34 of which had SFs ≥ 2, the main effects of the gene variants other than APOE4 were generally very weak: the ORs were ≤ 1.2 in 20 of the 36 cases and were significant in only 5. Thus, preliminary screening for main effects will miss many, possibly most, cases of epistasis.
On the other hand, synergy can be too easily claimed. A common misconception is that a high combined OR necessarily implies synergy. A single OR by itself says nothing about synergy; it is the relation between the three relevant ORs (for each factor alone and for both combined) that matters. For instance, let us assume that two risk factors are associated with ORs of 3 and 5 alone and of 15 when combined. Although the combined value is impressive, there is no synergy: SF = 15/(3 × 5) = 1. Claims of synergy are frequently published on the basis of such invalid evidence. Indeed, we have noted at least 20 claims of interactions, in the field of AD genetics alone, that were published in leading journals in recent years, but which can be clearly refuted by SF analysis. There is thus a need for a readily accessible method of testing such claims.
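The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration of the ratio SF = OR(combined)/(OR(first) × OR(second)) as used in the text; the function name `synergy_factor` is hypothetical and is not part of the Excel programmes in the Additional files.

```python
def synergy_factor(or_first, or_second, or_combined):
    """Return SF = OR_combined / (OR_first * OR_second).

    Hypothetical helper for illustration only. All three odds ratios
    are taken relative to the same reference group (neither factor).
    """
    if min(or_first, or_second, or_combined) <= 0:
        raise ValueError("odds ratios must be positive")
    return or_combined / (or_first * or_second)


# The example from the text: ORs of 3 and 5 alone, 15 when combined.
print(synergy_factor(3, 5, 15))  # 1.0, i.e. no synergy
```

An SF of 1 means that the combined effect is exactly what the two separate effects predict on the multiplicative scale; values above 1 would suggest synergy and values below 1 antagonism.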
Limitations of the SF method
We suggest that SF analysis, being based on logistic regression analysis, is best used for assessing binary interactions [2]. Various methods have been devised to examine higher order interactions [19, 20]. However, some have only limited value for purposes of interpretation. Moreover, nearly all case-control sample-sets currently used for association studies lack the power for the proper study of higher order interactions [18]. Where a third interacting factor is suspected and a sufficiently large dataset is available, SF analysis may be performed twice, after stratification by the third factor, e.g. gender.
Where the relevant data are available, logistic regression analysis is the appropriate method for adjusting for covariates, while SF analysis should be the preferred method for stratification by covariates. Stratification can produce very small subsets, even with cells of zero, which logistic regression analysis cannot handle. In contrast, SF analysis produces a realistic p value in each subgroup, if one adds 0.5 to each cell in any 4 × 2 table with at least one zero cell [21, 22].
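The zero-cell correction described above can be sketched as follows. This is a minimal illustration, not the authors' Excel implementation: the function name `sf_from_counts` and the layout of the inputs are hypothetical, and the significance test assumes the usual Wald approach, in which the standard error of ln(SF) is the square root of the sum of the reciprocals of the eight cell counts and the p value is two-sided from the standard normal distribution.

```python
import math


def sf_from_counts(cases, controls):
    """Synergy factor, Z statistic and two-sided p value from a 4 x 2 table.

    Hypothetical helper for illustration. `cases` and `controls` each hold
    four counts in the order: neither factor, first factor only,
    second factor only, both factors.
    """
    if len(cases) != 4 or len(controls) != 4:
        raise ValueError("expected four counts for cases and four for controls")
    cells = [float(c) for c in list(cases) + list(controls)]
    # Add 0.5 to every cell if at least one cell is zero, as in the text.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a0, a1, a2, a12, b0, b1, b2, b12 = cells
    # Odds ratios relative to subjects carrying neither factor.
    or_first = (a1 * b0) / (a0 * b1)
    or_second = (a2 * b0) / (a0 * b2)
    or_combined = (a12 * b0) / (a0 * b12)
    sf = or_combined / (or_first * or_second)
    # Assumed Wald standard error of ln(SF): sum of reciprocal cell counts.
    se = math.sqrt(sum(1.0 / c for c in cells))
    z = math.log(sf) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return sf, z, p


# Made-up counts with one zero cell, purely to exercise the correction.
sf, z, p = sf_from_counts(cases=(0, 5, 5, 20), controls=(10, 10, 10, 10))
print(f"SF = {sf:.2f}, Z = {z:.2f}, p = {p:.3f}")
```

With a stratified dataset, the same function would simply be called once per stratum, e.g. once for men and once for women.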
Advantages of the SF method
SF analysis is simple to perform, through the Excel programmes in Additional files 3 and 4. The analysis takes only a few minutes, e.g. to check a claim of synergy in a published paper. The value of the method may be seen in the study of Combarros et al. [18], in which SF analysis was used to examine each of the 89 studies of interactions cited in that review. The method measures both the size and significance of a binary interaction, using either primary or summarised data. Unlike logistic regression analysis, it can be applied to datasets of any size, however small, even with zero cells (see above). The method can be used with all types of susceptibility factors, both risk and protective, for instance, age, gender, diet, medication or genetic polymorphisms, provided the data are dichotomised, e.g. age above or below 75 years. It can be applied both to synergistic and to antagonistic interactions. Novel features include power estimation (through an R function available from MCB) and meta-analysis, an increasingly important application (through the Excel programme in Additional file 4). Neither function has been readily available before.