A note on the use of the generalized odds ratio in meta-analysis of association studies involving bi- and tri-allelic polymorphisms

Background: The generalized odds ratio (GOR) was recently suggested as a genetic model-free measure for association studies. However, its properties have not been extensively investigated. We used Monte Carlo simulations to investigate type-I error rates, power and bias in both effect size and between-study variance estimates of meta-analyses using the GOR as a summary effect, and compared these results to those obtained by the usual approaches of model specification. We further applied the GOR in a real meta-analysis of three genome-wide association studies in Alzheimer's disease.
Findings: For bi-allelic polymorphisms, the GOR performs virtually identically to a standard multiplicative model of analysis (e.g. per-allele odds ratio) for variants acting multiplicatively, but slightly increases the power to detect variants with a dominant mode of action, while reducing the probability of detecting recessive variants. Although there were differences between the GOR and the usual approaches in terms of bias and type-I error rates, both simulation- and real data-based results provided little indication that these differences will be substantial in practice for meta-analyses involving bi-allelic polymorphisms. However, the use of the GOR may be slightly more powerful for the synthesis of data from tri-allelic variants, particularly when susceptibility alleles are less common in the populations (≤10%). This gain in power may depend on knowledge of the direction of the effects.
Conclusions: For the synthesis of data from bi-allelic variants, the GOR may be regarded as a multiplicative-like model of analysis. The use of the GOR may be slightly more powerful in the tri-allelic case, particularly when susceptibility alleles are less common in the populations.


Findings
The generalized odds ratio (GOR) was recently suggested as a model-free measure of effect that might overcome the problem of genetic model misspecification in meta-analyses of association studies [1]. In the context of case-control genetic association studies for a binary trait, and under the assumption of random sampling, the GOR measures the probability that a case has a higher mutation load (i.e. a larger number of high-risk alleles) than a control divided by the probability that a control has a higher mutation load than a case.
In this note, we highlight advantages and limitations of the use of the GOR as a measure of effect in meta-analyses of bi- and tri-allelic polymorphisms through simulation. Our results are further complemented by a re-analysis of a real meta-analysis of three genome-wide association studies covering >311,000 bi-allelic markers in Alzheimer's disease.

Performance of the GOR in the bi-allelic model
Type-I error rates
Type-I error rates obtained from meta-analyses employing the GOR as a summary effect size are comparable to those of the multiplicative and dominant models of analysis (Table 1).

Power
Compared to the use of multiplicative approaches, the power to detect variants with a dominant mode of action was typically only slightly higher for meta-analyses using the GOR as the summary estimate. For variants following a multiplicative pattern of action, all non-recessive models of analysis were highly comparable. Interestingly, the largest differences observed among the per-allele, log-additive trend (LAT) and GOR approaches were found in true recessive and over-dominant models, where the performance of the GOR is slightly inferior for the former, but reasonably better for the latter (Figure 1).

Bias in the estimated statistical heterogeneity (τ²)
Compared to both per-allele and LAT approaches, the median bias in τ² obtained by the GOR is typically lower in scenarios where the genetic variant is less common in the populations (e.g. minor allele frequency [MAF] = 10%) and acts either dominantly or multiplicatively. For the latter mode of action, bias is slightly positive. In addition, for common markers (MAF = 40%) following a dominant mode of action, the GOR provides less biased τ² estimates compared to the specification of a multiplicative model. Importantly, for a common variant (MAF = 40%) acting multiplicatively, meta-analyses using the GOR as an effect size provide upwardly biased estimates of τ² compared to the true underlying average increment in the between-study variance per additional copy of the risk allele (Figure 2). This upward bias in the estimated statistical heterogeneity is also found in both dominant and recessive models of analysis.

Bias in the estimated genetic effect size
The GOR provides nearly unbiased summary effects for less common variants (MAF = 10%) acting dominantly, regardless of the meta-analytical model and τ². Conversely, when the variant follows a multiplicative mode of action and is common (MAF = 40%), GOR-based meta-analyses overestimate the true underlying increase in the effect size per additional copy of the risk allele (on average 20%) [Additional file 1: Supplementary tables S1-S2].

Performance of the GOR in the tri-allelic model
Type-I error rates
The performance of each model of analysis depends on the underlying between-study variability, allele frequencies and meta-analytical model, but type-I error rates for LAT- and GOR-based meta-analyses are comparable, whereas false discoveries tend to be higher for the per-allele approach when statistical heterogeneity is present (i.e. τ² > 0). However, the extent of these differences is smaller in random-effects calculations [Additional file 1: Supplementary tables S3-S4].

Power: two alleles acting in the same direction
When at least one of the risk alleles is less common in the populations (f = 10%), and both exhibit either a dominant or multiplicative mode of action, the power obtained by using the GOR as a summary effect is higher than that provided by either the per-allele or LAT approaches (Figure 3).

[Figure caption; opening truncated in the source] The sample size for each study was randomly sampled from a uniform distribution on the interval [500-1000] and split equally into cases and controls (i.e. case to control ratio = 1). Color lines depict power estimates under different models of analysis: green (dominant), blue (per-allele odds ratio), red (recessive) and black (generalized odds ratio). Results for the log-additive trend were omitted because of the striking similarity with the results of the per-allele odds ratio. Results are based on 5,000 replications under a random-effects model (DerSimonian-Laird method). f, allelic frequency. Scenarios with alternative magnitudes of heterogeneity or use of a fixed-effects model yielded qualitatively identical results.

Power: two alleles acting in opposite directions
When prior evidence on the direction of the effects of the susceptibility alleles is available, similar power is achieved with the use of the per-allele, LAT and GOR approaches, regardless of the meta-analytical model, f and statistical heterogeneity [Additional file 1: Supplementary tables S5-S7].
On the other hand, when no prior evidence on the direction of effects is available (e.g. initial screenings), the per-allele model of analysis displays a superior performance compared to the use of either the LAT- or GOR-based approaches. In particular, compared to both GOR and LAT approaches, the gain in power for meta-analyses using the per-allele OR may range from 1.5- to 10-fold depending on the number of combined studies (Figure 4).

Power: when only one allele displays a significant effect
Power is comparable among the GOR, LAT and per-allele odds ratio when only one allele displays a significant effect. This is especially true when the high-risk allele is less common in the populations (f = 10%), particularly when f(A2) = f(A3) = 10%. Overall, for common variants acting multiplicatively, the best performance is achieved with both GOR and LAT. When the risk allele is either recessive or dominant and is common, the best approach may depend on the frequency of the remaining alleles, but power is comparable among the three tested approaches whenever f(A2) ≅ f(A3) [Additional file 1: Supplementary tables S8-S10].

Figure 2 Bias (%) in the between-study variance (τ²) estimate (statistical heterogeneity) for the bi-allelic case for a representative scenario of a variant with modest effect (OR = 1.3) following distinct modes of action (A-B, dominant; C-D, multiplicative; E-F, overdominant) under moderate heterogeneity (τ² = 0.025). The sample size for each study was randomly sampled from a uniform distribution on the interval [500-1000] and split equally into cases and controls (i.e. case to control ratio = 1). Color lines depict bias estimates under different models of analysis: green (dominant), blue (per-allele odds ratio), red (recessive) and black (generalized odds ratio). Results for the log-additive trend model of analysis were omitted because of the striking similarity with the results of the per-allele odds ratio. Scenarios with a genuine variant acting recessively are not displayed for simplicity, since all models of analysis (except the recessive) are unable to capture the underlying τ² even when the frequency of risk alleles is high. Scenarios with alternative magnitudes of heterogeneity yielded qualitatively identical results. Results are based on the median value from 5,000 replications. f, allelic frequency.

Real application
Results for the seven "top hits" variants associated with late-onset Alzheimer's disease are presented in Table 2. As expected, the largest association signal arose from the variant rs41377151, located at the 3' end of the apolipoprotein C-I (APOC1) gene within the apolipoprotein E (APOE)/APOC1 gene cluster on chromosome 19q13.3. This polymorphism is only 10.9 kb away from the rs7412 variant (Arg176Cys) [2], which is one of the alleles that dictate the APOE ε status [3]. In addition, the remaining signals are also commensurate with results from previous [4] and more recent, large investigations [2,5,6].
In agreement with our simulation-based results, plots of summary ORs and P-values (Figure 5) based on real data suggest a good concordance between the GOR and both LAT and per-allele approaches, followed by the dominant and recessive models, respectively.

Figure 3 Power (at α = 5%) for the tri-allelic case for a representative scenario of two alleles (A2 and A3) acting in the same direction with modest effect (OR = 1.3) following distinct modes of action (A-B, dominant; C-D, multiplicative; E-F, recessive) under moderate heterogeneity (τ² = 0.025). The sample size for each study was randomly sampled from a uniform distribution on the interval [500-1000] and split equally into cases and controls (i.e. case to control ratio = 1). Color lines depict power estimates under different models of analysis: orange (log-additive trend), blue (Dunn-Šidák-corrected per-allele odds ratio) and black (generalized odds ratio). Results are based on 5,000 replications under a random-effects model (DerSimonian-Laird method). f, allelic frequency. Scenarios with alternative magnitudes of heterogeneity or use of a fixed-effects model yielded qualitatively identical results.

Discussion
The GOR was suggested as a model-free approach for the synthesis of genetic association studies. The rationale is that the GOR provides more flexibility for the true underlying genetic effect to describe the difference between two cumulative distribution functions of the latent variables, particularly when the assumption of proportional odds is violated. An additional advantage is that this ordinal measure of association is easily interpretable in practice [1].
Recent meta-analyses have applied the GOR, claiming that it might be considered a different genetic model or an independent approach compared to the specification of a traditional genetic model of analysis [7,8]. However, here we show that, since the GOR inherently assumes an ordinal mutation load (e.g. 1, 2 and 3 for genotypes A1A1, A1A2 and A2A2, respectively), this measure of association performs like a multiplicative model of analysis for bi-allelic polymorphisms. For diallelic variants, our simulations show that GOR-based results are highly correlated with those obtained by both LAT and per-allele ORs, resulting in similar type-I error rates and power compared to these traditional multiplicative models of analysis. In addition, a real meta-analysis of three GWAS in Alzheimer's disease indicates that the gain from using the GOR method may be limited. For example, under a fixed-effects framework and assuming a threshold of P < 10⁻⁵ (probably realistic given the small sample sizes available), the total number of markers considered promising for further replication [9] would be 10, 13, 13, 14 and 2 for the per-allele, LAT, GOR, dominant and recessive approaches, respectively. Under a random-effects model, the corresponding numbers would be 2 for the recessive model and 8 for the remaining approaches.
Nonetheless, other important considerations in meta-analyses of genetic association studies involving bi-allelic polymorphisms are biases in the estimated effect size [10] and heterogeneity [11]. In this respect, the most negative aspect of using the GOR as a measure of association in practice is that it provides inflated effects for bi-allelic variants following a multiplicative mode of action. Although this inflation may be only mild for less common markers (i.e. median bias of 5% for variants with MAF = 10%), the average upward bias in the observed genetic effect grows with increasing MAF, reaching up to 20% for MAFs around 40%.

Figure 4 Power (at α = 5%) for the tri-allelic case for a representative scenario of two alleles (A2 and A3) acting in opposite directions (A2 is protective, whereas A3 is the susceptibility allele) with modest effect (OR = 0.77 for allele A2 and OR = 1.3 for allele A3) following distinct modes of action (A-B, dominant; C-D, multiplicative; E-F, recessive) under moderate heterogeneity (τ² = 0.025). The sample size for each study was randomly sampled from a uniform distribution on the interval [500-1000] and split equally into cases and controls (i.e. case to control ratio = 1). Color lines depict power estimates under different models of analysis: orange (log-additive trend), blue (Dunn-Šidák-corrected per-allele odds ratio) and black (generalized odds ratio). Results are based on 5,000 replications under a random-effects model (DerSimonian-Laird method). f, allelic frequency. Scenarios with alternative magnitudes of heterogeneity or use of a fixed-effects model yielded qualitatively identical results.

Table 2 Summary results according to different models of analysis for the seven strongest association signals obtained by a meta-analysis of three independent genome-wide association studies in Alzheimer's disease (TGen data sets, Reiman
On the other hand, our data showed that the use of the GOR may be advantageous in meta-analyses involving tri-allelic polymorphisms as long as genotypes can be correctly ordered in terms of mutation load. In fact, a reasonable gain in power, in the order of 2 to 15%, may be achieved for the detection of association signals from variants with small frequencies (e.g. f ~ 10%) compared to the use of per-allele or LAT odds ratios. The observation that higher power might be obtained with the GOR in scenarios with a larger number of alleles of low frequency may serve as hypothesis-generating information to extend the use of the GOR to meta-analyses of different types of genetic variants. For example, a special case might be the use of the GOR in meta-analyses of structural variants such as copy-number variations (CNVs), which tend to exhibit a substantial number of alleles, yielding a correspondingly large number of possible genotype categories [12]. Since the GOR handles categories with zero counts [13], and a different number of genotypes may be considered per study (for instance, in the case of specific allele sizes in some populations), the properties of the GOR in meta-analyses of CNVs are a topic worthy of further investigation.
In summary, although there are differences in the statistical properties among the investigated approaches for bi-allelic variants, the absolute magnitude of these differences may actually be small and is likely to be of very limited practical significance. An exception might be the use of the GOR in meta-analyses involving tri-allelic polymorphisms with less common alleles, since the GOR uses the complete genotypic distribution (e.g. the GOR is less affected by zero cells). For these scenarios, the use of the GOR as a measure of effect may be slightly more powerful than traditional measures. However, the performance of GOR-based meta-analyses will depend on some knowledge about the direction of the effects when there are two alleles modulating the risk of disease in opposite directions.

Simulation procedures and scenarios
We simulated meta-analyses of association studies using approaches that rely on multinomial distributions described in detail elsewhere (autosomal markers) [9,10]. Hardy-Weinberg equilibrium is assumed to hold for the whole population, whereas the susceptibility alleles are considered the causal variants or surrogate markers in tight linkage disequilibrium (r² = 1.0). For the bi-allelic case, we simulated data assuming the susceptibility variant A2 (minor allele) and non-risk allele A1.
Under a three-allele model, we denote A1, A2 and A3 as the possible alleles with frequencies f(A1), f(A2) and f(A3), respectively, yielding six possible distinct genotypes. For each possible combination of the parameters presented in Table 3, we considered meta-analyses that included from two to 30 studies (case-to-control ratio of 1:1).
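The genotype categories implied by the Hardy-Weinberg assumption above can be enumerated with a few lines of code. The following is a minimal Python sketch, not the authors' Stata code: the function name `hwe_genotype_frequencies` and the example allele frequencies are hypothetical, and only the textbook Hardy-Weinberg proportions (f² for homozygotes, 2·f·g for heterozygotes) are used.

```python
from itertools import combinations_with_replacement


def hwe_genotype_frequencies(allele_freqs):
    """Return {genotype: frequency} under Hardy-Weinberg equilibrium.

    allele_freqs maps allele labels to population frequencies that sum to 1.
    Homozygotes get f_a ** 2 and heterozygotes get 2 * f_a * f_b.
    """
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    freqs = {}
    # Unordered allele pairs: m alleles give m * (m + 1) / 2 genotypes.
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p = allele_freqs[a] * allele_freqs[b]
        freqs[a + b] = p if a == b else 2 * p
    return freqs


# Hypothetical tri-allelic example: two less common alleles at 10% each.
tri = hwe_genotype_frequencies({"A1": 0.8, "A2": 0.1, "A3": 0.1})
for genotype, freq in tri.items():
    print(genotype, round(freq, 4))
```

With three alleles the function returns six genotypes whose frequencies sum to one; such frequencies could then serve as the cell probabilities of a multinomial draw for the controls, under whatever penetrance model is assumed for the cases.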
For the tri-allelic case, three possible scenarios were considered: (i) two of the alleles were susceptibility variants (e.g. both increase the susceptibility for the trait with the same magnitude), (ii) two alleles were associated with the trait, but in opposite directions (i.e. one increases, while the other decreases, the risk for the trait by a similar magnitude) and (iii) only one out of the three alleles displays significant effects on the trait. We further assumed that the mechanism of action is similar for both alleles when there are two alleles with genuine effects on the trait (e.g. both act multiplicatively, or both act dominantly, and so forth). For scenarios with two alleles modulating the risk of disease, two additional situations of practical interest were investigated: (ii-a) the two alleles are associated with the susceptibility of disease in opposite directions and investigators have no prior evidence on the direction of these effects (e.g. initial agnostic screenings) and (ii-b) the two alleles are associated with the susceptibility of disease in opposite directions, but investigators possess prior evidence on the direction of the effects (e.g. meta-analyses from the literature). For consistency, allele A2 is designated as the protective variant, whereas allele A3 is the susceptibility one in these scenarios.

Bi-allelic polymorphisms
Assessment of bias
The percentage bias was computed as $\frac{\theta - \mu}{\mu} \times 100$ and $\frac{\hat{\tau}^2 - \tau^2}{\tau^2} \times 100$ for genetic effect sizes and between-study variance, respectively, where $\theta$ is the (average) observed summary effect, $\mu$ is the true average genetic effect across population-specific genetic effects, $\tau^2$ is the true between-study variance and $\hat{\tau}^2$ is the method-of-moments-based estimate of $\tau^2$. Both $\theta$ and $\mu$ are captured as the natural logarithm of the odds ratio (Table 3). Use of alternative bias estimators (e.g. mean squared error) yielded qualitatively analogous results (data not shown).
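As a small worked illustration of the percentage bias defined above, the following Python sketch applies the same formula to made-up numbers. The function name `percent_bias` and all example values are hypothetical, and taking the median across replications is an assumption based on the figure captions, which report median values.

```python
from statistics import median


def percent_bias(estimate, truth):
    """Percentage bias: (estimate - truth) / truth * 100."""
    if truth == 0:
        raise ValueError("percentage bias is undefined when the true value is 0")
    return (estimate - truth) / truth * 100.0


# Hypothetical tau^2 estimates from five simulated meta-analyses
# in which the true between-study variance is 0.025.
tau2_hat = [0.020, 0.031, 0.025, 0.040, 0.018]
median_bias_tau2 = percent_bias(median(tau2_hat), 0.025)

# A hypothetical single estimate of 0.030 overshoots the truth by about 20%.
single_bias_tau2 = percent_bias(0.030, 0.025)
print(median_bias_tau2, round(single_bias_tau2, 2))
```

The same function applies to effect sizes once both the observed summary effect and the true effect are expressed as log odds ratios.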

Tri-allelic polymorphisms
Meta-analyses involving three-allele polymorphisms may rely on a diversity of approaches to summarize effects across studies. However, because the assumption of multiplicative effects yields, on average, the lowest rates of false-positive results in bi-allelic markers [9,10]

The generalized odds ratio
For a binary trait (e.g. case-control studies), the GOR measures the probability that a randomly sampled case has a genotype with a higher mutation load (i.e. a larger number of high-risk alleles) than a randomly sampled control divided by the probability that a randomly sampled case has a genotype with a lower mutation load than a randomly sampled control [1]. The GOR for a binary trait and an m-allelic variant can be computed as [13]:

$$\mathrm{GOR} = \frac{\sum_{j=2}^{J} \sum_{k=1}^{j-1} p_{j|1}\, p_{k|0}}{\sum_{j=2}^{J} \sum_{k=1}^{j-1} p_{k|1}\, p_{j|0}}$$

where J is the total number of genotypes (categories) given the number of alleles, i.e. J = m(m+1)/2, m is the number of alleles, and $p_{j|i} = n_{j|i} / \sum_{j=1}^{J} n_{j|i}$ is the proportion of subjects with genotype j (for j = 1, ..., J, in which the higher the value of j, the higher the mutation load) in group i (i = 0 or 1 for controls and cases, respectively). In the present investigation, the large-sample variance for the GOR was computed from the asymptotic standard error of the Goodman-Kruskal γ [1]. Stata and R codes to compute the GOR and its large-sample variance are available from the first author upon request.
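The verbal definition above translates directly into a double sum over ordered genotype categories. The following Python sketch is an illustration only and is not the authors' Stata or R code: the function name `generalized_odds_ratio` and the genotype counts are hypothetical, and the large-sample variance is deliberately left out.

```python
def generalized_odds_ratio(case_counts, control_counts):
    """Point estimate of the GOR from genotype counts.

    Both lists must be ordered by increasing mutation load and have the
    same length J (one entry per genotype category).
    """
    if len(case_counts) != len(control_counts):
        raise ValueError("cases and controls need the same genotype categories")
    n_cases, n_controls = sum(case_counts), sum(control_counts)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both groups must contain at least one subject")
    p_case = [n / n_cases for n in case_counts]
    p_ctrl = [n / n_controls for n in control_counts]

    higher = 0.0  # P(case has a higher mutation load than control)
    lower = 0.0   # P(case has a lower mutation load than control)
    for j in range(len(p_case)):
        for k in range(j):  # k indexes categories with a lower load than j
            higher += p_case[j] * p_ctrl[k]
            lower += p_case[k] * p_ctrl[j]
    if lower == 0.0:
        raise ValueError("GOR is undefined: no case/control pair with case lower")
    return higher / lower


# Hypothetical bi-allelic counts for genotypes A1A1, A1A2, A2A2.
gor = generalized_odds_ratio([30, 50, 20], [50, 40, 10])
print(round(gor, 3))
```

For these made-up counts the probability that a case outranks a control is 0.43 and the reverse probability is 0.20, so the GOR is about 2.15. Empty genotype cells in one group simply contribute zero to the sums, so no continuity correction is needed for the point estimate.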

Mutational load order
The order of the jth genotypic category (i.e. mutational load) for the GOR and log-additive trend is anticipated to impact statistical power. Hence, for the situation ii-a ii-b (meta-analyses from the literature with prior information on the direction of effects).

Assessment of power and type-I error
Empirical power and type-I error rates (i.e. false-positive discoveries) were computed as the proportion of simulations that gave a two-sided P-value < 0.05. Because there are three correlated OR estimates for the tri-allelic case for the per-allele model, we corrected the α level using the Dunn-Šidák procedure. Specifically, power and type-I error rates for the per-allele model (tri-allelic case) were computed as the proportion of the simulations that gave one or more P-values < $\alpha_{\mathrm{corrected}}$, where $\alpha_{\mathrm{corrected}} = 1 - \sqrt[3]{1 - \alpha}$.
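The correction and the rejection rule described above can be sketched in Python as follows. The function names `sidak_alpha` and `rejection_rate` and the P-values are hypothetical; the sketch only mirrors the rule that a replicate counts as a discovery when at least one of its P-values falls below the (optionally corrected) threshold.

```python
def sidak_alpha(alpha, n_tests):
    """Dunn-Sidak per-test significance level for n_tests comparisons."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def rejection_rate(p_values_per_replicate, alpha=0.05, correct=False):
    """Proportion of replicates with at least one P-value below the threshold.

    Each replicate is a list of P-values (one for a single summary effect,
    three for the per-allele model in the tri-allelic case).
    """
    hits = 0
    for p_values in p_values_per_replicate:
        threshold = sidak_alpha(alpha, len(p_values)) if correct else alpha
        if any(p < threshold for p in p_values):
            hits += 1
    return hits / len(p_values_per_replicate)


# Hypothetical P-values from four simulated tri-allelic meta-analyses.
replicates = [
    [0.010, 0.50, 0.90],
    [0.030, 0.20, 0.40],
    [0.600, 0.70, 0.80],
    [0.001, 0.02, 0.30],
]
print(round(sidak_alpha(0.05, 3), 6))
print(rejection_rate(replicates, alpha=0.05, correct=True))
```

For α = 0.05 and three tests the corrected level is roughly 0.017. The same `rejection_rate` function gives the type-I error rate when the replicates are simulated under the null hypothesis and the power when they are simulated under a genuine effect.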

Real application
We compared results based on the GOR as a summary effect to those obtained by the usual approaches of model specification in a real meta-analysis of three independent genome-wide studies in late-onset Alzheimer's disease. After standard control measures, a total of 311,915 bi-allelic polymorphisms were scored in 1411 participants (961 cases and 560 controls). A detailed description of the samples, genotyping platforms and diagnostic criteria is available elsewhere [4]. Results from individual studies were corrected for residual inflation of the test statistic using genomic control methods [14].

Meta-analysis methods
Meta-analyses were carried out under both fixed- and random-effects models, represented by the general

[Table footnote] OR, true underlying odds ratio. f, allelic frequency for the risk allele. τ², between-study variance. N, number of participants per study. GOR, generalized odds ratio. LAT, log-additive trend. The size for each study was randomly sampled from a uniform distribution on the interval [500-1000] and split equally into cases and controls (i.e. case to control ratio = 1).

All simulations were performed with the Stata 11.1 package (Stata Corporation), whereas the meta-analysis of real data sets was carried out in PLINK [18].
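For readers unfamiliar with the two meta-analytical models named in this section, the following Python sketch shows a fixed-effects pooled estimate and the DerSimonian-Laird method-of-moments estimate of τ² mentioned in the figure captions. It is an illustration under stated assumptions rather than the Stata or PLINK routines used by the authors: inverse-variance weighting is assumed, the function names are hypothetical, and the study-level log odds ratios and variances are made up.

```python
import math


def inverse_variance_pool(effects, variances, tau2=0.0):
    """Pooled effect and its standard error with weights 1 / (v_i + tau2).

    tau2 = 0 gives the fixed-effects estimate; a positive tau2 gives the
    random-effects estimate.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments (DerSimonian-Laird) estimate of tau^2, floored at 0."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w ** 2 for w in weights) / total
    if c <= 0.0:  # a single study carries no heterogeneity information
        return 0.0
    return max(0.0, (q - (len(effects) - 1)) / c)


# Hypothetical study-level log odds ratios and their variances.
log_ors = [0.10, 0.35, 0.22, 0.41]
variances = [0.02, 0.03, 0.015, 0.04]

tau2_hat = dersimonian_laird_tau2(log_ors, variances)
fixed_est, fixed_se = inverse_variance_pool(log_ors, variances)
random_est, random_se = inverse_variance_pool(log_ors, variances, tau2_hat)
print(round(fixed_est, 3), round(random_est, 3), round(tau2_hat, 4))
```

Any summary effect on the log scale (per-allele, LAT, dominant, recessive or GOR) could be passed to these functions, which is what allows the different models of analysis to be compared under a common pooling scheme.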

Additional material
Additional file 1: Supplementary tables S1 through S10.