Distribution of allelic and genotypic frequencies of IL1A, IL4, NFKB1 and PAR1 variants in Native American, African, European and Brazilian populations

Background: The inflammatory response plays a key role at different stages of cancer development. Allelic variants of the interleukin 1A (IL1A), interleukin 4 (IL4), nuclear factor kappa B1 (NFKB1) and protease-activated receptor 1 (PAR1) genes may influence not only the inflammatory response but also susceptibility to cancer development. Among major ethnic or continental groups, these polymorphic variants present different allelic frequencies. In admixed populations, such as the Brazilian population, data on the distribution of these polymorphisms are limited. Here, we collected samples of cancer-free individuals from the north, northeast, midwest, south and southeast regions of Brazil and from the three main groups that gave rise to the Brazilian population: Native Americans from the Brazilian Amazon, Africans and Europeans. We describe the allelic distributions of four polymorphisms in the IL1A (rs3783553), IL4 (rs79071878), NFKB1 (rs28362491) and PAR1 (rs11267092) genes, which the literature associates with a risk of cancer or a worse prognosis for cancer.
Results: The genotypic distribution of the four polymorphisms was statistically distinct between Native Americans, Africans and Europeans. For the allelic frequency of these polymorphisms, the Native American population was the most distinct among the three parental populations, and it included the greatest number of alleles associated with a risk of cancer or a worse prognosis for cancer. The PAR1 gene polymorphism allelic distribution was similar among all Brazilian regions. For the other three markers, the northern region population was statistically distinct from the other Brazilian regional populations.
Conclusion: The IL1A, IL4, NFKB1 and PAR1 gene polymorphism allelic distributions are homogeneous among the regional Brazilian populations, except for the northern region, which significantly differs from the other four Brazilian regions. Among the parental populations, the Native American population exhibited a higher incidence of alleles associated with a risk of cancer or a worse prognosis for cancer, which may indicate greater susceptibility to this disease. These genetic data may be useful for future studies on the association between these polymorphisms and cancer in the investigated populations.


Background
External factors that modify tissue homeostasis, such as microorganism infection, tissue injury and exposure to contaminants may induce inflammatory processes [1]. Chronic inflammation is associated with the appearance of malignant cells and these cells' proliferation, invasion and metastasis processes [2].
The inflammatory response may be modulated through variations in gene expression caused by genetic polymorphisms in structural or regulatory regions. Therefore, an individual's genetic composition may influence not only the inflammatory response but also susceptibility to cancer development [3,4].
The gene for the pro-inflammatory cytokine IL1A presents an insertion/deletion (INDEL) polymorphism (3′UTR indel TTCA, rs3783553) in a binding site for the microRNAs miR-122 and miR-378 that modifies the binding of these microRNAs. The insertion (Ins) allele is related to higher gene expression [13]. Different investigations have demonstrated that individuals homozygous for the insertion (Ins/Ins) are less susceptible to developing gastric cancer, nasopharyngeal cancer, hepatocellular carcinoma and cervical cancer [13][14][15][16].
The gene for the anti-inflammatory cytokine interleukin-4 (IL4) presents a VNTR polymorphism (intron 3 VNTR-70 bp, rs79071878) in its third intron. Different alleles of this polymorphism modulate gene expression. The allele with two repeats (A2) is related to higher gene expression, while the allele with three repeats (A3) is related to lower expression [17,18]. A homozygous genotype for higher IL4 expression (A2/A2) is associated with a worse prognosis in bladder cancer patients and with oral and pharyngeal cancers [19,20].
Transcription factor NFKB1 plays an important role in the inflammatory process, and it can influence cancer development and aggressiveness, increasing tumour angiogenesis and repressing the immune response [21]. The NFKB1 gene carries an INDEL polymorphism in the promoter region (−94 indel ATTG, rs28362491) that exerts a regulatory effect on gene expression. The insertion allele is associated with an increase in promoter activity and protein synthesis [22].
Several studies on the association between rs28362491 and the risk of developing cancer show conflicting results [9,23,24,25]. In a recent meta-analysis of 21 case-control studies, Yang et al. [21] showed that the insertion allele for this polymorphism is significantly associated with a risk of developing oral, prostate and ovarian cancers. The analyses also reveal similar results in Asian populations, but not in European populations, which suggests ethnic variations in predisposition to different types of cancer.
The receptor PAR1 is a member of the superfamily of G protein-coupled membrane receptors [26]. These receptors regulate processes ranging from vascular integrity to systemic inflammation. Activation of the PAR1 receptor in epithelial cells, macrophages and endothelial cells promotes the release of pro-inflammatory mediators, such as TNF, IL1β, IL2, IL6, CXCL8 and CCL2 [27]. The PAR1 gene features an INDEL polymorphism (−506 indel-13 bp, rs11267092) located in the promoter region [28]. The deletion allele (Del) is related to a better prognosis in breast, stomach and oesophageal cancers [29][30][31].
The polymorphisms described above exhibit common features. (1) They are functional polymorphisms that alter the expression of genes that participate in metabolic pathways associated with carcinogenesis. (2) Such genes are associated with different types of cancer with high incidence in the Brazilian population, especially prostate, stomach, oesophageal, breast and cervical cancers [32]. (3) The genes vary in populations with different ethnic and geographic origins [33]. (4) Information on allele distributions in Native Americans and admixed populations, such as the Brazilian population, is limited. (5) From a technical perspective, all of the investigated polymorphisms are small INDELs; therefore, genotyping may be performed using a single PCR step followed by capillary electrophoresis, which is an accessible and low-cost laboratory technique.
The Brazilian population is one of the most heterogeneous populations worldwide, formed by an admixture of Native Americans, Europeans and Africans. This heterogeneity has been well documented in several genetic investigations [34][35][36][37][38][39]. The admixture process proceeded differently across the Brazilian geographic regions. As a result, the Native American contribution is more pronounced in Northern Brazil; the African contribution is higher in Northeast Brazil; and the South features a European predominance with little Native American and African influence [39].
Knowledge of the allelic frequency distributions of these polymorphisms [IL1A (rs3783553), IL4 (rs79071878), NFKB1 (rs28362491) and PAR1 (rs11267092)] is necessary for association studies on cancer, yet such data are limited for Brazilian populations. Thus, the aim of this study is to characterize allelic distributions in a representative sample of the Brazilian population from all geographic regions of the country.

Results
The allele and genotype frequencies of the investigated polymorphisms (rs3783553, rs79071878, rs28362491 and rs11267092) in samples from the five Brazilian geographic regions and in the parental populations are presented in Table 1. The genotypic distributions of the markers were in Hardy-Weinberg equilibrium in the investigated populations, except for the NFKB1 gene marker in the European sample (p = 0.003).
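As an illustration of what a Hardy-Weinberg check involves, the following minimal Python sketch applies a chi-square goodness-of-fit test to the genotype counts of a biallelic marker. It is not the procedure implemented in Arlequin, which was used in this study; the function name `hwe_chi_square` and the example counts are hypothetical and do not come from Table 1.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes of a biallelic
    marker and returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A by direct counting
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # One degree of freedom (3 genotype classes - 1 - 1 estimated frequency).
    # For df = 1 the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts with a marked heterozygote deficit.
chi2, p_value = hwe_chi_square(40, 20, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

A small p-value, as in this made-up example, indicates a departure from the genotype proportions expected under Hardy-Weinberg equilibrium.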
When we compared allelic distributions in the parental populations, all markers exhibited significantly different allele frequencies between the three groups, and the Native American population differed most from the other two continental populations (Table 2). Across the four markers, the average allele frequency difference (δ value) between the Native Americans and Africans was 34 %; between the Native Americans and Europeans, 37 %; and between the Europeans and Africans, 20 %.
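The averaging of δ values can be sketched as follows, assuming that δ for one marker is the absolute difference in the frequency of the same allele between two populations. The helper name `mean_delta` and the frequencies below are hypothetical placeholders, not the values reported in Table 1.

```python
def mean_delta(freqs_a, freqs_b):
    """Average absolute allele-frequency difference across shared markers.

    Each argument maps a marker ID to the frequency of the same
    reference allele in one population.
    """
    markers = freqs_a.keys() & freqs_b.keys()
    if not markers:
        raise ValueError("the two populations share no markers")
    return sum(abs(freqs_a[m] - freqs_b[m]) for m in markers) / len(markers)

# Hypothetical allele frequencies for two populations (illustration only).
pop_x = {"rs3783553": 0.60, "rs79071878": 0.30,
         "rs28362491": 0.50, "rs11267092": 0.20}
pop_y = {"rs3783553": 0.20, "rs79071878": 0.70,
         "rs28362491": 0.40, "rs11267092": 0.50}

print(f"mean delta = {mean_delta(pop_x, pop_y):.2f}")
```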
For the geographic region comparisons (Table 2), rs11267092 (PAR1 gene) showed no significant difference between the Brazilian regions. The distributions of the other three polymorphisms (rs3783553, rs79071878 and rs28362491) were statistically similar between the northeast, south and southeast regions.
The analyses showed statistically significant differences in the rs3783553, rs79071878 and rs28362491 polymorphism distributions between the northern population and the populations in the other regions, except for the rs3783553 polymorphism in the midwest region (p = 0.543). This polymorphism exhibited a significantly different distribution in the midwest population compared with the northeast, south and southeast regions.
All polymorphisms investigated here have been previously described as associated with a predisposition to some form of cancer [14,15,21,24,25] or with the prognosis of the disease [19,29,30,31]. We propose a different approach to analysing the population data. Under this approach, we quantify the alleles associated with a risk of cancer or a worse prognosis for cancer (referred to as potentially deleterious alleles [40]) and the proportion of individuals who carry them. These alleles are the deletion allele for rs3783553, allele A2 for rs79071878, the insertion allele for rs28362491 and the insertion allele for rs11267092.
Based on this analysis, the proportion of potentially deleterious alleles is higher in the Native American population (50 %) than the African (44 %) and European (35 %) populations. Among Brazilians, the proportion of potentially deleterious alleles is 40 %. Likewise, the proportion of individuals who are carriers of six or more alleles associated with cancer is greater among the Native Americans (8.6 %) than the Africans (5.2 %) and Europeans (1.9 %), and the proportion of Brazilians

Discussion
This is the most comprehensive study on the rs3783553, rs79071878, rs28362491 and rs11267092 polymorphism variations in Brazil and among its ancestral populations. This is also the first study to describe the variability of these markers in Native American populations.
We evaluated the distribution of allele and genotype frequencies for these four markers among the pairs of ancestral and Brazilian populations using the Chi-square test with the due statistical corrections (false discovery rate) to avoid spurious correlations. This test is adequate to compare two or more populations for a qualitative variable [41]. Furthermore, our sample size is representative of the populations studied here (928 Brazilians, 222 Native Americans, 211 Africans and 268 Europeans) and adequate for using INDEL-type polymorphisms, which present a low mutation rate compared with other types of polymorphisms [42].
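For readers unfamiliar with the test, a pairwise comparison of genotype counts between two populations can be sketched in Python as a Pearson chi-square test on a 2 × 3 table. This is an illustrative reimplementation, not the BioEstat routine used in the study; the function name `genotype_chi_square` and the counts are hypothetical.

```python
import math

def genotype_chi_square(counts_a, counts_b):
    """Pearson chi-square test on a 2 x 3 table of genotype counts.

    Each argument holds the counts of the three genotypes in one
    population. Returns (chi-square statistic, p-value) for df = 2.
    """
    rows = (counts_a, counts_b)
    row_totals = [sum(row) for row in rows]
    col_totals = [a + b for a, b in zip(counts_a, counts_b)]
    total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / total
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    # With two degrees of freedom, the chi-square survival function
    # has the closed form exp(-x / 2).
    return chi2, math.exp(-chi2 / 2.0)

# Hypothetical genotype counts (e.g. Ins/Ins, Ins/Del, Del/Del).
chi2, p_value = genotype_chi_square((30, 10, 10), (10, 10, 30))
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

Because many population pairs and markers are compared, the resulting p-values would still need a multiple-testing correction such as the false discovery rate mentioned above.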
The data from the ancestral populations investigated here reveal considerable heterogeneity between continental populations. All investigated markers present δ values greater than 30 %, except for the NFKB1 gene polymorphism. In this set of markers, the Native American populations were the most differentiated among the investigated continental populations.
Given the differences in the observed allelic distribution between the parental populations, we tested the hypothesis that these differences were due to population sampling. To do so, we used data published by the 1000 Genomes project (Table 3). We verified that the reported frequencies for the IL1A, NFKB1 and PAR1 markers (the three markers for which data were available) in the African and European populations are similar to the frequencies observed herein. Thus, we discard the hypothesis and assume that our samples are representative of the European, African and Native American populations.
Despite the frequency differences between the parental populations and the intense and heterogeneous process of admixture that formed the current Brazilian population, the allelic distribution is relatively homogeneous throughout most of the country. In general, only the northern population exhibits an allele distribution that significantly differs from the other geographic regions for three of the four investigated markers.
We understand that these differences may be explained by a greater contribution of Native Americans to the northern populations. Previous estimates using different types of genetic markers show that the greatest Native American genetic contribution among Brazilians occurs in the northern populations [36,38,39,43]. Moreover, data generated in the present work show that Native Americans form the most differentiated group of all the continental populations that form the Brazilian population. Therefore, we believe that, together, these two factors may explain the observed differentiation in the Northern Brazil population.
Our analyses involving potentially deleterious alleles demonstrate that the Native American population presents a higher proportion (50 %) of these alleles compared with the other parental populations (44 and 35 % for African and European, respectively). This analysis holds when considering the proportion of carriers with six or more (among eight possible) potentially deleterious alleles.
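The carrier count described here can be sketched in Python as follows. The risk alleles are those listed in the Results (deletion for rs3783553, A2 for rs79071878, insertion for rs28362491 and rs11267092); the allele labels, function names and the two example individuals are hypothetical.

```python
# Potentially deleterious allele per marker, as listed in the Results.
# The labels "Del", "Ins" and "A2" are an assumed encoding.
RISK_ALLELES = {
    "rs3783553": "Del",
    "rs79071878": "A2",
    "rs28362491": "Ins",
    "rs11267092": "Ins",
}

def risk_allele_count(genotypes):
    """Count risk alleles (0 to 8) in one individual's four genotypes.

    `genotypes` maps a marker ID to a pair of allele labels.
    """
    return sum(
        allele == RISK_ALLELES.get(marker)
        for marker, pair in genotypes.items()
        for allele in pair
    )

def carrier_proportion(individuals, threshold=6):
    """Proportion of individuals carrying at least `threshold` risk alleles."""
    if not individuals:
        return 0.0
    carriers = sum(risk_allele_count(g) >= threshold for g in individuals)
    return carriers / len(individuals)

# Two made-up individuals: one carries 8 risk alleles, the other 2.
sample = [
    {"rs3783553": ("Del", "Del"), "rs79071878": ("A2", "A2"),
     "rs28362491": ("Ins", "Ins"), "rs11267092": ("Ins", "Ins")},
    {"rs3783553": ("Ins", "Del"), "rs79071878": ("A3", "A3"),
     "rs28362491": ("Del", "Ins"), "rs11267092": ("Del", "Del")},
]
print(carrier_proportion(sample))
```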
However, we cannot evaluate how this genetic composition may be associated with a higher incidence of cancer in this population: Native American populations form traditional communities that reside in outlying areas of urban centres, and epidemiological data on the incidence of different types of cancer among these populations are not available.
Despite the absence of epidemiological data, recent studies demonstrate that different forms of cancer are associated with a higher (or lower) contribution from Native American ancestry in admixed populations from South America. These investigations addressed acute lymphoblastic leukaemia [44], breast cancer [45] and gastric cancer [45][46][47].

Conclusion
In summary, our study shows that the allelic distribution of the IL1A (rs3783553), IL4 (rs79071878), NFKB1 (rs28362491) and PAR1 (rs11267092) gene polymorphisms differs between European, African and Native American populations. However, the same heterogeneity is not observed between regional populations in Brazil, except for the northern region, which significantly differs from the other four Brazilian regions.
Moreover, the results show that the Native American population includes a greater proportion of carriers with six or more alleles associated with cancer, which suggests that this population may have a higher risk of developing (or worse prognosis for) diseases associated with these alleles. The presented genetic data may be useful for future studies on the association between these polymorphisms and cancer in these populations.

Study population
The study population consists of 928 non-related and cancer-free adult individuals, recruited in ten Brazil-  South region. Additional details may be found in a previous study [48].
Informed consent for DNA analysis was obtained from healthy individuals for research purposes. Ethics approval was obtained from the local committee of Instituto de Ciências da Saúde, Universidade Federal do Pará.

Genotyping of investigated polymorphisms
The four polymorphisms were genotyped in a single multiplex reaction with the Master Mix QIAGEN® Multiplex PCR kit (Qiagen, Hilden, Germany) and the primers described in Table 4.
Multiplex PCR products were separated and analyzed by capillary electrophoresis on the ABI 3130 Genetic Analyzer instrument (Applied Biosystems), using GS-500 LIZ as the molecular weight size standard (Applied Biosystems), the G5 virtual filter matrix and POP7 (Applied Biosystems). After data collection, samples were analyzed in GeneMapper® 3.7 software (Applied Biosystems).

Statistical analyses
Allelic and genotypic frequencies were obtained by direct counting, and the δ value (delta value) was determined by subtracting the allele frequencies of the studied parental populations, as described by Santos et al. [39]. Deviations from Hardy-Weinberg equilibrium were tested in Arlequin 3.1 software [51]. Differences in genotypic frequencies between Brazilian regions and parental populations were assessed with the Chi-square test (χ² test, df = 2) in BioEstat software [52]. The FDR (False Discovery Rate) method was used to correct for multiple comparisons [53]. These analyses were performed in the R statistical package. P values lower than 0.05 were considered significant.
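A false discovery rate correction can be sketched in Python as below. The sketch assumes the Benjamini-Hochberg step-up procedure, which the text does not name explicitly; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adjusted]
print(adjusted, significant)
```

A comparison would then be called significant when its adjusted value falls below the 0.05 threshold stated above.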