Association of genetic variation in the NR1H4 gene, encoding the nuclear bile acid receptor FXR, with inflammatory bowel disease

Background Pathogenesis of inflammatory bowel diseases (IBD), ulcerative colitis (UC) and Crohn’s disease (CD), involves interaction between environmental factors and inappropriate immune responses in the intestine of genetically predisposed individuals. Bile acids and their nuclear receptor, FXR, regulate inflammatory responses and barrier function in the intestinal tract. Methods We studied the association of five variants (rs3863377, rs7138843, rs56163822, rs35724, rs10860603) of the NR1H4 gene encoding FXR with IBD. 1138 individuals (591 non-IBD, 203 UC, 344 CD) were genotyped for five NR1H4 genetic variants with TaqMan SNP Genotyping Assays. Results We observed that the NR1H4 SNP rs3863377 is significantly less frequent in IBD cases than in non-IBD controls (allele frequencies: P = 0.004; wild-type vs. SNP carrier genotype frequencies: P = 0.008), whereas the variant rs56163822 is less prevalent in non-IBD controls (allele frequencies: P = 0.027; wild-type vs. SNP carrier genotype frequencies: P = 0.035). The global haplotype distribution between IBD and control patients was significantly different (P = 0.003). This also held true for the comparison between non-IBD and UC groups (P = 0.004), but not for the comparison between non-IBD and CD groups (P = 0.079). Conclusions We show that genetic variation in FXR is associated with IBD, further emphasizing the link between bile acid signaling and intestinal inflammation.


Background
Inflammatory bowel disease (IBD) is a chronic condition characterized by recurrent inflammation of intestinal mucosa, and results from aberrant regulation of the mucosal immune system in genetically susceptible individuals. The etiology of IBD involves a complex interaction of genetic, environmental, and immunomodulatory factors. Two major forms of chronic mucosal inflammation have been defined. In Crohn's disease (CD), the whole gastrointestinal tract may be affected, although the most frequent site of inflammation is the terminal ileum, whereas in ulcerative colitis (UC) the mucosal inflammation typically affects the colon [1]. For CD pathogenesis, a strong genetic component has been suggested by the concordance of 63.6% in monozygotic twins, but only of 3.6% in dizygotic twins. The concordance of monozygotic twins is lower (6%) in UC, indicating that genetic susceptibility may play a somewhat smaller role in this disease [2].
Nuclear receptors are a large family of transcription factors that are involved in the regulation of numerous processes, including reproduction, development, and a wide range of metabolic pathways [3]. The ligand-dependent activation function at the carboxy-terminus of most nuclear receptors allows them to sense metabolic changes within cells, and orchestrate rapid transcriptional changes in response [4][5][6].
The farnesoid X receptor (FXR; gene symbol NR1H4) is a nuclear receptor that functions as the main sensor of intracellular bile acid levels [7][8][9]. The human NR1H4 gene is located on chromosome 12 and is composed of 11 exons and 10 introns [10]. The translation initiation codon of the NR1H4 gene lies at the 3' end of exon 3, whereas exons 1 and 2, together with the 5' region of exon 3, contain the 5' untranslated region (5'-UTR). Multiple FXR isoforms can be generated via alternative promoter usage and alternative splicing, and these isoforms may have differential transactivation abilities on specific target promoters [11]. FXR typically acts by binding to FXR response elements within the target promoters as a heterodimer with another member of the nuclear receptor family, retinoid X receptor-α (RXRα) [12]. In response to elevated levels of intracellular bile acids, activated FXR is well known to induce protective gene expression circuits against bile acid toxicity in the liver and intestine [13]. Expression of bile acid efflux systems in ileocytes (organic solute transporter α/β; OSTα/β) [14,15] and hepatocytes (bile salt export pump; BSEP) [16][17][18] is upregulated by bile acid-activated FXR, while the expression of the respective bile acid uptake systems, apical sodium-dependent bile acid transporter (ASBT) [19] and Na+-taurocholate cotransporting polypeptide (NTCP) [20,21], is suppressed by it. FXR also represses transcription of three genes coding for bile acid synthesizing enzymes, namely cholesterol-7α-hydroxylase (CYP7A1), sterol-12α-hydroxylase (CYP8B1), and sterol-27-hydroxylase (CYP27A1) [22,23]. Thus, elevated levels of bile acids can suppress their own de novo production through a negative feedback loop involving FXR. In addition, FXR regulates several genes that can protect against intestinal inflammation and bacterial overgrowth [24][25][26].
Fxr-deficient mice have increased ileal concentrations of gut bacteria and exhibit defects in the integrity of the intestinal epithelial barrier. In agreement with this, the products of a number of genes that are regulated by Fxr in the ileum, including angiogenin (Ang1), inducible nitric oxide synthase (iNos), and interleukin-18, are known to have antimicrobial actions [26]. Furthermore, it has been reported that reduced expression of Fxr/FXR is associated with colon inflammation in rodent models of colitis and in CD patients [25]. Recently, FXR activation was shown to decrease NF-κB-mediated immune responses and intestinal permeability in mouse models of colitis [27]. It was subsequently shown that intestinal inflammation reduces FXR activation as well as the expression of FXR target genes such as intestinal bile acid-binding protein (IBABP) and fibroblast growth factor 15/19 (FGF15/19) [28]. In agreement with this, it has been proposed that FXR may contribute to the resistance of both human and mouse gastric epithelial cells against inflammation-induced injury [29]. The apparent role of FXR in protecting the integrity of the intestinal epithelial barrier, together with its inverse correlation with the level of intestinal inflammation, suggests a potential connection between FXR and the molecular pathogenesis of IBD. FXR variants have been previously studied in association with several liver diseases, such as gallstone disease [30], cholangiocarcinoma [31], intrahepatic cholestasis of pregnancy [32], and idiopathic infantile cholestasis [33]. Here, we have investigated five NR1H4 single nucleotide polymorphisms (two common SNPs and three rare variants), which have been previously studied in the context of human disease, in a well-sized IBD vs. non-IBD cohort, and report that two of these genetic variants are associated with IBD.

Study population
The study population was European and consisted of more women (806) than men (332). Detailed demographic data are given in Table 1.

NR1H4 sequence variability
All five NR1H4 variants selected for the study are single nucleotide substitutions, previously identified within the NR1H4 gene (www.ncbi.nlm.nih.gov/snp/). Three of these SNPs can be considered as rare variants: rs3863377, rs56163822, and rs7138843, with reported minor allele frequencies (MAF) of 4%, 2.2%, and 0.9%, respectively. The other two variants, rs10860603 and rs35724, are common SNPs, with minor allele frequencies of 20.5% and 40.8% (www.ncbi.nlm.nih.gov/snp/) in European individuals. The genotype frequencies in all groups were in Hardy-Weinberg equilibrium (represented by the χ² values in Table 2). The obtained allele and genotype frequencies are given in Tables 3 and 2, respectively.
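The Hardy-Weinberg check mentioned above can be illustrated with a minimal sketch. It assumes a Pearson χ² goodness-of-fit test with one degree of freedom on the three genotype counts of a biallelic SNP; the function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from Table 2.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are the observed genotype counts at a biallelic SNP.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, chosen only for illustration
chi2, p_value = hwe_chi_square(360, 480, 160)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A large P-value indicates no detectable deviation from Hardy-Weinberg proportions. For rare variants with small expected counts, an exact test would be more appropriate than this χ² approximation.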

Genetic variation in the NR1H4 gene and IBD
The NR1H4 SNP variants rs3863377 and rs56163822 were found to be significantly associated with IBD when considering an uncorrected significance level of P<0.05. Upon statistical analysis of the allele (Table 3) and genotype (Table 2) frequencies, we observed that the NR1H4 variant rs3863377 is significantly less frequent in IBD cases than in non-IBD controls (allele frequencies: P = 0.004; wild-type vs. SNP carrier genotype frequencies: P = 0.008), even when considering a Bonferroni-corrected significance level of P<0.01. Upon subgrouping the IBD patients, the significance of the inverse association of the rs3863377 SNP remained for the CD patients when considering an uncorrected significance level of P<0.05 (allele frequencies: P = 0.015; wild-type vs. SNP carrier genotype frequencies: P = 0.024), but not for the UC group (allele frequencies: P = 0.075; wild-type vs. SNP carrier genotype frequencies: P = 0.083). Conversely, the variant rs56163822 is less prevalent in non-IBD subjects than in IBD patients (allele frequencies: P = 0.027; wild-type vs. SNP carrier genotype frequencies: P = 0.035), an observation that is, however, not significant when considering a corrected significance level of P<0.01. Upon subgrouping the patient cohort according to IBD subtypes, the uncorrected association remained significant only for the UC group (allele frequencies: P = 0.036; wild-type vs. SNP carrier genotype frequencies: P = 0.034). Upon adjustment for age and gender, the uncorrected significance (P<0.05) of genotype frequency association with IBD remained for rs3863377, but was reduced to P>0.05 for rs56163822. There were no significant differences in the allele frequency distribution between the subject groups for NR1H4 variants rs7138843, rs10860603, and rs35724.
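To make the logic of the single-SNP comparisons concrete, the following sketch computes an allele-based association test on a 2x2 table and applies the Bonferroni threshold for five SNPs (0.05 / 5 = 0.01). It is an illustration under stated assumptions, not the SPSS procedure used in the study: it uses the Pearson χ² test without continuity correction, a Woolf-type 95% confidence interval for the odds ratio, and hypothetical allele counts. The function name `allele_association` is likewise hypothetical.

```python
import math


def allele_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square test (1 df) and odds ratio for a 2x2 allele table."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    odds_ratio = (a * d) / (b * c)
    # Woolf (log-based) 95% confidence interval; assumes no zero cells
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    return chi2, p_value, odds_ratio, ci


N_SNPS = 5
BONFERRONI_ALPHA = 0.05 / N_SNPS  # 0.01, as used for the five SNPs

# Hypothetical allele counts (not the study data)
chi2, p_value, odds_ratio, ci = allele_association(20, 980, 40, 960)
significant_after_bonferroni = p_value < BONFERRONI_ALPHA
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}, OR = {odds_ratio:.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
      f"significant after correction: {significant_after_bonferroni}")
```

An odds ratio below 1 for the minor allele corresponds to an inverse association of the kind reported for rs3863377. When any cell count is small (10 or less in this study), Fisher's exact test should replace the χ² approximation.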

NR1H4 haplotype analysis
All individuals for whom genotype determination could be performed for all five NR1H4 SNPs under study were included in the haplotype prediction analyses. Four hundred and eighty-six non-IBD subjects, along with 243 CD patients and 150 UC patients, were thus haplotyped. Twenty haplotypes and up to thirty-nine diplotypes were predicted by the software FAMHAP to exist in the studied cohort. NR1H4 haplotypes were significantly differentially distributed in the IBD and control groups (Table 4, P = 0.003) upon global haplotype distribution analysis. This observation held partially true upon stratification according to disease subtype. Here, the haplotype frequencies differ significantly between the UC patients and the non-IBD control group (Table 5, P = 0.004), but not between the CD patients and the control subjects (Table 6, P = 0.079). We particularly note that the haplotype 14, GTTGC, is predicted to occur significantly (P = 0.005) more frequently in the UC group in comparison with the non-IBD cohort. This haplotype harbours the more frequent allele G at the first SNP position rs3863377, which was shown to be significantly associated with IBD even after Bonferroni correction. It is thus a possible risk haplotype for the development of IBD, although we note that the overall frequency for this haplotype is rather low. Upon best reconstruction analysis, the significance of the association of the haplotype 14, GTTGC, with the UC group was, however, lost (P = 0.012) (Table 5, italicized section). No significant associations were observed for the predicted diplotype patterns and IBD (data not shown). As shown in the LD plot (Figure 1), there was no significant linkage disequilibrium between any of the five NR1H4 SNPs studied.

Table footnotes: According to the National Center for Biotechnology Information (NCBI) SNP database. The Chi-square (χ²) test was used to determine statistical significance, except that Fisher's exact test was used when the cell count was 10 or less (†). *, P<0.05; **, P<0.01. P', P value adjusted for age and gender. OR, odds ratio; CI, confidence interval; ns, not significant.

Table 4 Haplotype analysis of IBD patients and non-IBD controls

Table notes, UC vs. non-IBD control comparison: The calculation of odds ratios and a global P-value for haplotype distribution was performed on these results using FAMHAP. The haplotype order follows a priority ranking within the control group. Nineteen haplotypes were predicted. The haplotype base positions correspond to a) rs3863377, b) rs56163822, c) rs7138843, d) rs10860603, and e) rs35724. As FAMHAP tested all haplotypes with a frequency >0.01, the frequency distribution of eight haplotypes (1-7, 14) was included. Relevant P-values for four haplotypes were displayed by FAMHAP. One single haplotype (14) was significantly differentially distributed between the UC and control groups. OR, odds ratio; NA, not applicable: FAMHAP did not calculate an odds ratio on these results because of a low predicted haplotype frequency value. Right table side (italics): Most likely occurring haplotype frequencies (haplotypes in best reconstruction) predicted by FAMHAP. Based on these predictions, significance tests were performed. P-values were calculated using Fisher's exact test or, if marked with a), the Chi-square test. A Bonferroni-corrected P-value of 0.003 (19 tests) was taken as significance level. According to this, none of the haplotypes appeared to be significantly differently distributed between the UC and control groups. OR, odds ratio; C.I., confidence interval; NA, OR not applicable because at least one cell count has the value zero; ns, not significant and P > 0.05.

Table notes, CD vs. non-IBD control comparison: The calculation of odds ratios and a global P-value for haplotype distribution was performed on these results using FAMHAP. The haplotype order follows a priority ranking within the control group. Nineteen haplotypes were predicted. The haplotype base positions correspond to a) rs3863377, b) rs56163822, c) rs7138843, d) rs10860603, and e) rs35724. As FAMHAP tested all haplotypes with a frequency >0.01, the frequency distribution of eight haplotypes (1-7, 9) was included. Relevant P-values for three haplotypes were displayed by FAMHAP. None of the individual haplotypes are significantly differentially distributed between the CD and control groups. OR, odds ratio; NA, not applicable: FAMHAP did not calculate an odds ratio on these results because of a low predicted haplotype frequency value. Right table side (italics): Most likely occurring haplotype frequencies (haplotypes in best reconstruction) predicted by FAMHAP. Based on these predictions, significance tests were performed. P-values were calculated using Fisher's exact test or, if marked with a), the Chi-square test. A Bonferroni-corrected P-value of 0.003 (19 tests) was taken as significance level. According to this, none of the haplotypes appeared to be significantly differently distributed between the CD and control groups. OR, odds ratio; C.I., confidence interval; NA, OR not applicable because at least one cell count has the value zero; ns, not significant and P > 0.05.
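The idea behind a global, permutation-based comparison of haplotype distributions between cases and controls can be sketched as follows. This is a simplified illustration and not FAMHAP's implementation: it assumes that each chromosome has already been assigned a single haplotype label (as in a best reconstruction), whereas FAMHAP works with likelihood-weighted haplotype explanations. The function names and the haplotype labels `H1` to `H3` are hypothetical.

```python
import random
from collections import Counter


def chi_square_stat(cases, controls):
    """Pearson chi-square statistic for a (haplotype x group) table."""
    case_counts, ctrl_counts = Counter(cases), Counter(controls)
    n_case, n_ctrl = len(cases), len(controls)
    n = n_case + n_ctrl
    stat = 0.0
    for hap in set(case_counts) | set(ctrl_counts):
        row_total = case_counts[hap] + ctrl_counts[hap]
        for observed, col_total in ((case_counts[hap], n_case),
                                    (ctrl_counts[hap], n_ctrl)):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(cases, controls, n_perm=999, seed=0):
    """Global P-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = chi_square_stat(cases, controls)
    pooled = list(cases) + list(controls)
    n_case = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = chi_square_stat(pooled[:n_case], pooled[n_case:])
        if permuted >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Hypothetical haplotype labels per chromosome (not the study data)
cases = ["H1"] * 60 + ["H2"] * 30 + ["H3"] * 10
controls = ["H1"] * 70 + ["H2"] * 28 + ["H3"] * 2
print(f"global P = {permutation_p_value(cases, controls):.3f}")
```

Because the null distribution is built from the data themselves, this approach does not rely on the test statistic following a χ² distribution, which is useful when some haplotypes are rare.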

Discussion
The complex pathophysiology of IBD remains largely unelucidated, although multiple factors, both genetic and environmental, are clearly involved. SNPs and mutations within several genes have been proposed to be associated with the risk of developing IBD. Prior studies have revealed more than 70 genes that are potentially associated with IBD [34,35]. The region on chromosome 16q11-12 named IBD1 was identified in 1996, and the fine mapping of this region led to the identification of the NOD2 (nucleotide-binding oligomerization domain 2)/CARD15 (caspase activation recruitment domain 15) gene [36,37], which encodes a member of a family of pattern recognition receptors (PRRs) that recognizes microbial components and modifies inflammatory responses to bacterial triggers such as lipopolysaccharides (LPS), through the activation of NF-κB [38,39]. Furthermore, genes that play roles in immunological cell-cell interactions and signaling, such as the tumor necrosis factor receptor 1 (TNFR1) [40], the interleukin-23 receptor (IL23R) [41], and other genes that are involved in immune response to bacteria, such as the toll-like receptor 4 (TLR4) [42,43], have been proposed to be associated with IBD. In addition, regulatory genes, such as the protein tyrosine phosphatase N2 (PTPN2) [44,45] and the anti-inflammatory nuclear receptor peroxisome proliferator-activated receptor-γ (PPARγ) [46], as well as genes encoding the membrane transporters multidrug resistance gene 1 (MDR1) [47][48][49] and the organic cation transporter 1/2 (OCTN1/2) [47,50], have been proposed to be associated with the risk of chronic mucosal inflammation.
In this report, we describe the identification of single nucleotide polymorphisms associated with the diagnosis of IBD within the NR1H4 gene, encoding the nuclear receptor for bile acids, FXR, in a well-sized European cohort. Five NR1H4 SNPs were analyzed, all of which have previously been studied in the context of other human disease conditions: rs3863377, rs7138843, rs56163822, rs35724, and rs10860603. The NR1H4 variants rs7138843 and rs56163822 have been previously shown to be inversely associated with cholelithiasis in a Mexican population and may thus play a protective role in gallstone disease, while the variant rs3863377 showed no association with cholelithiasis [51]. The variant rs56163822 was found to be more common in a British control group than in patients with intrahepatic cholestasis of pregnancy (ICP), although this difference did not reach statistical significance [32]. NR1H4 variants rs35724 and rs10860603 have been previously shown to be significantly associated with elevated body mass index and obesity [52]. IBD is often associated with hepatobiliary manifestations [53,54], implying that the etiology of the diseases affecting the two organs, intestine and liver, may have common factors. This is also supported by our findings that the same NR1H4 genetic variants may be associated with both.
The variant rs3863377 is located in the 5' region of the NR1H4 gene, whereas rs7138843 lies within NR1H4 intron 7, and variants rs35724 and rs10860603 lie within NR1H4 intron 9. In none of these cases is it known how the presence of the SNP may affect the expression and/or molecular function of FXR. As the SNP rs3863377 is located within the 5' region, it may alter a binding site for a transcription factor and may thus affect NR1H4 gene expression. The intronic SNPs rs7138843, rs10860603, and rs35724 could potentially influence splicing of the FXR mRNA. The substitution -1 G > T in rs56163822 lies in the base position adjacent to the translation initiation site, and was shown to lead to reduced FXR protein expression and a decreased level of FXR-dependent promoter activation in human embryonic kidney cells. In another study, the functional activity of the -1 G > T variant also appeared to be compromised, although transcriptional and translational efficiencies of the variant appeared comparable to the wild-type in cell-free assays and in HeLa cells [55]. Interestingly, the mRNA expression levels of the FXR target genes SHP and OATP1B3 are significantly reduced in the livers of the carriers of the rs56163822 allele, while the FXR mRNA expression level remains comparable, further indicating that this polymorphism may lead to weakened function rather than to a reduced expression level of FXR.
In our current genotyping analysis we have found that for the NR1H4 variant rs3863377, the IBD population has a significantly lower frequency of carriers of the rarer allele than the healthy population, suggesting that this 5' region SNP may confer a protective effect against the disease. In the case of the rs56163822 NR1H4 variant, the rare allele is significantly more prevalent in the IBD population, suggesting that previously reported reduced FXR function exhibited by this variant may contribute to IBD pathogenesis. In the case of the rare NR1H4 variant under study, rs7138843, and the common SNPs rs10860603 and rs35724, no significant differences between the study populations were observed. In agreement with the associations observed for two of the five single SNP variants, the predicted global haplotype pattern was significantly different in IBD patients and non-IBD controls.
In our study, five NR1H4 SNPs were investigated. During the preparation of our manuscript, Nijmeijer et al. [56] published a study showing that mRNA expression of FXR and its target gene SHP are decreased in the ileum of Crohn's disease patients, in further support of the importance of the role for FXR in IBD. These authors also studied the potential association of nine NR1H4 SNPs with IBD in a Dutch population, but did not discover any associations that remained significant upon correction for multiple testing. We note that in their analysis Nijmeijer et al. did not include the SNP rs3863377, whose inverse association with IBD remained significant even after Bonferroni correction in the current study. As numerous further polymorphisms are known to exist in the NR1H4 gene (www.ncbi.nlm.nih.gov/snp/), our report, as well as that by Nijmeijer et al., serves as an initial characterization of the role of FXR genetic variants in IBD. Furthermore, these findings warrant further studies into genetic variants in the NR1H4 gene in the context of other inflammatory conditions affecting further tissues that express FXR.
FXR ligands, such as the hydrophilic bile acid ursodeoxycholic acid, have been proposed as attractive options for the therapy of liver diseases, such as cholestatic disease and non-alcoholic fatty liver disease [57]. Our finding that FXR genetic variants are associated with IBD, together with prior observations on FXR expression being altered in Crohn's disease [56] and on FXR promoting intestinal barrier integrity [27] and antibacterial defence [26], further emphasizes the potential benefits of FXR ligand administration also in IBD. We further speculate that testing for genetic variation in the NR1H4 gene may contribute to the early IBD diagnosis and prediction of therapy response in the future.

Conclusions
In conclusion, our results further support the role for FXR as a modulator of intestinal inflammation and as an important player in enteroprotection. The link between the bile acid receptor FXR and IBD also further emphasizes the potential importance of bile acid homeostasis and metabolism in the pathogenesis of IBD.

Study subjects
The study population was European and comprised 591 healthy subjects and 547 IBD patients, of whom 203 were diagnosed with UC and 344 with CD. The IBD subjects were recruited at the centers participating in the Swiss Inflammatory Bowel Disease Cohort Study (SIBDCS) [58]. For the IBD patients, the diagnosis of UC or CD was confirmed by the study investigators based on clinical presentation, endoscopic findings, and histology. Non-IBD controls were recruited from gastroenterological patients undergoing surveillance colonoscopy, and showed no symptoms of IBD. History of colorectal cancer was used as an exclusion criterion for both IBD patients and non-IBD controls. All subjects provided their written informed consent to be included in the study. Ethical approvals were obtained from the local medical ethical committees of all study sites involved in the study: 1) The Swiss IBD Cohort

DNA extraction
Genomic DNAs were extracted from either EDTA-blood or intestinal biopsies using the QIAamp DNA Mini Kit (QIAGEN, Hombrechtikon, Switzerland) or the TRIzol reagent (Invitrogen, Basel, Switzerland), respectively, according to the manufacturer's instructions. The genomic DNAs were quantified with a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE) and diluted to a final concentration of 10 ng/μl.

Genotyping of NR1H4 single nucleotide polymorphisms
Genotyping of the five NR1H4 SNPs was performed using TaqMan allelic discrimination assays. The cycling was performed on a 7900HT Fast Real-Time PCR system (Applied Biosystems, Rotkreuz, Switzerland) by using the inventoried TaqMan SNP Genotyping Assays C_28000279_10, C_25598395, C_25598386_10, C_2366616_10, C_2800610_10 for the SNPs rs3863377, rs7138843, rs56163822, rs35724, and rs10860603, respectively. Twenty nanograms of each genomic DNA was used per PCR reaction in a volume of 5 μl. The amplification run conditions were: once 50°C for 2 min, once 95°C for 10 min, and 45 cycles of 95°C for 15 sec and 60°C for 1 min.
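As a small worked example of the reaction set-up arithmetic, the sketch below encodes the stated thermal profile and derives the template volume implied by a 10 ng/μl DNA stock and 20 ng of DNA per reaction. The `Step` class and the helper function are hypothetical names introduced for illustration, and the computed time covers hold times only, since instrument ramp times are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


# Thermal profile as stated in the text
INITIAL_HOLDS = [Step(50, 120), Step(95, 600)]
CYCLE = [Step(95, 15), Step(60, 60)]
N_CYCLES = 45


def total_hold_minutes(holds, cycle, n_cycles):
    """Sum of all hold times in minutes (ramp times excluded)."""
    seconds = sum(s.seconds for s in holds)
    seconds += n_cycles * sum(s.seconds for s in cycle)
    return seconds / 60


# Template volume: 20 ng of DNA taken from a 10 ng/ul stock
DNA_PER_REACTION_NG = 20
STOCK_CONCENTRATION_NG_PER_UL = 10
template_volume_ul = DNA_PER_REACTION_NG / STOCK_CONCENTRATION_NG_PER_UL

print(f"template volume: {template_volume_ul} ul of a 5 ul reaction")
print(f"hold time: {total_hold_minutes(INITIAL_HOLDS, CYCLE, N_CYCLES)} min")
```

Under these assumptions, the template accounts for 2 μl of the 5 μl reaction, and the programmed hold times add up to 68.25 min.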

Statistical analysis
Statistical analysis for the individual SNP associations was performed using the software package SPSS 18 (SPSS Inc., Chicago, IL). The Chi-square test or Fisher's exact test was used to determine associations between individual SNPs and subject phenotypes. A P-value of <0.05 was considered significant in non-corrected statistical tests, and a P-value of <0.01 after correction for multiple testing for the five SNPs (according to Bonferroni). The software package PSPower (http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize) was used for retrospective power calculations. A retrospective power analysis of the applied statistical tests on genotype distributions revealed a power of 0.525 for the SNP rs3863377 and a power of 0.323 for the SNP rs56163822, when considering a Bonferroni-corrected alpha level of 0.01 and the detected ORs as shown in Table 2 for the applied χ² tests. Linkage disequilibria (LD) were calculated using D' statistics and the software package Haploview (www.haploview.com). Haplotype predictions and frequency estimations were performed using the software tool FAMHAP (www.famhap.meb.uni-bonn.de). FAMHAP performs a permutation test on associations between estimated haplotypes and the affection state based on Monte Carlo simulations. The expectation maximization (EM) algorithm was used to obtain maximum-likelihood estimates of the haplotype frequencies of the sample composed of cases and controls. Individuals with several possible haplotype explanations are assigned a likelihood weight for each possible haplotype and its calculated frequency estimate. A contingency table is constructed by summing up all individuals' weighted haplotype explanations for each haplotype, and the chi-square statistic is computed. The corresponding P-value is assessed via Monte Carlo simulation, i.e. in each replication of the algorithm a sample composed of a subgroup of case and control samples is randomly drawn and permuted.
FAMHAP implements the calculation of the global P-value via Monte Carlo simulations because the cell counts used in the contingency table are based on haplotype frequency estimates with increased variances, not on real haplotype counts; as a result, the test statistic does not necessarily follow exactly a chi-square distribution [59,60]. A value of P<0.05 was considered to be significant. Bonferroni-corrected P-values (P<0.006, corrected for eight haplotypes that FAMHAP considered to be relevant to test for) were defined as the significance level for single haplotype comparisons in the white sections of Tables 4-6. In addition, haplotypes in best reconstruction (not weighted) were listed for the case and control groups in Tables 4-6 (grey sections) and used for association analysis by performing Fisher's exact tests or, in case of high cell counts (11 or more), Chi-square tests. Bonferroni-corrected significance levels (P<0.003, corrected for 19 haplotypes) were used for significance testing.
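The EM-based haplotype frequency estimation and the D' statistic described in this section can be illustrated with a deliberately reduced sketch for two biallelic SNPs. It is not the FAMHAP or Haploview implementation, which handle more loci and missing data. Genotypes are assumed to be coded as the number of minor alleles (0, 1, or 2) at each SNP, so that only double heterozygotes have an ambiguous phase. All function names and the genotype data are hypothetical.

```python
from collections import Counter

ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}  # genotype code -> two alleles
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_two_snp_haplotypes(genotypes, n_iter=200, tol=1e-12):
    """EM estimate of two-SNP haplotype frequencies from unphased genotypes."""
    fixed = Counter()   # haplotype counts from unambiguous individuals
    n_double_het = 0    # individuals heterozygous at both SNPs
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1
        else:
            a1, a2 = ALLELES[g1], ALLELES[g2]
            fixed[(a1[0], a2[0])] += 1
            fixed[(a1[1], a2[1])] += 1
    total = 2 * len(genotypes)
    freq = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(n_iter):
        # E-step: weight of the phase (0,0)/(1,1) versus (0,1)/(1,0)
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = {h: float(fixed[h]) for h in HAPLOTYPES}
        expected[(0, 0)] += w * n_double_het
        expected[(1, 1)] += w * n_double_het
        expected[(0, 1)] += (1 - w) * n_double_het
        expected[(1, 0)] += (1 - w) * n_double_het
        # M-step: renormalise expected counts to frequencies
        new_freq = {h: expected[h] / total for h in HAPLOTYPES}
        converged = max(abs(new_freq[h] - freq[h]) for h in HAPLOTYPES) < tol
        freq = new_freq
        if converged:
            break
    return freq


def d_prime(freq):
    """Absolute normalised linkage disequilibrium |D'| between two SNPs."""
    p1 = freq[(1, 0)] + freq[(1, 1)]  # minor allele frequency, SNP 1
    p2 = freq[(0, 1)] + freq[(1, 1)]  # minor allele frequency, SNP 2
    d = freq[(1, 1)] - p1 * p2
    if d >= 0:
        d_max = min(p1 * (1 - p2), (1 - p1) * p2)
    else:
        d_max = min(p1 * p2, (1 - p1) * (1 - p2))
    return abs(d) / d_max if d_max > 0 else 0.0


# Hypothetical genotype data (not the study data)
genotypes = [(0, 0)] * 40 + [(1, 1)] * 12 + [(1, 0)] * 5 + [(0, 1)] * 4 + [(2, 2)]
freq = em_two_snp_haplotypes(genotypes)
print({h: round(f, 4) for h, f in freq.items()}, round(d_prime(freq), 3))
```

The likelihood weight `w` plays the role of the weighted haplotype explanations mentioned above: each phase-ambiguous individual contributes fractionally to both possible haplotype pairs instead of being assigned to one of them.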