HLA Class I and II profiles in São Miguel Island (Azores): genetic diversity and linkage disequilibrium

Background: Human leukocyte antigen (HLA) genes are characterized by high levels of polymorphism and linkage disequilibrium (LD), characteristics that make them useful for studying the genetic background and structure of human populations. Here, we analyse the allele distribution and the extent of LD of HLA class I and II in the population of São Miguel Island (Azores archipelago, Portugal).
Findings: The sample set was composed of 106 healthy blood donors living in São Miguel Island, obtained from the anonymized Azorean DNA bank. HLA class I (-A, -B and -Cw) and class II (-DRB1, -DQB1, -DPA1 and -DPB1) genotyping was performed by PCR-SSP Olerup SSP™ (GenoVision Inc.), according to the manufacturer's instructions. Genetic diversity values, based on the 7 loci, ranged from 0.821 for both HLA-DPA1 and -DQB1 to 0.934 for HLA-B, with a mean value of 0.846. Analysis of 5-locus HLA-A-Cw-B-DRB1-DQB1 haplotypes revealed that A*01-Cw*07-B*08-DRB1*03-DQB1*02 is the most frequent in São Miguel (7.9%), followed by A*24-Cw*07-B*08-DRB1*03-DQB1*02 (3.8%). In addition, despite reports of high LD for HLA markers in populations worldwide, São Miguel islanders do not show extensive LD (average D' = 0.285).
Conclusions: In summary, the results demonstrate high variability of HLA in the São Miguel Island population, as well as the absence of genetic structure and of extensive LD. The data presented here suggest that studies of autoimmune diseases in São Miguel islanders will need to encompass a more focused analysis of HLA extended haplotypes, as well as the evaluation of other non-HLA candidate genes.


Background
The Azores is a Portuguese archipelago composed of nine islands distributed among three geographical groups: the Eastern (São Miguel and Santa Maria), the Central (Terceira, Pico, Faial, São Jorge and Graciosa) and the Western (Flores and Corvo). The Portuguese explorers, who discovered the archipelago in 1427, only started the settlement in 1439, through a long and difficult process. Historical data report a contribution from people with genetic backgrounds other than Portuguese, including Flemish, Spanish, French, Italian, German, Scottish and Jewish settlers, as well as Moorish prisoners and black slaves from Guinea, Cape Verde and São Tomé [1]. São Miguel is the largest island of the Azores and has 131,609 inhabitants (2001 Census, Portugal National Institute of Statistics). Several studies have been performed to characterize the genetic pool of the Azoreans [2][3][4][5][6][7][8][9][10]. These studies report a high genetic variability and heterogeneity of the Azorean population, explained by the settlement history of the islands, in which a major contribution of individuals from mainland Portugal is evident. Moreover, the data revealed an absence of population structure, despite the archipelago's geographical discontinuity and demographic disproportionality. Currently, this knowledge is fundamental for the design and development of pharmacogenetic research and of genetic studies in common diseases, such as cardiovascular and autoimmune diseases.
The human leukocyte antigen (HLA) genes, a central component of the major histocompatibility complex (MHC) on 6p21.3, encode polymorphic class I, II and III molecules that play a major role in the immune response [11]. In addition, HLA loci are characterized by high levels of polymorphism and linkage disequilibrium (LD), characteristics that make them useful for studying the genetic background of human populations, as well as their present-day genetic structure. Here, we analyse the allele frequencies and the extent of LD of HLA class I and II, in order to identify their diversity and haplotype distribution and to gain further insight into the potential use of this genomic region for the study of autoimmune diseases in the São Miguel Island population.

Population samples, genotyping and statistical analysis
The sample set was composed of 106 healthy blood donors living in São Miguel Island, obtained from the anonymized Azorean DNA bank located at the Hospital of Divino Espírito Santo of Ponta Delgada, EPE, the main hospital in the Azores [12]. HLA class I (-A, -Cw and -B) and class II (-DRB1, -DQB1, -DPA1 and -DPB1) genotyping was performed by PCR-SSP Olerup SSP™ (GenoVision Inc.), according to the manufacturer's instructions. PCR products were visualized after electrophoresis on a 4% agarose gel stained with SYBR® Green, and HLA alleles were then identified using the Helmberg-SCORE™ software version 3.320T (Olerup SSP AB, Saltsjöbaden, Sweden).
Average gene diversity and HLA haplotype estimation were carried out using Arlequin v3.0 [13]. The standardized multiallelic disequilibrium coefficient, D', was evaluated with the Haploxt application from the GOLD software. Average D' values were calculated as the arithmetic mean of the values obtained for all marker pairs. Nei's FST genetic distance matrix was computed between pairs of populations by DISPAN [14] and used to construct a Neighbor-Joining (NJ) tree by PHYLIP 3.63 [15]. We employed TreeView 1.6.6 [16] to display the tree phylogenies obtained from NJ. In order to obtain the best results in the population comparisons, a compromise was made between the number of populations and the number of HLA loci. Consequently, HLA-DPA1 and -DPB1 were excluded from that analysis.
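As an illustration of the gene diversity statistic, the following minimal sketch assumes the usual unbiased estimator, H = n/(n-1) × (1 - Σ p_i²), where n is the number of gene copies sampled and p_i the frequency of allele i. The function name and the toy allele list are hypothetical; they are not the study data, and the exact options used in Arlequin are not stated in the text.

```python
from collections import Counter


def gene_diversity(alleles):
    """Unbiased gene diversity for one locus: n/(n-1) * (1 - sum(p_i^2)).

    `alleles` is a flat list of allele names, one entry per gene copy
    (two entries per diploid individual).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two gene copies are required")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy locus (4 individuals = 8 gene copies), not study data.
toy_locus = ["A*01", "A*02", "A*02", "A*24", "A*01", "A*02", "A*02", "A*24"]
print(round(gene_diversity(toy_locus), 3))
```

With 106 individuals, n would be 212 gene copies per locus, so the n/(n-1) correction is small but not zero.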

Results
The analysis of the HLA alleles in the São Miguel Island population (Table 1) revealed a total of 16 different HLA-A alleles, 13 HLA-Cw alleles and 24 HLA-B alleles. Regarding HLA class II loci, we found 22 HLA-DPB1, 13 HLA-DRB1, 5 HLA-DQB1 and 6 HLA-DPA1 different alleles. HLA-B and HLA-DPB1 are the two loci with the highest numbers of alleles, suggesting higher diversity for these markers. The highest allele frequency observed (0.462) was in the HLA-DPA1 gene, which has a low number of alleles. In contrast, the lowest frequency identified (0.005) was present in HLA-A, -B and -DPB1 (Table 1). Genetic diversity values ranged from 0.821 for both HLA-DPA1 and -DQB1 to 0.934 for HLA-B, with a mean value of 0.846 (Table 2). Overall, HLA allele frequencies in São Miguel, mainland Portugal and other European populations showed no statistically significant differences (GST = 0.03; data not shown). According to Wright [17], values of GST smaller than 0.05 indicate little genetic differentiation.
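The GST statistic can be sketched as follows, assuming Nei's definition GST = (HT - HS) / HT, with HS the mean within-population expected heterozygosity and HT the heterozygosity of the pooled (mean) allele frequencies. The function names, the equal weighting of populations and the toy frequencies are assumptions made for illustration only; the text does not state how GST was computed.

```python
def expected_heterozygosity(freqs):
    """1 - sum(p_i^2) for a dict mapping allele -> frequency."""
    return 1.0 - sum(p * p for p in freqs.values())


def g_st(pop_freqs):
    """GST = (HT - HS) / HT for a single locus.

    `pop_freqs` is a list of dicts (allele -> frequency), one per population.
    Populations are weighted equally (an assumption of this sketch).
    """
    if not pop_freqs:
        raise ValueError("at least one population is required")
    k = len(pop_freqs)
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / k
    alleles = set().union(*pop_freqs)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in pop_freqs) / k for a in alleles}
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Hypothetical two-population, two-allele example (not study data).
pop_a = {"X": 0.6, "Y": 0.4}
pop_b = {"X": 0.4, "Y": 0.6}
print(round(g_st([pop_a, pop_b]), 3))
```

In this toy case the value falls below the 0.05 threshold quoted from Wright [17], which would be read as little genetic differentiation.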
Linkage disequilibrium was assessed with the standardized multiallelic disequilibrium coefficient, D'. D' values range from 0.163 for the HLA markers DPA1-DQB1 to 0.712 for DQB1-DRB1 (Table 2). Across the 7 loci, these widely varying values average 0.285. Curiously, the physically closest markers (DPA1-DPB1, 0.011 Mb; D' = 0.398) do not present the highest value of D', which is found for DQB1-DRB1 (0.081 Mb; D' = 0.712). The correlation between distance (Mb) and D' is poor, although LD values decrease with increasing physical distance, as expected.
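For readers unfamiliar with the multiallelic D', the sketch below assumes the commonly used weighted form, D' = Σ_i Σ_j p_i q_j |D_ij| / D_max,ij, where D_ij = x_ij - p_i q_j and x_ij is the frequency of the haplotype carrying allele i at the first locus and allele j at the second. Whether Haploxt implements exactly this definition is not stated in the text, so the code should be read as an illustration; the function name and the toy haplotype frequencies are hypothetical.

```python
from collections import defaultdict


def multiallelic_d_prime(hap_freqs):
    """Weighted multiallelic D' for one pair of loci.

    `hap_freqs` maps (allele_at_locus_1, allele_at_locus_2) -> haplotype
    frequency; the frequencies are expected to sum to 1.
    """
    p = defaultdict(float)  # allele frequencies at locus 1
    q = defaultdict(float)  # allele frequencies at locus 2
    for (a, b), x in hap_freqs.items():
        p[a] += x
        q[b] += x

    total = 0.0
    for a, pa in p.items():
        for b, qb in q.items():
            d = hap_freqs.get((a, b), 0.0) - pa * qb
            if d < 0:
                d_max = min(pa * qb, (1 - pa) * (1 - qb))
            else:
                d_max = min(pa * (1 - qb), (1 - pa) * qb)
            if d_max > 0:
                total += pa * qb * abs(d) / d_max
    return total


# Hypothetical haplotype frequencies (not study data).
complete_ld = {("A1", "B1"): 0.5, ("A2", "B2"): 0.5}
no_ld = {("A1", "B1"): 0.25, ("A1", "B2"): 0.25,
         ("A2", "B1"): 0.25, ("A2", "B2"): 0.25}
print(multiallelic_d_prime(complete_ld), multiallelic_d_prime(no_ld))
```

The two toy inputs mark the ends of the scale: complete association between alleles and fully independent loci.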
In order to obtain a graphical view of the genetic similarity between São Miguel (106 individuals, 5 HLA loci) and other populations, we computed Nei's genetic distances and depicted them in Figure 1. Interestingly, São Miguel is closer to the Moroccan population than to Terceira,

Discussion
Extensive studies have been performed in several geographical areas to characterize the diversity of HLA genetic markers. These evaluations allow a better knowledge of population structure based on non-neutral markers, as well as an understanding of the influence of evolutionary processes on the overall signature of a population. These genetic data are crucial for understanding the molecular etiology and epidemiology of common diseases. In general, the data presented here corroborate previous works [3][6][7][8][9][10]. HLA markers in mainland Portugal (3 loci, -A, -B and -DRB1 [18]) and in the Azores (6 loci, -A, -Cw, -B, -DRB1, -DQA1 and -DQB1 [5]) show an average diversity of 0.92 in both populations. The results obtained in the present study, based on 7 loci, showed a smaller value (0.84). This may be explained by the fact that Spinola et al. [5] used a high-resolution methodology to genotype HLA. Because high-resolution typing distinguishes alleles such as A*0101 and A*0102 rather than grouping them as A*01, it identifies a higher number of different alleles. Nonetheless, the data show no significant differences between allele frequencies in São Miguel and Terceira islands. Considering the distribution of HLA alleles, the presence in São Miguel of -A*30 and -A*80, commonly found in sub-Saharan populations [19][20][21], supports historical records of slave settlers. In addition, the presence of alleles -B*35, -B*57 and -B*15 suggests a direct contribution of Moorish prisoners in the Azores [22][23][24]. Nevertheless, the influence of early Portuguese settlers cannot be ruled out, since allele frequencies are similar. In general, these results are corroborated by the NJ tree (Figure 1), where São Miguel shows the influence of both African and European populations.
Linkage disequilibrium is considered a good measure of population structure. According to Sanchez-Mazas [25], HLA-DPB1, located on the centromeric side of the HLA chromosomal region, does not show high values of LD with the other HLA loci. Interestingly, in the present study, the lowest values of D' observed are related to this marker. This result is explained by a region of high recombination, involving one or several hotspots, which separates HLA-DPB1 from the other HLA loci. Abecasis et al. [26] note that a value of D' = 0.33, which corresponds to a 10-fold increase in the required sample size, is commonly taken as the minimum usable amount of LD. Of the 21 possible combinations of HLA loci, 17 showed values lower than 0.33, and only 2 (Cw-B and DQB1-DRB1) showed significantly higher values (0.571 and 0.712, respectively). The HLA data reported by Meyer et al. [27] indicate significant LD between all HLA loci in around 40 populations studied worldwide. The present research did not find large D' values and corroborates the results obtained by Service et al. [28] and Branco et al. [9,10], where the Azoreans have the lowest values of LD when compared with isolated and outbred populations.
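The pair counting and the 0.33 threshold discussed above can be sketched as follows. Seven loci give 7 × 6 / 2 = 21 unordered pairs. Only the four pairwise D' values quoted in the text are included here; the remaining 17 values are in Table 2 and are not reproduced, so the filtered result is partial by construction. The function and variable names are hypothetical.

```python
from itertools import combinations

LOCI = ["A", "Cw", "B", "DRB1", "DQB1", "DPA1", "DPB1"]
USABLE_LD_THRESHOLD = 0.33  # minimum usable D' quoted from Abecasis et al. [26]

# All unordered locus pairs: 7 choose 2 = 21.
pairs = list(combinations(LOCI, 2))

# The only pairwise D' values quoted in the text (others are in Table 2).
reported = {
    ("Cw", "B"): 0.571,
    ("DQB1", "DRB1"): 0.712,
    ("DPA1", "DPB1"): 0.398,
    ("DPA1", "DQB1"): 0.163,
}


def usable_pairs(d_prime_by_pair, threshold=USABLE_LD_THRESHOLD):
    """Return the locus pairs whose D' reaches the usable-LD threshold."""
    return {pair: v for pair, v in d_prime_by_pair.items() if v >= threshold}


print(len(pairs))
print(sorted(usable_pairs(reported)))
```

Applied to the full Table 2, the same filter would be expected to return 4 pairs, given that 17 of the 21 are reported as lower than 0.33.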
HLA diversity in human populations is an important aspect of disease epidemiology, especially for autoimmune disorders such as type I diabetes, ankylosing spondylitis and celiac disease. According to Bakker et al. [29], the association of HLA alleles and/or haplotypes with disease susceptibility may be confounded by the presence of population stratification in neighboring HLA and non-HLA genomic regions. The high variability of HLA, together with the absence of genetic structure and of extensive LD demonstrated here, suggests that studies of autoimmune diseases in São Miguel islanders will need to encompass a more focused analysis of HLA extended haplotypes, as well as the evaluation of other non-HLA candidate genes.

Competing interests
The authors declare that they have no competing interests.
Authors' contributions
PRP and CCB contributed equally, by performing the experiments and the statistical analysis and drafting the manuscript. CTG and RC genotyped individuals from the patient sample and provided technical help, respectively. LMV provided scientific orientation and revised the manuscript. All authors read and approved the final manuscript.
Additional file 1
Supplemental data to Results. Details each haplotype found in São Miguel Island considering 5 HLA loci (A*-Cw*-B*-DRB1*-DQB1*), as well as its relative frequency.