Microsatellite markers for Urochloa humidicola (Poaceae) and their transferability to other Urochloa species

Background: Urochloa humidicola is a warm-season grass commonly used as forage in the tropics and is recognized for its tolerance to seasonal flooding. This grass is an important forage species for the Cerrado and Amazon regions of Brazil. U. humidicola is a polyploid species with variable ploidy (6X–9X) and facultative apomixis with high phenotypic plasticity. However, this apomixis and ploidy, as well as the limited knowledge of the genetic basis of the germplasm collection, have constrained genetic breeding activities, yet microsatellite markers may enable a better understanding of the species’ genetic composition. This study aimed to develop and characterize new polymorphic microsatellite molecular markers in U. humidicola and to evaluate their transferability to other Urochloa species.
Findings: A set of microsatellite markers for U. humidicola was identified from two new enriched genomic DNA libraries: the first library was constructed from a single sexual genotype and the second from a pool of eight apomictic genotypes selected on the basis of previous results. Of the 114 loci developed, 72 primer pairs presented a good amplification product, and 64 were polymorphic among the 34 genotypes tested. The number of bands per simple sequence repeat (SSR) locus ranged from 1 to 29, with a mean of 9.6 bands per locus. The mean polymorphism information content (PIC) of all loci was 0.77, and the mean discrimination power (DP) was 0.87. STRUCTURE analysis revealed differences among U. humidicola accessions, hybrids, and other Urochloa accessions. The transferability of these microsatellites was evaluated in four species of the genus, U. brizantha, U. decumbens, U. ruziziensis, and U. dictyoneura, and the percentage of transferability ranged from 58.33% to 69.44% depending on the species.
Conclusions: This work reports new polymorphic microsatellite markers for U. humidicola that can be used for breeding programs of this and other Urochloa species, including genetic linkage mapping, quantitative trait loci identification, and marker-assisted selection.

In Brazil, grasses of the genus Urochloa occupy 85% of the cultivated pasture areas [8]. U. humidicola is cultivated as forage in several tropical regions worldwide and is particularly recognized for its tolerance to poorly drained soils, seasonal flooding, and infertile acidic soils [9]. For this reason, this species has been widely exploited in the tropics as a forage option over other Urochloa grasses, mostly in the African savannas and similar environments, such as the Brazilian Cerrado [7].
The development and adoption of new U. humidicola cultivars with a broad genetic base are crucial for the diversification of forage pastures in the tropics, primarily because there are few cultivars of this species in Brazil (Tully, Llanero, and BRS Tupi). However, the development of new cultivars must be a dynamic process, providing cultivars with high nutritional value, increased biotic and abiotic resistance, and economic competitiveness.
Molecular markers are important tools for the progress of breeding programs, and their use would favor a more dynamic development of new cultivars of this species. However, there is a lack of information about the U. humidicola genome. Indeed, little or nothing is known about the number of genes, distribution of gene families, abundance and diversity of retro-elements, QTL localization of traits of economic importance, genome colinearity with model species, or abundance of repetitive sequences. Molecular markers are widely used in the fingerprinting of cultivars, the detection of genetic diversity, the evaluation of population structure, the mapping of genes of interest, and the selection of elite genotypes in breeding programs. SSR markers, in particular, are often used due to their codominant and multi-allelic characteristics [10]; moreover, they are highly site specific and transferable to related species [11].
Some microsatellite markers have already been developed for U. humidicola [12,13] and have been used for germplasm diversity studies [7,13]; all of them came from the same microsatellite-enriched library, constructed from genotype H016. Moreover, our research group identified four different gene pools among U. humidicola accessions; genotype H031 was found to be completely different from all other accessions, which was verified by a population structure analysis and by the fact that 18.5% of the tested markers did not amplify in this accession [7]. As a large number of markers are necessary for molecular breeding programs, our goal was to isolate and characterize new polymorphic microsatellite markers from U. humidicola genotype H031 (accession 12), to ensure that its genome was well represented by the new set of markers, and from other accessions that belong to different gene pools, and to test the transferability of these markers to four other Urochloa species (U. brizantha, U. decumbens, U. ruziziensis, and U. dictyoneura). The results were compared with previously reported data [12,13].

Methods
The plant material for library construction and marker validation was obtained from young leaves of several Urochloa genotypes. For the first library (Lb-1) construction, a single sexual genotype (H031) was used. For the second library (Lb-2) construction, a pool of eight apomictic genotypes (H010, H013, H015, H034, H041, H043, H101, and H108) was used. For marker validation, 34 genotypes were selected, consisting of 20 U. humidicola germplasm accessions, six intra-specific hybrids, and eight Urochloa accessions, as represented by two different accessions from each of the following species: U. brizantha, U. decumbens, U. ruziziensis, and U. dictyoneura. These genotypes were selected based on the four gene pools found by a previous study [7], with two genotypes selected from each gene pool. All of the accessions used are from the Urochloa germplasm collection maintained at Embrapa Beef Cattle, Campo Grande, MS, Brazil. They were personally identified by S. A. Renvoize, from the Royal Botanic Gardens, Kew, UK, and their identity was confirmed by C. B. do Valle when they were transferred to Brazil [9]. The annotation numbers, accession numbers (as recorded in Embrapa Beef Cattle (EBC) and Center for Tropical Agriculture (CIAT)), genotypes, and species identifications are shown in Table 1. Genomic DNA was extracted from freeze-dried leaf samples using the CTAB method [14]. The DNA samples were evaluated on a 1% agarose gel and quantified by comparison to known quantities of uncut λ phage DNA (Invitrogen, Carlsbad, CA, USA).
Genomic DNA was restriction digested with Afa I (Invitrogen), enriched in microsatellite fragments using (CT)8 and (GT)8 probes, and then used to construct a microsatellite-enriched library following the protocol of Billotte et al. [15]. The enriched microsatellite fragments were cloned into pGEM-T (Promega, Madison, WI), and the ligation products were used to transform Escherichia coli XL1-Blue competent cells. All 94 clones from both libraries were sequenced with an ABI 377 automated sequencer (Applied Biosystems, Foster City, CA) using the BigDye terminator cycle sequencing kit (Applied Biosystems, Foster City, CA).
The microsatellites were identified using MISA software [16]. Only mono-nucleotides with twelve or more repeats, di-nucleotides with six or more repeats, trinucleotides with four or more repeats, and tetra-, penta-, and hexa-nucleotides with three or more repeats were considered. Primer pairs were designed using the Primer Select 5.01 software (DNASTAR Inc.) and the Primer3Plus software [17]. Polymerase chain reactions (PCRs) were carried out as previously described [12]. The amplification products were resolved by electrophoresis through 3% agarose gels prior to vertical electrophoresis through 6% denaturing polyacrylamide gels. The gels were then silver stained [18], and the product sizes were determined by comparison to a 10-bp DNA ladder (Invitrogen, Carlsbad, CA).
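The repeat-count thresholds used for microsatellite identification can be illustrated with a minimal sketch. This is not MISA itself: it is a simplified regular-expression search for perfect repeats only, and the function name `find_ssrs` and the example sequences are hypothetical. Only the minimum repeat numbers are taken from the text above.

```python
import re

# Minimum repeat counts per motif length, as stated in the Methods:
# mono >= 12, di >= 6, tri >= 4, tetra/penta/hexa >= 3.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Simplified illustration only; compound and interrupted repeats,
    which dedicated tools also report, are not handled here.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "CTCT"),
            # so a (CT)n tract is not also reported as a tetra-nucleotide.
            is_redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if not is_redundant:
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical clone insert containing a (CT)8 tract.
    print(find_ssrs("GATC" + "CT" * 8 + "GGAAAAAATTG"))
```

In this sketch a di-nucleotide tract is reported once under its shortest motif, which mirrors how repeat classes are tallied in Table 2, although the exact tallying rules of the original analysis are not stated in the text.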
Polyploid microsatellite genotyping is difficult because of the closeness of fragment sizes, the presence of stutter peaks, and allele overlap caused by multiple alleles of the same size. A few methods have been developed to overcome allele overlap and estimate allele frequencies, such as the estimation of alleles based on electropherogram peak ratios [19] or the statistical estimation of allele frequencies [20]. However, the present study was restricted to describing the new SSR markers, which were visually scored based on the presence (1) or absence (0) of a band in the polyacrylamide gels for each of the Urochloa genotypes. PIC (Polymorphism Information Content) [21] and DP (Discrimination Power) [22] values were calculated to estimate polymorphism at each locus.
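How presence/absence scores can be turned into PIC and DP values is sketched below. The text cites [21] and [22] without stating the formulas, so the sketch rests on two assumptions: PIC follows the widely used form 1 − Σp_i² − Σ_{i<j} 2p_i²p_j², and DP follows the form 1 − Σ p_i(N·p_i − 1)/(N − 1), with p_i taken here as the frequency of the i-th banding pattern among N genotypes. The function names and the four-genotype example are hypothetical.

```python
from collections import Counter


def pattern_frequencies(band_matrix):
    """band_matrix: one tuple of 0/1 band scores per genotype, for one locus.

    Returns (list of banding-pattern frequencies, number of genotypes).
    """
    counts = Counter(tuple(row) for row in band_matrix)
    n = len(band_matrix)
    return [c / n for c in counts.values()], n


def pic(freqs):
    """Assumed PIC form: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    sq = [p * p for p in freqs]
    cross = sum(
        2 * sq[i] * sq[j]
        for i in range(len(sq))
        for j in range(i + 1, len(sq))
    )
    return 1 - sum(sq) - cross


def discrimination_power(freqs, n):
    """Assumed DP form: 1 - sum(p_i * (n * p_i - 1) / (n - 1))."""
    return 1 - sum(p * (n * p - 1) / (n - 1) for p in freqs)


if __name__ == "__main__":
    # Hypothetical locus with two bands scored in four genotypes.
    scores = [(1, 0), (1, 0), (0, 1), (1, 1)]
    freqs, n = pattern_frequencies(scores)
    print(round(pic(freqs), 4), round(discrimination_power(freqs, n), 4))
```

Whether the original analysis used band frequencies or banding-pattern frequencies is not stated; the choice affects the values, so this sketch should not be used to reproduce Table 3.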
The microsatellite scores for the 34 individuals were evaluated using a model-based method with a Bayesian clustering approach in STRUCTURE software version 2.2 [23-25]. The admixture model was tested with 200,000 replicates for burn-in and 100,000 replicates for Markov Chain Monte Carlo (MCMC) processes through ten iterations (runs). Values of K (the number of clusters) from 2 to 20 were tested. The optimal number of clusters was estimated using the ΔK value, as previously described [26], and the final graphs were visualized using the STRUCTURE HARVESTER software [27]. The individuals were grouped into clusters according to the association coefficient (Q), that is, the proportion of each allelic pool in an individual.
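The ΔK criterion used to choose the number of clusters can be sketched as follows. The sketch assumes the commonly used definition, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean log-likelihood over replicate runs at a given K and sd is the standard deviation across those runs; the text only cites [26] for the method. The function names and the log-likelihood values are hypothetical and are not results of this study.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """lnp: dict mapping K -> list of log-likelihoods, one per replicate run.

    Returns a dict K -> delta K for every K that has both neighbours
    (K - 1 and K + 1) and a non-zero standard deviation across runs.
    """
    ks = sorted(lnp)
    m = {k: mean(lnp[k]) for k in ks}
    s = {k: stdev(lnp[k]) for k in ks}
    out = {}
    for k in ks:
        if k - 1 in m and k + 1 in m and s[k] > 0:
            out[k] = abs(m[k + 1] - 2 * m[k] + m[k - 1]) / s[k]
    return out


def best_k(lnp):
    """Return the K with the largest delta K."""
    dk = delta_k(lnp)
    return max(dk, key=dk.get)


if __name__ == "__main__":
    # Hypothetical log-likelihoods for two runs at each K.
    runs = {
        2: [-1000.0, -1002.0],
        3: [-900.0, -902.0],
        4: [-700.0, -702.0],
        5: [-690.0, -692.0],
        6: [-685.0, -687.0],
    }
    print(best_k(runs))
```

Because ΔK needs both neighbours of K, it is undefined at the smallest and largest K tested, so with K tested from 2 to 20 it can only be evaluated for K = 3 to 19.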
A joint analysis (Lb-c) was performed with the data from the polymorphic loci derived from the new libraries Lb-1 and Lb-2. Data from a previous study [12] that used SSRs developed from accession 9 (H016) were used to compare the three libraries. The data were reanalyzed under the same parameters as those used for the new libraries, resulting in Lb-3. Another joint analysis (Lb-ct) was performed with data from the three libraries together (Lb-1, Lb-2, and Lb-3). The results obtained by STRUCTURE software were permuted by CLUMPP software [28], and the figures were generated using DISTRUCT software [29].

Results
Microsatellite enrichment success for the U. humidicola DNA libraries was 79.0% for Lb-1 and 61.2% for Lb-2. From all of the sequenced clones, 183 microsatellites were identified. Di-nucleotide repeats were the most abundant class of microsatellites detected, representing 76.4% and 72.7% of the loci for Lb-1 and Lb-2, respectively, followed by mono-nucleotide and tetra-nucleotide repeats. Perfect microsatellites were the most abundant (Table 2). Of the 114 SSR primer pairs designed and tested, 72 were successfully amplified in U. humidicola genotypes, and 64 SSRs were polymorphic. A description of the number of alleles per locus and PIC and DP values for both the U. humidicola accessions and Urochloa accessions is presented in Table 3. The loci BhUNICAMP68 to BhUNICAMP108 are derived from Lb-1, and the loci BhUNICAMP109 to BhUNICAMP139 are derived from Lb-2. Based on the allelic frequencies estimated by STRUCTURE software, 36.43% of the alleles are rare (frequency < 0.05), 60.06% are intermediate alleles (0.05 < frequency < 0.30), and 3.50% are abundant alleles (frequency > 0.30).
A survey of the potential transferability of the microsatellite markers from U. humidicola to other Urochloa species identified that 61.11% of the 72 markers resulted in amplified PCR products in at least one U. brizantha genotype, 58.33% were amplified in U. decumbens, 59.72% were amplified in U. ruziziensis, and 69.44% were amplified in U. dictyoneura. The number of successfully amplified genotypes per number of genotypes tested per species is shown in Table 4.
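The transferability percentages above follow from counting the markers that amplified in at least one genotype of a species and dividing by the 72 markers tested. A minimal sketch of that calculation is given below; the function name and the marker labels are hypothetical, and the count of 44 markers for U. brizantha is back-calculated from the reported 61.11% rather than taken from Table 4.

```python
def transferability(amplified):
    """amplified: dict marker -> list of booleans, one per genotype of a species.

    Returns the percentage of markers that amplified in at least one
    genotype, rounded to two decimal places.
    """
    hits = sum(1 for calls in amplified.values() if any(calls))
    return round(100 * hits / len(amplified), 2)


if __name__ == "__main__":
    # Hypothetical table: 72 markers scored in two genotypes of one species,
    # 44 of which amplified in at least one genotype.
    table = {f"marker{i:03d}": [i < 44, False] for i in range(72)}
    print(transferability(table))
```

Under the same assumption, 42, 43, and 50 amplifying markers out of 72 correspond to the 58.33%, 59.72%, and 69.44% reported for U. decumbens, U. ruziziensis, and U. dictyoneura, respectively.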
The population structure analysis based on SSR allelic data showed differentiation among the U. humidicola accessions, hybrids, and other Urochloa species. The STRUCTURE analysis for Lb-1 and Lb-2 and the joint analysis of data from both libraries (Lb-c) showed K = 18, K = 17, and K = 17 allelic pools, respectively, with each one represented by a different color in Figure 1. Clusters I to V were composed of U. humidicola accessions. Cluster VI was composed of two U. humidicola accessions (accessions 9 and 12) and six hybrids derived from a controlled cross between these two accessions. The other Urochloa species were grouped into Clusters VII and VIII for Lb-1 and Lb-c and in Cluster VII for Lb-2.
The STRUCTURE analysis for Lb-3 and Lb-ct showed K = 15 and K = 18 allelic pools, respectively (Figure 1), classified in the same clusters as for Lb-1 and Lb-c.

Discussion
In the present study, we described 72 new SSRs for U. humidicola, 64 of which are polymorphic. Along with the 67 previously developed SSRs [12,13], these markers contribute to the genetic breeding of this species and of other species of the genus Urochloa, supporting efforts to obtain new cultivars and to better understand the genetics of the species through genetic mapping, marker-assisted selection, genome sequencing, and synteny analysis.
The increased occurrence of di-nucleotide motifs for Lb-1 and Lb-2 is in accordance with the enrichment of both libraries with (CT)8 and (GT)8 probes. In addition, Morgante et al. [30] reported a higher occurrence of microsatellites with di-nucleotide motifs in plants, which may have been a contributing factor in our observation.
Among the microsatellites analyzed, 88.9% (64 of 72) were polymorphic among the evaluated genotypes. The most informative loci in this panel of SSRs were those with the highest PIC and DP values (BhUNICAMP075 and BhUNICAMP107). Locus BhUNICAMP094 showed the lowest values for PIC and DP, at 0.3165 and 0.3969, respectively, even though it was amplified in all the Urochloa species evaluated. This also occurred with the BhUNICAMP030 locus [12]. Both loci may be useful markers for studies in Urochloa because they may lie in a region that is conserved among the species studied herein. Monomorphic loci may be useful in other studies.
The transferability rates of the loci from U. humidicola to the four other species were very similar. Although these results were not highly variable, U. dictyoneura presented the highest transferability, corroborating the genetic closeness between U. dictyoneura and U. humidicola that has been previously described [2,31] and agreeing with the results obtained in another study [13].
For the population structure analysis, different numbers of allelic pools (K) were observed among the analyses. However, the composition of individuals in each cluster was maintained in the Lb-1, Lb-c, Lb-3, and Lb-ct analyses, whereas in the Lb-2 analysis, Clusters VII and VIII were merged into Cluster VII.
The genotypes of the species U. brizantha, U. decumbens, and U. ruziziensis were grouped into the same cluster in all the analyses. However, the U. dictyoneura genotypes were grouped separately from the other species in all the analyses except Lb-2, in which the four species grouped into Cluster VII. In all analyses, Cluster VI included accessions 9 and 12 together with the six hybrids derived from crosses between these two accessions. However, in a previous study, the progenitors did not group together with the hybrids [13], as only runs from K = 1 to K = 10 were performed. These hybrids are part of an F1 population that is being mapped with the SSRs described in this study and those previously published [12,13].
Although some discrepancies were found among the three libraries (Lb-1, Lb-2, and Lb-3), the set of loci belonging to each was able to satisfactorily differentiate the accessions. Of the three libraries, Lb-1 presented the highest number of allelic pools, which may be related to the use of accession H031, a highly divergent genotype, as previously described [7]. The genotype used for the enriched library construction therefore directly influences the results. The joint analysis of the three libraries (Lb-ct) is the recommended way to differentiate among accessions, because it uses loci derived from many different genotypes, which confers greater reliability on the observed results.
These markers are immediately useful for U. humidicola breeding programs, aiding in areas such as the construction of linkage and QTL maps, gene flow and mating system evaluation, and marker-assisted selection.

Availability of supporting data
The datasets supporting the results of this article are included in the article.