A method for genotyping elite breeding stocks of leaf chicory (Cichorium intybus L.) by assaying mapped microsatellite marker loci

Background Leaf chicory (Cichorium intybus subsp. intybus var. foliosum L.) is a diploid plant species (2n = 18) of the Asteraceae family. The term “chicory” specifies at least two types of cultivated plants: a leafy vegetable, which is highly differentiated with respect to several cultural types, and a root crop, whose current industrial utilization primarily addresses the extraction of inulin or the production of a coffee substitute. The populations grown are generally represented by local varieties (i.e., landraces) with high variation and adaptation to the natural and anthropological environment where they originated, and have been yearly selected and multiplied by farmers. Currently, molecular genetics and biotechnology are widely utilized in marker-assisted breeding programs in this species. In particular, molecular markers are becoming essential tools for developing parental lines with traits of interest and for assessing the specific combining ability of these lines to breed F1 hybrids. Results The present research deals with the implementation of an efficient method for genotyping elite breeding stocks developed from old landraces of leaf chicory, Radicchio of Chioggia, which are locally dominant in the Veneto region, using 27 microsatellite (SSR) marker loci scattered throughout the linkage groups. Information on the genetic diversity across molecular markers and plant accessions was successfully assessed along with descriptive statistics over all marker loci and inbred lines. Our overall data support an efficient method for assessing a multi-locus genotype of plant individuals and lineages that is useful for the selection of new varieties and the certification of local products derived from Radicchio of Chioggia. 
Conclusions This method proved to be useful for assessing the observed degree of homozygosity of the inbred lines as a measure of their genetic stability; it also allowed an estimate of the specific combining ability (SCA) between maternal and paternal inbred lines on the basis of their genetic diversity and the predicted degree of heterozygosity of their F1 hybrids. This information could be exploited for planning crosses and predicting plant vigor traits (i.e., heterosis) of experimental F1 hybrids on the basis of the genetic distance and allelic divergence between parental inbred lines. Knowing the parental genotypes would allow us not only to protect newly registered varieties but also to assess the genetic purity and identity of the seed stocks of commercial F1 hybrids, and to certify the origin of their food derivatives. Electronic supplementary material The online version of this article (doi:10.1186/s13104-015-1819-z) contains supplementary material, which is available to authorized users.


Background
Cichorium (Cichorium intybus subsp. intybus var. foliosum L.) comprises diploid plant species (2n = 18) belonging to the Asteraceae family, subfamily Cichorioideae, tribe Lactuceae or Cichorieae. These species are biennial or, in the wild, perennial [1]. They are naturally allogamous due to an efficient sporophytic self-incompatibility system. In addition, outcrossing is promoted by a floral morpho-phenology unfavorable to selfing in the absence of pollen donors (i.e., proterandry, wherein the anthers mature before the pistils) and by a favorable competition of allo-pollen grains and tubes (i.e., pollen that is genetically different from the auto-pollen produced by the seed parent) [2]. Long appreciated as medicinal plants by the ancient Greeks and Romans, Cichorium spp. are currently among the most important cultivated vegetable crops. They are generally used as components in fresh salads or, more rarely, cooked according to local traditions and alimentary habits [1].
Lacking comprehensive, homogeneous, sufficiently detailed, and univocal data on horticultural production and trade, it is difficult to give reliable figures on the diffusion and economic importance of this crop in Europe, where it is predominantly grown [1]. In recent statistics on the European market [3], chicory is often included under the general heading "salads" or considered together with lettuce, which is by far the most important leafy vegetable on both a European and worldwide scale. On the basis of accessible data, however, it is possible to determine that chicory is produced almost exclusively by Belgium, France, Italy, and the Netherlands [3]. Although chicory does not contribute greatly to each country's total agricultural income, the north-eastern regions of Italy account for 87 % of the national acreage and 84 % of the national production of the red or variegated chicory known as "Radicchio", which traditionally includes all the cultivar groups with leaf commercial products. This particular type of chicory is now receiving greater attention in Europe and the USA, where its cultivation began several years ago, and is becoming increasingly subjected to evaluation because its red or variegated leaves are appreciated as a component of ready-to-eat salads [1].
The materials grown are generally represented by actively cultivated local populations (i.e., landraces) with high variation and adaptation to the natural and anthropological environment where they originated, and have been yearly selected and multiplied by farmers [1]. These populations are maintained by farmers through phenotypical selection based on their own criteria and occasionally on the exploitation of controlled hybridizations of different types to obtain recombinant genotypes that exhibit superior agronomic and commercial traits [4]. However, conventional plant breeding methods for hybridizing and selecting plants on the basis of observed phenotypes are not the only methods used by plant breeders. Currently, molecular genetics and biotechnology are widely utilized in breeding programs of the vast majority of crop plant species. Indeed, molecular markers are nowadays essential tools to select pure or inbred lines with qualitative traits of agricultural interest by marker-assisted selection (MAS) and also to predict the specific combining ability of parental lines to breed F1 hybrid varieties in marker-assisted breeding (MAB) schemes. In leaf chicory, molecular markers can find utility for assessing the degree of homozygosity of parental inbred lines, as a measure of their genetic stability, and also for predicting the degree of heterozygosity of their F1 hybrids, as an estimate of the specific combining ability on the basis of the genetic diversity between maternal and paternal inbred lines.
Here we describe a method for genotyping elite breeding stocks (inbred lines) using microsatellites, or SSR (simple sequence repeat) markers. Among the different PCR-derived molecular systems, SSR markers are suitable for population genetics studies and marker-assisted selection programs because they have several desirable features, such as a high level of reproducibility, co-dominant inheritance, and no need for high-throughput technology [5]. Moreover, they provide loci for simple and accurate individual typing in any species and display a high level of polymorphism and widespread distribution in the genomes [6]. Unlike other single-locus marker systems, such as SNP (single nucleotide polymorphism) markers, microsatellite-based markers require much less preliminary genomic information and bioinformatic characterization for their exploitation in a given species [7].
Molecular marker techniques, including amplified fragment length polymorphism (AFLP) and random amplified polymorphic DNA (RAPD), were applied on a large scale to construct the first genetic linkage maps of C. intybus [8,9]. In 2010, a new genetic linkage map for C. intybus was constructed by using SSR markers [10]. This consensus genetic map, which includes nine homologous linkage groups (LGs), was obtained after the integration and ordination of the molecular marker data deriving from one witloof chicory and two industrial chicory progenies.
The aim of our study was to develop a method for the genetic characterization of elite inbred lines of the "Red of Chioggia" chicory using mapped SSR markers, with a particular emphasis on the assessment of the genetic stability within lines (i.e., observed degree of homozygosity) and the genetic diversity between paternal and maternal lines (i.e., expected degree of heterozygosity of their F1 hybrids).
Information derived from the application of this method should then be exploited for planning crosses and predicting plant vigor traits (i.e., heterosis) of experimental F1 hybrids of leaf chicory on the basis of the genetic distance and allelic divergence between parental inbred lines. Knowing the parental genotypes would allow us not only to protect newly registered varieties but also to assess the genetic purity and identity of the seed stocks of commercial F1 hybrids, and to certify the origin of their food derivatives.
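The idea of predicting F1 heterozygosity from parental genotypes can be illustrated with a minimal sketch. It assumes diploid, co-dominant SSR genotypes stored as pairs of allele sizes and random segregation of the two alleles of each parent; the function names, locus IDs and allele sizes are hypothetical, and this is not necessarily the calculation used in the study.

```python
def gamete_freqs(genotype):
    """Allele frequencies among the gametes of one diploid parent at one locus."""
    freqs = {}
    for allele in genotype:
        freqs[allele] = freqs.get(allele, 0.0) + 0.5
    return freqs


def f1_heterozygosity(seed_parent, pollen_parent):
    """Mean probability, over shared loci, that an F1 plant is heterozygous.

    Each parent is a dict mapping a locus ID to a pair of alleles.
    """
    loci = sorted(set(seed_parent) & set(pollen_parent))
    if not loci:
        raise ValueError("parents share no genotyped loci")
    total = 0.0
    for locus in loci:
        maternal = gamete_freqs(seed_parent[locus])
        paternal = gamete_freqs(pollen_parent[locus])
        # Probability that both gametes carry the same allele
        p_same = sum(p * paternal.get(a, 0.0) for a, p in maternal.items())
        total += 1.0 - p_same
    return total / len(loci)


# Hypothetical genotypes (allele sizes in bp) at three illustrative loci
seed = {"locus1": (150, 150), "locus2": (200, 204), "locus3": (180, 180)}
pollen = {"locus1": (154, 154), "locus2": (200, 200), "locus3": (180, 180)}
predicted = f1_heterozygosity(seed, pollen)
```

In this toy cross, the parents are fixed for different alleles at the first locus, share one allele at the second, and are identical at the third, so the predicted value is the mean of 1.0, 0.5 and 0.0.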
Basic genetic variation and differentiation statistics computed for single locus, across LGs and plant accessions are presented and discussed. Overall data support an efficient method for assessing a multi-locus genotype of plant individuals and lineages, which can also be combined with pedigree notes on a panel of morpho-phenological traits for breeding F1 hybrid varieties in the Radicchio of Chioggia biotype.

Results
PCR-based amplifications of the genomic DNA samples from all inbred lines were performed to assay 27 mapped loci (information on primer pairs is reported in Additional file 1).
Descriptive statistics over all the SSR loci, along with information on the genetic diversity found across the molecular markers and plant accessions, are reported in Tables 1 and 2, respectively. The mean number of observed marker alleles (n_a) in the SSR loci assayed was 5.9, varying from 3.3 in LG5 and LG7 to 9.3 in LG2 (Table 1). The frequency of the most common marker allele (p_i) proved to be low when the observed number of marker alleles was high and vice versa (for instance, in LG2 and LG5, where the average p_i was 0.386 and 0.700, respectively). At the same time, both the expected heterozygosity (He) and Shannon's information index of phenotypic diversity (I) were estimated to be high when the observed number of marker alleles was high (for instance, in LG2, where the average He and I were 0.749 and 1.610, respectively) (for additional statistics, see Table 1).
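The relationship among these per-locus statistics can be made concrete with a short sketch. It assumes allele counts for a single locus and uses the plain textbook forms (He = 1 − Σp², I = −Σp ln p, n_e = 1/Σp²); the sample-size correction behind the Levene estimate reported in Table 1 is not applied here, and the function name and example counts are hypothetical.

```python
import math


def diversity_stats(allele_counts):
    """Return (p_max, n_e, He, I) for one locus from a dict allele -> count."""
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("no alleles counted")
    freqs = [c / total for c in allele_counts.values() if c > 0]
    sum_sq = sum(p * p for p in freqs)
    p_max = max(freqs)                                # most common allele frequency
    n_e = 1.0 / sum_sq                                # effective number of alleles
    he = 1.0 - sum_sq                                 # uncorrected expected heterozygosity
    shannon = -sum(p * math.log(p) for p in freqs)    # Shannon's index I
    return p_max, n_e, he, shannon


# Hypothetical locus with three alleles (sizes in bp)
p_max, n_e, he, shannon = diversity_stats({150: 2, 154: 1, 158: 1})
```

With more alleles at even frequencies, p_max falls while He and I rise, which is the pattern described for LG2 versus LG5.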
The observed homozygosity scores were high, as expected for inbred lines, with a mean estimate of 0.793 (st. dev. = 0.120), and ranged from 0.521 to 0.993. The marker loci M3.8, M4.11b and M8.22, with observed heterozygosity values of 0.007, 0.017 and 0.021 (i.e., homozygosity rates of 0.993, 0.983 and 0.979), respectively, greatly contributed to this average homozygosity. Wright's inbreeding coefficients (F-statistics) for single marker loci were also computed (Table 1). The inbreeding coefficient calculated for individual accessions was negative, on average equal to F_is = −0.125, as shown in Table 1. This feature was shared by 22 of the 27 SSR marker loci investigated, and it was particularly evident for the marker locus M7.19, which scored a very low observed homozygosity of 0.521. Values of Wright's fixation index, which were computed for each locus across LGs, are reported in Table 1. The average value was F_st = 0.659.
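A negative inbreeding coefficient follows directly from its standard definition, F_is = 1 − Ho/He, whenever the observed heterozygosity exceeds the expected one. The sketch below shows this, together with the standard identity linking the three F-statistics, (1 − F_it) = (1 − F_is)(1 − F_st). The function names and the heterozygosity values in the example are illustrative assumptions, and averages over loci need not satisfy the identity exactly.

```python
def f_is(h_obs, h_exp):
    """Wright's inbreeding coefficient within lines: 1 - Ho / He."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


def f_it_from(f_is_value, f_st_value):
    """Overall coefficient from (1 - F_it) = (1 - F_is) * (1 - F_st)."""
    return 1.0 - (1.0 - f_is_value) * (1.0 - f_st_value)


# Hypothetical locus where observed heterozygosity exceeds the expected value
example_f_is = f_is(h_obs=0.25, h_exp=0.20)

# Combining the average values reported in the text (F_is = -0.125, F_st = 0.659)
example_f_it = f_it_from(-0.125, 0.659)
```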
Estimates of gene flow (N_m) were also computed for each locus (Table 1). The calculated N_m values were only slightly greater than zero for the vast majority of the assayed marker loci, ranging from a minimum of 0.001 to a maximum of 0.352, with an average value of N_m = 0.278.
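The text does not state how N_m was derived. A commonly used approximation, assumed here purely for illustration, is Wright's island-model estimate N_m = 0.25 (1 − F_st) / F_st. The sketch below applies it to a single F_st value; the function name is hypothetical, and an average of per-locus N_m values (as reported above) will generally differ from N_m computed from an average F_st.

```python
def gene_flow(f_st):
    """Island-model estimate of gene flow: Nm = 0.25 * (1 - Fst) / Fst."""
    if not 0.0 < f_st <= 1.0:
        raise ValueError("Fst must be in the interval (0, 1]")
    return 0.25 * (1.0 - f_st) / f_st


# Applied to the average fixation index reported in the text
nm_from_mean_fst = gene_flow(0.659)
```

High differentiation among lines (F_st close to 1) drives this estimate toward zero, consistent with inbred lines that exchange few or no genes.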
Regarding the descriptive statistics over all the accessions, the number of polymorphic loci among individuals within inbred lines varied from 5 (18.5 %) to 23 (85.2 %) out of the total of 27 marker loci (Table 2). Simple matching coefficients scored values greater than 0.900 in all the accessions and were equal to 1.000 in eight of them. The observed homozygosity was on average high, with values greater than 0.800 in the majority of the inbred lines (ranging between 0.539 and 0.898). Conversely, the expected heterozygosity was on average low, varying from 0.098 to 0.433 (Table 2).
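As an illustration of the simple matching coefficient, the following sketch assumes that each individual is scored as a binary vector (presence or absence of each marker allele) and computes the proportion of matching positions, then averages it over all pairs of individuals within a line. The function names and toy profiles are hypothetical.

```python
from itertools import combinations


def simple_matching(x, y):
    """Proportion of positions at which two binary allele profiles agree."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)


def mean_genetic_similarity(profiles):
    """Average simple matching coefficient over all pairs of individuals."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("at least two individuals are required")
    return sum(simple_matching(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical individuals of one inbred line, four allele classes each
line = [[1, 0, 1, 1], [1, 0, 0, 1], [1, 0, 1, 1]]
mgs = mean_genetic_similarity(line)
```

A line whose individuals all share the same profile scores exactly 1.000, as observed for eight of the accessions.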
Principal coordinate analysis allowed for the definition of centroids for all the lines. The first two principal components explained 30.37 % of the total genetic variation found within the analyzed lines. Specifically, the first and second components explained more than 18 % and about 12 % of the total genetic diversity, respectively.
Each inbred line, if regarded as a single centroid determined according to the mean genetic similarity (MGS) estimates, can be discriminated from the others on the basis of genotyping data, as shown in Fig. 1. A clear subgrouping of inbred lines as single centroids was obtained in the four main quadrants when the individuals belonging to each accession were plotted two-dimensionally according to the principal coordinates.
A neighbor-joining (NJ) tree was also constructed on the basis of the genetic dissimilarity matrix whose mean coefficients were computed between all possible pairwise combinations of inbred lines of the core collection (Fig. 2). Two main subgroups of branches were generated, each including about half of the inbred lines. Moreover, one of these subgroups was further split into additional sub-nodes and two well-defined clusters wherein several inbred lines could be ordered. A few inbred lines were positioned apart from the main tree (Fig. 2).
The population structure was investigated using the ΔK method [11], which revealed two levels of genetic grouping for the inbred lines (Additional file 2). When the number of accession units (K) was set to three (ΔK = 25), as many as 22 (59 %) of the inbred lines assayed in this study were grouped in a single main cluster, while two additional small clusters were formed, each represented by six (16 %) inbred lines. Only four (11 %) genotypes displayed an admixed ancestry (i.e., membership <70 %), as expected with the occurrence of genetic recombination and hybridization.
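For readers unfamiliar with the ΔK criterion, a minimal sketch is given below. It assumes replicate log-likelihood values ln P(D) for each K and follows one common reading of the method: the absolute second-order difference of the mean log-likelihoods, divided by the standard deviation across replicates at K. The function names and the toy values are hypothetical; the 70 % membership threshold for admixed ancestry is the one stated in the text.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Return {K: deltaK} from a dict K -> list of replicate ln P(D) values."""
    means = {k: mean(values) for k, values in ln_prob.items()}
    result = {}
    for k in sorted(ln_prob):
        if k - 1 not in means or k + 1 not in means:
            continue  # deltaK is undefined at the smallest and largest K
        spread = stdev(ln_prob[k])
        if spread == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result


def is_admixed(memberships, threshold=0.70):
    """True when no single cluster reaches the membership threshold."""
    return max(memberships) < threshold


# Hypothetical replicate log-likelihoods for K = 1..4
toy = {1: [-1000, -1002], 2: [-900, -902], 3: [-890, -894], 4: [-889, -891]}
scores = delta_k(toy)
best_k = max(scores, key=scores.get)
```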
A second level of genetic structure was investigated within groups of inbred lines (Additional file 2) by setting the number of accession units (K) to 24 (ΔK = 19). In this case, the whole collection of inbred lines was fragmented into clusters composed of one or a few accessions (Fig. 3). In particular, the inbred lines S02, S03 and P04 shared high membership values to a single cluster, while the inbred lines SE111S6, SE111S7, SEG1111, and Z33, S31S3, S31S3.2, S31S3.4 were divided into two well-defined clusters. Four additional clusters were formed by pairs of inbred lines sharing high membership to the same group. Overall, the proportion of individual genotypes that displayed admixed ancestry with membership to multiple clusters was equal to 8 %.

Table 1 Descriptive statistics of the SSR marker loci
The sample size of individual genotypes (N), the frequency of the most common marker allele (p_i), estimates of Shannon's information index of phenotypic diversity (I), the average number of observed alleles (n_a) and the effective number of alleles (n_e) per locus, the observed heterozygosity (Ho), the expected heterozygosity computed using Levene (He), the average heterozygosity (Ha), Wright's inbreeding coefficients F_is and F_it, the fixation index (F_st), and gene flow (N_m).

Discussion
In this study, we developed a method for genotyping elite breeding stocks of leaf chicory (Cichorium intybus L.) by assaying microsatellite marker loci selected for the linkage map position and polymorphism information content. The genetic identity and stability of 37 inbred lines and the extent of their genetic diversity, expressed as genetic distance and allelic divergence, were addressed by using a panel of neutral SSR markers. Furthermore, … LGs) proved to be very informative for the study of the population structure and inbreeding level of C. intybus plant materials. Along with estimates of observed homozygosity, the genetic stability of each breeding stock was addressed by computing Shannon's estimates of phenotypic diversity and Rohlf's coefficients of genetic similarity. On the whole, these statistics indicated strong genetic uniformity for the investigated inbred lines, as expected for breeding stocks developed by selfing and full-sibling programs. Furthermore, a marked inbreeding for the majority of the accessions was supported by the high homozygosity observed in nearly all the inbred lines (the mean value was as high as 78.7 %). Also, Wright's fixation index indicated that the genetic differentiation between the inbred lines is high (approximately 63 %) and that one-third of the genetic variation (approximately 33 %) occurs within the inbred lines, due not only to homozygosity for different marker alleles but also to heterozygosity at some of the marker loci. Our data demonstrate that most of the genetic differentiation occurs among inbred lines; thus, each breeding stock can be considered as genetically uniform and distinguished from the others of the core collection.

Table 2 Descriptive statistics over all the accessions
The number of polymorphic loci (nPl), their frequency presented as percentage (%) of polymorphic loci on a total of 27 assayed, estimates of Shannon's information index of phenotypic diversity (I), the genetic similarity coefficient or simple matching coefficient (SM), the average number of observed alleles (n_a) and the effective number of alleles (n_e), the observed heterozygosity (Ho), the expected heterozygosity computed using Levene (He), the average heterozygosity (Ha), Wright's inbreeding coefficients F_is and F_it, and the fixation index (F_st).
It is worth mentioning that individual inbreeding coefficients, as an estimate of the strength of inbreeding for single inbred lines, were shown to be low or negative, indicating that the observed heterozygosity was greater than expected. Maintenance of such levels of heterozygosity in spite of inbreeding reproductive strategies (i.e., selfing, full-sibling and back-crossing) could be a consequence of the reproductive system of C. intybus, which is naturally characterized by a high frequency of allogamy as a result of self-incompatibility. We may also speculate that a fraction of the observed heterozygosity could be a consequence of phenotypic selection (i.e., of morphologically superior individuals) operated by breeders during inbreeding programs.

All mapped SSR markers exploited in this multi-locus DNA genotyping method scored high polymorphism information content, with the exception of marker M8.22, which revealed an almost monomorphic condition. Despite its very low or null discriminating ability, this marker locus was taken into account because it showed an allele-specific genotype, which is typical of leaf chicory and allows Radicchio to be distinguished from other C. intybus types (e.g., Witloof).
Regarding the NJ clustering results, the inbred lines known to be genetically related (i.e., inbred lines that originated from the same local variety) proved to form very well-defined subgroups of the tree (e.g., accessions SE111 and S31). The STRUCTURE analysis of the population of genotypes (for K = 25) revealed clusters of single individuals in agreement with the grouping of inbred lines shown by the NJ tree analysis. In fact, inbred lines P04, S02 and S03 were grouped in the same cluster of ancestry and associated with the same node of the tree. This is also true for several pollen donor lines, such as SEG111, SE111S6 and SE111S7, and Z34 and SC24. In addition, the inbred lines belonging to the minor clade of the NJ tree were all grouped in different STRUCTURE clusters (see Figs. 2, 3 for details). As for the seed parent lines (13, 20, 11, 38, 86, 17 and 49), they were grouped into four different clusters, as expected on the basis of their variety of origin. Interestingly, the inbred line 13, which showed an admixed ancestry, originates from a specific introgression and backcross program.

Fig. 1 The centroids of all the inbred lines expressed as MGS (mean genetic similarity) estimates plotted according to the first two main components. The red triangles refer to the seed parents, whereas the yellow dots indicate the pollen donors. The difference in size is related to the genetic variability found within each accession, represented by the standard deviation of the simple matching (SM) coefficient.
PCA allowed for the definition of centroids for all the inbred lines. The first component was positively associated with cycle length, discriminating long cycle accessions from short cycle accessions, with only a very few exceptions per class. It is also worth mentioning that the centroids of inbred lines with a common origin could be plotted in different areas of the quadrants. This finding suggests that it is possible to develop and select genotypically different inbred lines (i.e., homozygous for different alleles at the same loci) starting from individuals selected within a given local variety of leaf chicory, namely Radicchio of Chioggia.

Conclusions
Our research deals with the implementation and validation of a multi-locus genotyping system in leaf chicory that may prove to be useful for the marker-assisted breeding of new varieties of Radicchio. In particular, the plant materials used in this study cover a core collection of Radicchio of Chioggia (i.e., "Red of Chioggia" biotype) experimental materials, which not only manifest valuable traits, but also possess applicable uses in modern breeding programs.
From a technical point of view, labeling each set of primers with different fluorescent dyes allowed us to differentiate and score up to eight SSR marker loci in a single Genescan® run. One important advantage of this method is a substantial cost saving for fluorescent primer labeling, because the synthesis of a specific fluorescently labeled primer for each SSR marker locus is not needed. The multiplex-ready PCR required only four fluorescent dye-labeled primers to complete the research analyses. Furthermore, multiplex-ready PCR combines the advantages of both the M13-tailed primer method [12] and multiplex PCR [13] for fluorescent-based SSR genotyping of single individuals. The use of the M13 primer has several advantages over other techniques. First, it allows for working with a unique tail sequence, avoiding the need for several different dye-labeled SSR primers. In addition, the technique is less time consuming and reduces consumable costs. For multiplex purposes, it is only necessary to change the fluorescent colors used to label the PCR products of each SSR marker locus. We were therefore able to reconstruct the genotype of each individual across all the accessions for as many as 27 target loci (3 selected marker loci for each of the 9 LGs) by performing 14 PCR reactions and 4 Genescan® runs.
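The counts of PCR reactions and Genescan runs follow from simple arithmetic, sketched below under the assumptions stated in the text: two marker loci per PCR, four fluorescent dyes, and therefore up to eight loci per run. The function and parameter names are hypothetical.

```python
import math


def multiplex_plan(n_loci, loci_per_pcr=2, n_dyes=4):
    """Return (PCR reactions, capillary runs) needed to assay n_loci."""
    pcr_reactions = math.ceil(n_loci / loci_per_pcr)
    loci_per_run = loci_per_pcr * n_dyes  # one PCR per dye pooled in each run
    runs = math.ceil(n_loci / loci_per_run)
    return pcr_reactions, runs


# 27 loci: 3 marker loci for each of the 9 linkage groups
pcr_reactions, runs = multiplex_plan(27)
```

For 27 loci this gives 14 PCR reactions and 4 runs, matching the figures reported above.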
In conclusion, we successfully developed and implemented an efficient and reproducible method for the multi-locus genotyping of elite breeding stocks of leaf chicory belonging to the Radicchio of Chioggia biotype using 27 microsatellite marker loci scattered throughout the genome. We demonstrated that this method is useful for assessing the homozygosity and genetic stability of single inbred lines and for measuring the specific combining ability between maternal and paternal inbred lines on the basis of their genetic diversity. This information could be exploited for planning crosses and predicting the heterosis of experimental F1 hybrids on the basis of the allelic divergence and genetic distance of the parental lines. Knowing the parental genotypes would make it possible not only to protect newly registered varieties but also to assess the genetic purity and identity of the seed stocks of commercial F1 hybrids, and to certify the origin of their food derivatives.

Plant materials and DNA isolation
Plant materials of the "Red of Chioggia" biotype, belonging to C. intybus subsp. intybus var. foliosum L. (Table 3), were developed and provided by T&T Produce (Sant' Anna di Chioggia, Venice, Italy). Most of the inbred lines were represented by pollen donors spanning from S3 to S7 obtained by repeated selfing of single individuals chosen within each progeny at both genotype and phenotype levels (see upper part of Table 3). Lines coded as IS3 were partial inbreds derived from intercrossing closely related S3 progeny plants. A few male-sterile seed parents were selected within F2 progenies obtained by selfing of F1 individuals, FS1 progenies produced by full-sibling and S1BC1 progenies generated by selfing of BC1 individuals (see lower part of Table 3).

Table 3 Plant materials
Information on the plant materials, including the inbred line ID, the number of individuals assayed per population, the inbred level reached by each line and the cycle of the variety each line derives from, expressed in days after transplanting. The number of generations of selfing (S) is reported for pollen donors, whereas full-sibling (FS), back-crossing (BC), pair-wise crossing (F) between inbred lines or inter-crossing between selfed individuals (IS) refers to seed parents. In addition, male inbred lines were multiplied in vivo by seeds (i.e., inbreeding) and female inbred lines were propagated in vitro by cuttings (i.e., cloning).

Concerning the sampling strategy, the pollen parents were represented by eight individuals per accession, with few exceptions, whereas the seed parents were propagated by in vitro culture and each was represented by replicated individuals of clonal lines. All plant materials were bred in an experimental station at Chioggia (Venice, Italy) under controlled pollination conditions. The genomic DNA was extracted from 100 mg of fresh leaves with the GenElute™ Plant Genomic DNA Miniprep Kit (Sigma-Aldrich, http://www.sigmaaldrich.com) following the manufacturer's instructions. The quality of the DNA samples was assessed by electrophoresis on 1 % (w/v) agarose gel stained with 1X SYBR® Safe™ DNA Gel Stain (Life Technologies) in Tris-Acetate-EDTA (TAE) running buffer. The yield and purity of the extracted genomic DNA samples were evaluated using a NanoDrop 2000c UV-Vis Spectrophotometer (Thermo Scientific). Following DNA quantification, all the DNA samples were diluted to a final concentration of 25 ng/μl to be used as template for PCR amplifications.
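The dilution of each DNA sample to the 25 ng/μl working concentration is a direct application of C1·V1 = C2·V2. The sketch below assumes a measured stock concentration and a chosen final volume; both example values, as well as the function name, are hypothetical and not taken from the study.

```python
def dilution_volumes(stock_ng_per_ul, final_volume_ul, target_ng_per_ul=25.0):
    """Return (stock volume, diluent volume) in microlitres, from C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical sample quantified at 150 ng/ul, diluted to 60 ul of working solution
stock_ul, diluent_ul = dilution_volumes(150.0, 60.0)
```

In this example, 10 μl of stock DNA is combined with 50 μl of diluent.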

Amplification of SSR loci
A total of 27 SSR marker loci were selected among those mapped in the nine basic LGs constructed for C. intybus [10]. In particular, three SSR loci were carefully chosen for each LG in order to select the best ones in terms of polymorphism information content (PIC) scores and also to be well scattered throughout the genetic map (Fig. 4). The amplification of microsatellites was performed by using a PCR multiplex assay and the detection of DNA fragments across marker loci was achieved using a 5′ M13-tailed primer method [12] with some modifications. Only one dye-labeled M13 primer was used per PCR reaction in combination with any other M13-tailed forward primer [14]. The SSR motifs and primers used in this study are described in Additional file 1.
Amplification reactions were set in order to analyze two marker loci with the same fluorescent dye in each PCR experiment [14]. In general, the marker alleles produced for different target loci, assayed with the same fluorescent dye labels, were characterized by distinct amplicons deriving from PCRs of similar efficiency, as