Fast and accurate identification of cryptic and sympatric mayfly species of the Baetis rhodani group

Objective Species of the Baetis rhodani group are among the most widespread mayflies of the Palearctic region. However, frequent occurrence of morphologically cryptic species complicates the identification of sympatric species. Here, we proposed and tested a method for the fast, accurate, and cost-effective assignment of a large number of individuals to their putative species, based on high resolution melting profiles of a standard mitochondrial gene fragment. We tested this method using a system of three recently identified cryptic species inhabiting the Tyrrhenian Islands (western Mediterranean basin). Results Highly species-specific high resolution melting profiles were obtained, allowing the unequivocal attribution of each individual to the respective species. This assay provides a convenient and easily customizable alternative to traditional barcoding approaches, provided that the mayfly taxa occurring within the geographic area of interest have been previously identified and their high resolution melting profiles assessed. Electronic supplementary material The online version of this article (10.1186/s13104-017-3115-6) contains supplementary material, which is available to authorized users.


Introduction
Baetis rhodani (Pictet 1843) is a complex of morphologically cryptic mayfly species widespread in the Palearctic region [1]. Despite a considerable research effort in recent years [2][3][4][5], current knowledge of the phylogenetic relationships within this group and the taxonomic status of several of its representatives remain incomplete, with new species emerging as new areas of the Palearctic are investigated.
Methods of species recognition and individual assignment in mayflies have been mostly based on two distinct approaches, each one with its own advantages and drawbacks. The first and more traditional approach is based on the analysis of morphological diagnostic traits, primarily of the larval stages [6,7]. Although relatively inexpensive and time-saving, this approach relies entirely on an expert knowledge of the taxonomic group under study. Most importantly, with the advent of the second approach, molecular taxonomy, morphological approaches have soon appeared largely to underestimate the amount of species diversity within baetidae mayflies, including B. rhodani [2]. Indeed, under the single morphological species B. rhodani, several deeply divergent species have recently been recognized using molecular markers [2][3][4][5], and even more are likely to be described in the near future. On the other hand, molecular methods, such as those based on DNA barcodes, entail unprecedented processing costs per sample, and their extensive use for the preliminary phase of species delimitation has been largely debated (see [8,9]).
Recently, three endemic species belonging to the B. rhodani species group have been identified within the Tyrrhenian Islands, based on a multi-locus analysis of genetic variation among 112 individuals from 28 populations [5]. However, to obtain a careful characterization of the respective distribution ranges, patterns of co-occurrence on single islands, relative abundance in sites of cooccurrence, as well as for all subsequent ecological and Open Access BMC Research Notes *Correspondence: bisconti@unitus.it Dipartimento di Scienze Ecologiche e Biologiche, Università degli Studi della Tuscia, Viale dell'Università s.n.c., 01100 Viterbo, Italy evolutionary investigations on these species, the characterization of a much larger sample of individuals and populations will be mandatory. Since nuclear and mitochondrial patterns of variation were largely congruent to one another in this group of species [5], the use of a DNA barcoding approach would be fully suitable.
We developed and validated a method for the assignment of a large number of individual mayflies to their putative species, based on the analysis of the high-resolution melting (HRM) curve of a standard molecular marker (NADH dehydrogenase subunit 1 gene). HRM is an effective, still not fully exploited, acquisition in the molecular taxonomy toolbox (see e.g. [10,11]), which allows to assess the occurrence of sequence differences of diagnostic value between taxa, based on melting temperature profiles and the associated fluorescence peaks. Although based on a DNA barcoding approach, this method is faster, more cost-effective, and equally accurate when compared with traditional sequence-based approaches, provided that the pattern of variation within the focal taxa has already been characterized, as is the case for the B. rhodani group on the Tyrrhenian Islands [5].

Methods
In total, 399 larvae-100 individuals from a previous study [5] and 299 new samples-of the Baetis rhodani species group were collected from 59 localities ( Table 1, Fig. 1) spanning the three main Tyrrhenian Islands (Sardinia, Corsica and Elba islands; no ethical approval is needed for sampling or processing these invertebrate species). The larvae were preserved in 95% ETOH prior to DNA extraction. A formal species description of the three species occurring in the study area is still pending. For consistency with the previous work which identified them as distinct biological species, we will refer to them here as Species A, Species B, and Species C. An unequivocal species assignment was made possible by the deep mitochondrial sequence divergence observed between the three species. Indeed, the percent sequence divergence between species (p-distance) was as follows [5]: A vs B = 14%, A vs C = 29%, and B vs C = 26%.
Real time PCRs were run using a Roche LightCycler ® LC480. The PCR cycling conditions were as follows: an initial step of 5 min at 95 °C, 35 cycles of 1 min at 94 °C, 45 s at 57 °C, and 90 s at 72 °C, and an extension step of 10 min at 72 °C. A final melting cycle was performed for 3 min at 95 °C and a melt from 40 to 95 °C collecting fluorescence continuously at a ramping rate of 0.1 °C per second. The HRM peaks obtained from single individuals were examined after each real-time PCR using the LightCycler ® 480 Software (release version 1.5.0).
In order to characterize the HRM curve profiles of each species, we included in the analyses all the individuals belonging to the three species analysed in the previous study (n = 100; [5]), where the diagnostic value of sequence variation among the three species was assessed. To further test the accuracy of the proposed method, we randomly selected and sequenced 24 individuals from the pool of newly screened samples (Genebank Accession Numbers: MG581893-MG581916). We then compared their species assignment based on HRM profiles to those achieved through standard sequencing.
In order to evaluate the diagnostic ability of the HRM curve method for the three species studied, we performed a principal component analysis (PCA) using the IBM SPSS v. 23 (IBM SPSS Statistics for Windows, Armonk, NY: IBM Corp), based on peak temperature and fluorescence values of the 100 individuals from the previous paper [5]. The first principal component was then used to perform a receiver operating characteristic (ROC) curve analysis, whose accuracy was evaluated by means of the area under the curve (AUC) values.

Results and discussion
The derived HRM curves obtained for the individuals already sequenced from a previous study clearly showed three distinct profiles (see Fig. 2a). A comparison of these profiles with the standard sequence-based species identification unequivocally indicated the diagnostic value of the HRM curves. This facilitated assigning each unique profile to one of the three species, without misidentifications.
The three HRM curves observed within the negative first-derivative plot were easily distinguishable from one another, based on curve shape and melting peak values (Fig. 2a). The curve identifying Species A showed two peaks, the first at 83 °C with a maximum value of 17.3 and a minimum value of 12.8 (−dF/dT), and the second peak at 86 °C with a maximum value of 20.5 and a minimum value of 8.6 (−dF/dT). The HRM curve for Species B was diagnosable by a single peak at 83-84 °C with a maximum value of 28.4 and a minimum value of 19.7 (−dF/ dT). Finally, the HRM profile for Species C showed both the highest and narrowest curve profile, with a melting peak at 82 °C and a maximum value of 41.6 and a minimum value of 29.6 (−dF/dT). As shown by the scatterplot in Fig. 2b (see also Additional file 1), inter-individual variation within each species did not affect the diagnostic significance of the individual peak values.
The analysis of the 399 samples collected throughout the Tyrrhenian Islands (Sardinia, Corsica and Elba islands) did not reveal any new patterns for the HRM curve profiles in addition to the three described above. We assigned all analysed samples to a putative species: 128 to Species A, 170 to Species B, and 101 to Species C. As expected, based on the absence of new HRM curve variants among the newly studied samples, sequencebased assignment of all 24 individuals used as the control perfectly matched species identity as defined by melting profiles. Moreover, the ROC curve analysis based on the first principal component (variance explained: 82%) strongly supported the diagnostic ability of HRM method used, showing extremely high values of the AUC, in all pairwise comparisons between species (range of the AUC values: 0.97-1.00). Our results clearly demonstrated that the HRM-based method proposed here to assess species identity for mayflies of the B. rhodani group on the Tyrrhenian Islands, proved as accurate and reliable as the standard sequencebased approach, besides allowing to avoid sequencing costs and efforts.
We utilized this method for a fast and accurate evaluation of the actual geographic distribution of the three cryptic species, which is fundamental information for any future study employing these species as target organisms. The results of this evaluation revealed distinct distribution patterns for the three studied species (Fig. 1), allowing us to incorporate them within a long-term study on the evolution of the insular biota (see also [13][14][15][16][17][18][19], to investigate species-specific eco-evolutionary dynamics, and to improve ongoing bio-monitoring programs in the area [20].  Table 1. Pie diagrams show the proportion of individuals belonging to the three species within each sampled locality. (The digital elevation model was downloaded by the WorldClim 1.4 database, freely available at www.worldclim.org)

Limitations
• The method does not provide information about genetic diversity, population structure, or phylogenetic relationships of the study organisms. • The method is only applicable when background information on the interspecific genetic variation is available for diagnostic markers. Temperature °C a b Fig. 2 High resolution melting curve analysis. a A negative first-derivative plot of fluorescence over temperature for three representative individuals of Species A, B, and C (as defined by [5]); redrawn from LightCycler 480 Software. b A 2D scatterplot of melting temperatures at fluorescence peaks for the 100 individuals used to assess the diagnostic value of melting curves for each species