Development of microsatellite markers for three at risk tiger beetles Cicindela dorsalis dorsalis, C. d. media, and C. puritana

Objective Tiger beetles inhabiting sandy beaches and cliffs along the east coast of the United States are facing increasing habitat loss due to erosion, urbanization, and sea level rise. The northeastern beach tiger beetle Cicindela dorsalis dorsalis and Puritan tiger beetle Cicindela puritana are both listed as threatened under the Endangered Species Act of 1973, while the white beach tiger beetle Cicindela dorsalis media is not listed but has been declining. Extirpation of these beetles, in some cases from entire states, has isolated many populations reducing gene flow and elevating the risk for the loss of genetic variation. To facilitate investigations of population genetic structure, we developed suites of microsatellite loci for conservation genetic studies. Results Shotgun genomic sequencing of all species identified thousands of candidate microsatellite loci, among which 17 loci were optimized and verified to cross-amplify within C. d. media and C. d. dorsalis, and eight separate loci were optimized for C. puritana. Most loci conformed to Hardy–Weinberg equilibrium, showed no evidence of linkage disequilibrium or null alleles, and revealed population genetic characteristics informative for natural resource managers among the populations tested.


Introduction
Tiger beetles of the genus Cicindela are large diurnal predatory insects that tend to prefer sandy habitats near bodies of water such as river edges, and coastal beaches [1]. Many species along the North American Atlantic coast are declining due to the destruction of adult and larval beach habitat through increased development and recreational use, erosion, and sea level rise. The federally threatened northeastern beach tiger beetle Cicindela dorsalis dorsalis, which once was described as occurring in great swarms along beaches from Martha's Vineyard, Massachusetts (MA) to New Jersey (NJ), and a common the extent of gene flow, genetic diversity, and existence of metapopulations.

Methods
Multiple genomic shotgun DNA libraries of single individuals and pooled conspecifics were prepared from C. d. media, C. d. dorsalis, and C. puritana collected from throughout their native range. All samples were collected by the USFWS and provided to the U.S. Geological Survey (USGS) Leetown Science Center as whole beetles preserved in 95% ethanol. DNA was extracted from the head of each individual beetle using the DNEasy Blood and Tissue Kit (Qiagen, Germantown, MD). DNA was quantified using a Nanodrop spectrophotometer (Ther-moFisher Scientific, Frederick, MD), and used for construction of libraries for Ion Torrent PGM sequencing. Sequence reads were generated from C. puritana (n = 1), C. d. media (n = 1), and C. d. dorsalis (n = 7) among 11 Ion Torrent sequencing chips. An additional library was sequenced on a 454 Junior for n = 1 C. d. dorsalis. All sequencing was performed at the USGS Leetown Science Center, Kearneysville, WV.
All sequence reads were imported into Qiagen CLC Genomics Workbench (ver 6.5.1). Quality and length trimming were performed with the following settings: ambiguous limit = 2, ambiguous trim = yes, quality limit = 0.015, minimum number of nucleotides in reads = 20, discard short reads = yes, remove 5′ or 3′ nucleotides = no. All quality trimmed C. d. media and C. d. dorsalis reads were concatenated into one file, and all quality trimmed C. puritana reads were concatenated into a separate file. We pooled the C. d. media and C. d. dorsalis samples since they are closely related subspecies, and microsatellite loci from one sub-species would have a high chance of success for cross-amplification in the other. Each fasta file was screened for di-, tri-, tetra-, penta-, and hexanucleotide microsatellite repeat motifs in the program QDD [4]. Settings for QDD included searching for a minimum of five repeats per motif, and a minimum sequence length of 80. The output of QDD included thousands of candidate microsatellite loci and primers designed using the integrated PRIMER 3 code [5]. From the two lists of candidate microsatellite loci, we chose to test primers for 30 loci in C. d. media/C. d. dorsalis, and 31 loci in C. puritana. Dinucleotide loci were avoided. Each sequence with a candidate microsatellite was blasted against the NCBI nt database, and none with any match to nt had strong similarity to organisms other than insects. Microsatellite loci were initially screened individually using M13 tailed primers [6]. Polymerase chain reactions were performed in 25 μl volumes, consisting of 10 ng of DNA, 1X PCR Buffer (Promega, "Locus" refers to the name assigned to the microsatellite containing sequence. "Size-range" is the bp size of the alleles genotyped. "Motif" is the repeat motif and number of repeats (in parentheses) identified from the sequence read in QDD. "Multi-plex" refers to the assignment of each locus to one of 4 multiplex PCR reactions along with the fluorophore used. See "Main text" for PCR conditions. "Locus origin" denotes whether the locus was derived from within Cicindela dorsalis media (Cdm) or C. d. dorsalis (Cdd) genomic sequence data, though all loci amplify in both subspecies. "Microchecker null" and "Microchecker scoring error" denote the results of tests in Microchecker for null alleles or scoring errors at each locus. Exact tests in GENEPOP [10] were used to determine if the distribution of genotypes at each locus conformed to Hardy-Weinberg equilibrium (HWE). Multi-locus tests of conformance to HWE were completed using Fisher's method in Genepop. Linkage disequilibrium (LD) was tested for all pairs of loci using contingency tables in GENEPOP. All tests of HWE and LD tests in GENEPOP used the default Markov chain parameters. Significance levels for HWE and LD tests were adjusted using the sequential Bonferroni correction. To assess genetic diversity, observed and unbiased expected heterozygosity and the effective number of alleles were calculated in Genalex ver 6.5 [11,12]. Finally, to evaluate the extent of genetic differentiation among populations, we calculated pairwise F ′ ST in Genalex.  Table 1. There were no missing data. Microchecker identified locus Cdo15 as having potential scoring errors in addition to possible null alleles, while a few other loci were flagged as possibly having null alleles. There was no evidence of linkage disequilibrium among locus pairs within or among collections. Several loci were monomorphic in one of the C. dorsalis dorsalis collections, precluding tests of HWE in Genepop for these loci. All populations were out of HWE based on Fisher's method examining P-values across all loci. The most polymorphic locus was locus Cdo13 with seven alleles in C. d. media, and the number of alleles averaged across loci was higher in C. d. media at four versus approximately two in the C. d. dorsalis collections. The expected heterozygosity averaged across loci was low and similar across the three collections ranging from 0.20-0.29, and effective number of alleles was small reflecting the low levels of heterozygosity. Pair-wise estimates of genetic differentiation ( F ′ ST ) were high and statistically significant among all collections ranging from 0.334 to 0.767 (Table 2). This suggests a high level of genetic differentiation, and suitability of these loci for characterizing population structure.

Results and discussion
Complete genotypes were also obtained for the eight loci screened in two population samples of C. puritana (Table 3). Some loci were identified as having null alleles by Microchecker, but no loci were flagged as having scoring errors. All loci were polymorphic in at least one population. The Little Cove Point collection was out of HWE, while Connecticut River was in HWE based on Fisher's method examining all loci. Like for the C. d. dorsalis and C. d. media loci, some of the C. puritana loci were not sufficiently polymorphic for HWE testing in Genepop. There was no evidence of linkage disequilibrium among locus pairs or among collections. The most polymorphic locus was CpuQ2 with six alleles in the LCP collection. While the average number of alleles was similar across populations, the number of alleles at each locus was variable between populations with no consistent pattern. Both observed and expected heterozygosity, as well as the effective number of alleles were similar and low in the two populations. Pair-wise F ′ ST was large at 0.789 (P < 0.001) between the two C. puritana populations.
Overall, the results of the initial application of these loci to a small set of samples herein suggest that they will have utility for assessing population structure and patterns of gene flow in other populations of Cicindela tiger beetles. In addition, the shotgun genomic sequencing approach we employed identified thousands of candidate loci, allowing for the development of additional markers if needed.

Limitations
The number of populations and individuals examined so far is modest. Therefore, application of these microsatellite markers to additional populations of C. d. media, C. d. dorsalis, and C. puritana will reveal whether the levels of variation seen, such as a relatively small number of alleles per locus and low levels of heterozygosity, are typical among populations within these taxa. For a locus like Cdo15 in C. d. media and C. d. dorsalis identified by MICROCHECKER as having