Materials collection
Beekeeping (A. mellifera) is carried out on the roof of the second building at Musashino University Ariake Campus in the Ariake district of Kōtō-ku, on the Tokyo Bay coast (Fig. 1). A microspatula-tip full of bee pollen was obtained from an uncovered and relatively new honeycomb. Pollen was bright orange or yellow in colour, with low viscosity and low permeability. The honeybee in temperate regions including Japan has a foraging season, which is spring to autumn [9]. Samples were collected three times per month (representing early, middle, and late periods) on rain-free days during daylight hours from mid-March, 2018 to mid-October, 2018, with the exception of the mid-July sample period, because of honeybee behavioural suppression with continuous extreme heat days (Additional file 1, Fig. S1).
DNA metabarcoding analysis
After adding Lysis Solution F (Nippon Gene, Tokyo, Japan), 0.5-mm zirconia beads, and 5.0-mm stainless steel beads to the bee pollen sample, the liquid was shaken at 1500 rpm for 2 min using a Shake Master Neo (Biomedical Science, Tokyo, Japan) and was then incubated at 65 °C for 10 min. After centrifugation at 12,000×g for 10 min, supernatant was collected. Genomic DNA was extracted using an MPure Bacterial DNA Extraction Kit (MP Bio Japan, Tokyo, Japan). The extracted DNA solution was mixed with a final concentration of 10% polyvinylpolypyrrolidone and purified by collecting the supernatant after centrifugation. The DNA concentration was measured with a Synergy H1 (BioTek, Winooski, VT, USA) and a QuantiFluor dsDNA System (Promega, Madison, WI, USA).
An amplicon library targeting a part of the rbcL region of the chloroplast genome was constructed by two-step PCR. The 1st PCR primer set was customised within the barcode region as a mini-barcoding system for amplicon sequencing by short-read NGS [17]. This PCR was carried out in a total volume of 10 μl, containing 0.5 ng template DNA, 0.5 μM forward primer (5ʹ-ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTTACCAGYCTTGATCGTTACAAAGG-3ʹ [underline indicates the Illumina adapter sequence]), 0.5 μM reverse primer (5ʹ-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGTAAAATCAAGTCCACCRCG-3ʹ [underline indicates the Illumina adapter sequence]), 0.2 mM dNTP mixture, 1 × company-supplied buffer, and 0.05 U ExTaq HS DNA polymerase (TaKaRa Bio, Shiga, Japan). The reaction cycles were as follows: initial denaturation at 94 °C for 2 min; 30–35 reaction cycles at 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 30 s, and final extension at 72 °C for 5 min. After clean-up of the 1st PCR product, the 2nd PCR was performed under the same conditions, with the following modifications: a forward primer (5ʹ-AATGATACGGCGACCACCGAGATCTACAC[index 2]ACACTCTTTCCCTACACGACGC-3ʹ) and reverse primer (5ʹ-CAAGCAGAAGACGGCATACGAGAT[index 1]GTGACTGGAGTTCAGACGTGTG-3ʹ) were added based on the Illumina Adapter Sequences Document (https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf), and adjustments were made to some PCR programs (12 cycles and annealing temperature at 60 °C). After clean-up of the 2nd PCR product, the quality of the constructed library was checked using a Fragment Analyzer (Advanced Analytical Technologies, Ankeny, IA, USA) and a dsDNA 915 Reagent Kit (Advanced Analytical Technologies). Multiple libraries were pooled and sequenced by running 2 × 300-bp paired-end reads using a MiSeq platform (Illumina, San Diego, CA, USA).
A series of bioinformatic analyses is shown in Additional file 2, Fig. S2. Briefly, raw read data were cleaned by removing primer sequences using the Fastx-Toolkit version 0.0.14 (https://hannonlab.cshl.edu/fastx_toolkit/). In addition, reads with a quality score of less than 20 or length less than 40 nt were excluded using Sickle version 1.33 (https://github.com/ucdavis-bioinformatics/sickle). The clean paired reads were merged using FLASh version 1.2.11 [18]. Parameters were merged as follows: (i) minimum overlap length = 10, (ii) average read length = 230, and (iii) average fragment length = 320. Sequence dereplication, sorting by decreasing abundance, operational taxonomic unit (OTU) clustering, chimera filtering, and mapping reads back to OTUs were performed using USEARCH version 10.0.240 (https://www.drive5.com/usearch/). After reads were mapped to representative OTUs, they were normalised by counts per million (CPMs). The most abundant sequence from each OTU was selected as the representative sequence and was annotated for target species with 97% similarity against the NCBI non-redundant nucleotide ‘nt’ database using the blastn program (BLAST+ version 2.7.1) [19]. For each OTU that was annotated to different species with the same similarity score, the genus or family common to those was assigned.