Complete mitochondrial DNA sequence of the European flat oyster Ostrea edulis confirms Ostreidae classification

Background: Because of its typical architecture, inheritance and small size, mitochondrial (mt) DNA is widely used for phylogenetic studies. Gene order is generally conserved in most taxa although some groups show considerable variation. This is particularly true in the phylum Mollusca, especially in the Bivalvia. During the last few years, there have been significant increases in the number of complete mitochondrial sequences available. For bivalves, 35 complete mitochondrial genomes are now available in GenBank, a number that has more than doubled in the last three years, representing 6 families and 23 genera. In the current study, we determined the complete mtDNA sequence of O. edulis, the European flat oyster. We present an analysis of features of its gene content and genome organization in comparison with other Ostrea, Saccostrea and Crassostrea species.

Results: The Ostrea edulis mt genome is 16 320 bp in length and codes for 37 genes (12 protein-coding genes, 2 rRNAs and 23 tRNAs) on the same strand. As in other Ostreidae, the O. edulis mt genome contains a split rrnL gene and a duplication of trnM. The tRNA gene sets of O. edulis, Ostrea denselamellosa and Crassostrea virginica are identical in having 23 tRNA genes, in contrast to Asian oysters, which have 25 tRNA genes (except for C. ariakensis with 24). O. edulis and O. denselamellosa share the same gene order, but differ from other Ostreidae and are closer to Crassostrea than to Saccostrea. Phylogenetic analyses reinforce the taxonomic classification of the 3 families Ostreidae, Mytilidae and Pectinidae. Within the Ostreidae family the results also reveal a closer relationship between Ostrea and Saccostrea than between Ostrea and Crassostrea.

Conclusions: Ostrea edulis mitogenomic analyses show a high level of conservation within the genus Ostrea, whereas they show a high level of variation within the Ostreidae family. These features provide useful information for further evolutionary analysis of oyster mitogenomes.


Background
Because of its typical architecture, inheritance and small size, animal mitochondrial (mt) DNA is widely used for phylogenetic studies. Combined with these characteristics, its typically maternal inheritance contributes to a fast rate of evolution. Nucleotide changes combined with gene order and rearrangement data can provide valuable information on major evolutionary changes at different taxonomic levels. Typically, animal mtDNA is a compact molecule (14 to 17 kb), though some mtDNA can be vastly larger (e.g., Placopecten magellanicus [1]), and usually encodes 13 proteins, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs) [2]. There are often few intergenic nucleotides except for a single large non-coding region generally thought to contain elements that control the initiation of replication and transcription [3]. Size variation in mtDNA is usually due to differences in the length of the non-coding regions. Gene order is generally conserved in most taxa, although some groups show considerable variation. This is particularly so in the phylum Mollusca, especially in Bivalvia and Scaphopoda [4]. In addition to the fact that phylogenetic relationships among major molluscan groups are not well understood, the species classification of some of the most common molluscs remains difficult.
A case in point is oysters, for which a plastic growth pattern is a major feature, resulting in a wide range of overlapping ecophenotypic variants [5,6]. Oysters are bivalve molluscs that are widely distributed in the world's oceans. As benthic, sessile filter-feeders, they play an important role in estuarine ecosystems. Moreover, some species are of economic importance, like the Pacific cupped oyster, which is grown in 27 countries and is the most highly produced mollusc species in the world. Oysters have been introduced all over the world for culture and many species are sympatric. Numerous species (30 to 40, according to the classifications) of oysters of the genus Ostrea have been described. Their geographical range is particularly wide in warm and temperate waters of all oceans, although they have a predominantly tropical distribution [6,7]. In Europe, along the Atlantic and Mediterranean coasts, the European flat oyster, Ostrea edulis, is an important economic marine resource: in 2009 almost 3000 tons were produced in the world, mainly (91%) in Europe (Spain, France, Ireland ...) [8].
During the last few years, there have been significant increases in the number of complete mitochondrial sequences available for all species. The number has more than doubled for molluscs in the last three years [9], so that 98 complete mollusk mitochondrial genomes are now available in GenBank, mainly from gastropods (43), bivalves (35) and cephalopods (14). Among bivalves, the sequenced genomes represent 6 families and 23 genera. In the Ostreidae, the genus Crassostrea has been thoroughly studied, with 7 representatives (6 Asian oysters and 1 American oyster) [10]. In contrast, there is only one representative of the genus Saccostrea (Saccostrea mordax), and one of the genus Ostrea (Ostrea denselamellosa). Recent studies have provided a more comprehensive picture of the cupped oyster genome, showing an unusually high conservation of mitochondrial gene order in Asian Crassostrea species [11]. Even though molecular tools, such as mitochondrial or microsatellite markers, already exist for the European flat oyster and allow population genetics [12] or quantitative genetics [13] studies, the complete characterization of its mtDNA will allow a better study to be made of phylogenetic relationships among members of the genus, especially between the closely-related species O. edulis and O. angasi [14], to improve classification of the Ostreidae family within the Bivalvia.

Genome composition
The complete mitochondrial genome of Ostrea edulis [GenBank: JF274008] is 16 320 nt in length and encodes 37 genes, including 12 protein-coding genes (PCGs), 2 rRNAs and 23 tRNAs on the same strand (Figure 1 and Table 1). This size is very close to that of O. denselamellosa (16 277 bp), shorter than that of other Ostreidae (16 532 bp for S. mordax to 22 446 bp for C. iredalei), and is within the size range of the Pteriomorphia mt genomes published to date: from 16 211 nt for Argopecten irradians [15] to 32 115 nt for Placopecten magellanicus [1].
In the mt genome of O. edulis, a total of 965 bp of non-coding nucleotides is spread over 21 intergenic regions (each over 5 bp), including a major non-coding region (MNR) of 695 bp. This large non-coding region is a putative control region, based on its AT content of 74.4% [16]. In contrast to typical animal mitochondrial genomes, the O. edulis genome may lack the protein-coding gene atp8, although some recent studies have found a good candidate for the atp8 gene in Mytilidae and possibly in some Ostreidae [17]. Furthermore, the O. edulis genome also has duplications of three tRNAs: trnM, trnS and trnL. The rrnL gene is split into 2 fragments, a phenomenon previously observed in the Ostreidae [11]. The rrnS is not duplicated (as in O. denselamellosa, S. mordax and C. virginica), in contrast to Asian Crassostrea.
The molecule has an overall A+T composition value of 64.9% and the coding region is 15 379 nt in length, accounting for 94.2% of the whole genome. The AT content is slightly higher than that of Pectinidae (55.3 to 59.6% [18]) or Mytilidae (61.5 to 61.8%). The AT composition of O. edulis is, therefore, within the AT content range of the Ostreidae: the lowest known AT content is 60.7% in O. denselamellosa, while the highest is 65.3% in C. hongkongensis [19]. In S. mordax, the AT content is 64.4%, which is very similar to that of O. edulis.
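The two summary statistics above (A+T composition and the share of the genome that is coding) can be reproduced with a few lines of code. The following is a minimal sketch, not the procedure used in this study: the function names and the toy sequence are hypothetical, and only the two lengths (16 320 nt and 15 379 nt) come from the text.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("A") + seq.count("T")) / counted


def coding_fraction(coding_nt: int, genome_nt: int) -> float:
    """Return the coding region as a percentage of the genome length."""
    return 100.0 * coding_nt / genome_nt


# Lengths reported in the text for O. edulis.
GENOME_NT = 16320
CODING_NT = 15379

# Hypothetical toy sequence, NOT taken from GenBank JF274008.
toy_sequence = "ATGATTTAACGCAT"

print(f"toy A+T content: {at_content(toy_sequence):.1f}%")
print(f"coding fraction: {coding_fraction(CODING_NT, GENOME_NT):.1f}%")
```

Applied to the full GenBank record, `at_content` would be expected to return a value close to the 64.9% reported above; that check is left to the reader because the sequence itself is not reproduced here.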
As observed in O. denselamellosa (16 277 bp), S. mordax (16 532 bp) and C. virginica (17 244 bp), the lack of a duplicated rrnS in O. edulis, together with the lack of 2 tRNAs (trnK and trnQ are not duplicated) compared to Asian Crassostrea, may account for the difference in length compared with other Crassostrea (C. gigas 18 225 bp, C. hongkongensis 18 622 bp).
The genome composition of O. edulis is, thus, identical to O. denselamellosa (except for AT composition) and close to S. mordax in terms of complete genome size, AT content and the non-duplicated rrnS. More mitochondrial genome sequences from Ostrea and Saccostrea will be needed to assess relationships between the Ostrea, Crassostrea and Saccostrea genera.
In terms of gene arrangement, it is thus clear that O. edulis is more similar to Crassostrea than to S. mordax when comparing PCGs. As shown in Figure 2, the complete genome arrangement of O. edulis is similar to that of Asian Crassostrea while it appears completely reorganized from trnY to the end of mt genome when compared with that of S. mordax.

Protein-coding genes
All PCGs are encoded on and transcribed from the same strand. Twelve open reading frames (ORFs) were detected for the thirteen typical PCGs (cox1-cox3, cytb, nad1-nad6, nad4L, atp6 and atp8). Although we carefully looked for candidate regions for the atp8 gene, we could not identify any, as in all Pteriomorphia complete genomes already published. However, a recent publication [17] suggests that a putative ORF represents a good candidate for an atp8 gene in most bivalve mt genomes. Within the invertebrate mt code there are three standard initiation codons (M-AUG, M-AUA, and I-AUU), but mt genomes often use a variety of non-conventional start codons [22]. In this study, most PCGs use conventional initiation codons: ATA is used for cox3, nad4, nad6 and atp6; ATG is used for cox1, cox2, nad1, nad3, nad4L and nad5; ATT is used for nad2, but cytb uses the alternative start codon CTA (as in C. gigas and C. angulata [10]). Eight protein-coding genes were terminated by a stop codon (TAA or TAG).
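The start-codon usage described above can be tabulated programmatically. The sketch below simply encodes the assignments listed in the text and separates conventional from alternative initiation codons; the variable and function names are hypothetical.

```python
from collections import Counter

# Standard initiation codons of the invertebrate mt code, as listed in the
# text (AUG, AUA, AUU), written here in the DNA alphabet.
STANDARD_STARTS = {"ATG", "ATA", "ATT"}

# Start codons reported in the text for the 12 PCGs of O. edulis.
START_CODONS = {
    "cox1": "ATG", "cox2": "ATG", "cox3": "ATA", "cytb": "CTA",
    "nad1": "ATG", "nad2": "ATT", "nad3": "ATG", "nad4": "ATA",
    "nad4L": "ATG", "nad5": "ATG", "nad6": "ATA", "atp6": "ATA",
}


def summarize_starts(starts):
    """Return (codon usage counts, sorted genes using a non-standard start)."""
    usage = Counter(starts.values())
    alternative = sorted(
        gene for gene, codon in starts.items() if codon not in STANDARD_STARTS
    )
    return usage, alternative


usage, alternative = summarize_starts(START_CODONS)
for codon, count in usage.most_common():
    print(codon, count)
print("non-standard start:", alternative)
```

With the assignments above, the tally gives ATG six times, ATA four times, and ATT and CTA once each, with cytb as the only gene using a non-standard start.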

Transfer and ribosomal RNA genes
In total, 23 tRNA coding genes were identified in the size range of 63 to 71 nucleotides, based on typical secondary structure (Additional file 1). An additional trnM was detected, as found in C. gigas, C. hongkongensis [9], C. virginica [16] and Mytilus [23]. Two serine and two leucine tRNA genes were also differentiated in O. edulis. The rrnL gene is split into two segments: one segment, the 5' end (which matches the rrnL 5' end from O. denselamellosa and Saccostrea), is 575 bp long and positioned between trnP and nad2; the other segment, the 3' end, is 708 bp long and located between rrnS and nad5. The length of the rrnS is similar to that of most bivalves, but smaller than that of O. denselamellosa (1017 bp) and that of Crassostrea (946 to 1207 bp) [10]. The size of rrnL (1283 bp in all) is similar to that of O. denselamellosa (1299 bp), but smaller than that of other bivalves. This bias may be due to the method (BLAST) used to compare the rRNA sequences, because this method only checks the identity between a few sequences and because it is easier to compare sequences from the same species, as they show higher identity.
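As a quick arithmetic check of the rrnL figures given above, the two segment lengths can be summed and compared with the value cited for O. denselamellosa. This is only a worked calculation on the numbers in the text; the dictionary keys and variable names are hypothetical labels.

```python
# Segment lengths (bp) of the split rrnL gene reported in the text.
RRNL_SEGMENTS = {
    "5' segment (between trnP and nad2)": 575,
    "3' segment (between rrnS and nad5)": 708,
}

# rrnL length (bp) cited in the text for O. denselamellosa.
RRNL_O_DENSELAMELLOSA = 1299


def total_length(segments):
    """Sum the lengths of all gene segments."""
    return sum(segments.values())


rrnl_total = total_length(RRNL_SEGMENTS)
print(rrnl_total)                          # 1283
print(RRNL_O_DENSELAMELLOSA - rrnl_total)  # 16
```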

Non-coding regions
As in most bivalves, O. edulis mtDNA contains a large number of unassigned nucleotides. There are as many as 21 non-coding regions (> 5 bp), totalling 965 nucleotides, found throughout the O. edulis mitochondrial genome. Eight of these non-coding regions are more than 50 bp in length. Among these regions, the major non-coding region (MNR) has been identified and located; it remains the most promising region in which to find regulatory and/or gender-specific sequences [25]. The O. edulis mtDNA MNR is positioned between trnD and cox1 and is 695 bp in length, similar to that of O. denselamellosa (689 bp), making it the longest MNR within the Ostreidae apart from C. virginica (832 bp) and C. ariakensis (716 bp). It has an A+T content of 74.4%, which is higher than that of the remainder of the mt genome (64.4%), as it includes several (A)n and (T)n homopolymer tracts, features which are typically used for identification of the mitochondrial control region and thought to contain the replication origin [2].
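Because an elevated A+T content is one of the features used to recognise a candidate control region, a sliding-window scan is a common way to visualise where such regions lie. The sketch below illustrates the idea only; it is not the method used in this study (where the MNR was located from the annotation), the function name is hypothetical, the sequence is a toy example, and the genome is treated as linear, so windows spanning the origin of a circular molecule are not considered.

```python
def richest_window(seq: str, window: int):
    """Return (start, at_percent) of the first window with the highest A+T content.

    Positions are 0-based; the sequence is treated as linear.
    """
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    best_start, best_at = 0, -1.0
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        at_percent = 100.0 * (chunk.count("A") + chunk.count("T")) / window
        if at_percent > best_at:
            best_start, best_at = start, at_percent
    return best_start, best_at


# Hypothetical toy sequence: a GC-rich flank on each side of an AT-rich core.
toy = "GCGCGC" + "ATATATAT" + "GCGCGC"
print(richest_window(toy, 8))  # (6, 100.0)
```

On a real genome, a window of a few hundred bases would be more appropriate, and the result should be read alongside the gene annotation rather than on its own.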

Phylogenetic analysis
In recent years there have been many phylogenetic studies on the taxonomy and evolution of the Ostreidae based on molecular data, especially mitochondrial DNA [26-30]. However, most of these previous studies have been based on partial sequences and incomplete molecular information. Recently, Ren et al. [11] have compared 7 complete mt genomes from Asian oysters.
In the present study's aa-based tree built with twelve concatenated PCGs from 19 mitochondrial genomes in Pteriomorphia (Figure 3), we can observe that, at the Ostreidae level, O. edulis is first clustered with O. denselamellosa as congeneric species. This group of species then falls into a highly supported clade with S. mordax. Ostrea and Saccostrea are then clustered with the Crassostrea species group. In this latter clade, the single American oyster C. virginica falls at the base of a nested clade that contains the Asian oysters. Very similar results were obtained with a nucleotide phylogenetic tree, with small differences in bootstrap values. In Figure 4, more Ostreidae species are included, as more cox1 sequences are available in GenBank. The same phylogenetic relationship between Ostrea, Saccostrea and Crassostrea is observed, especially the first grouping of Ostrea with Saccostrea rather than with Crassostrea, although with far less robust nodes. The same result was observed when considering the evolution of the tRNA anticodons in marine bivalve mitochondrial genomes, where the relationships presented are also based on concatenated nucleotide sequences of 12 protein-coding genes analysed by Bayesian inference [24]. However, a recent study [31], based on cox1 and 16S sequences, showed a closer relationship between Ostrea and Crassostrea than between Ostrea and Saccostrea. In that study, only one Ostrea sequence was included in the cox1 analysis, and although many more Ostrea sequences were included in the 16S analysis, the bootstrap value was between 50 and 80%. These comparisons seem to indicate that phylogenetic analyses are more powerful when they include several sequences, such as the 12 concatenated PCGs. Finally, the phylogenetic tree presented in Figure 3, which includes mt genomes from all published Pteriomorphia, reinforces the taxonomic classification of the 3 families Ostreidae, Mytilidae and Pectinidae [32,11].
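The concatenation step underlying such multi-gene trees can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: it assumes each gene has already been aligned separately, that every taxon is present in every gene alignment, and the gene names, taxon labels and sequences shown are hypothetical toy data.

```python
def concatenate_genes(alignments, gene_order):
    """Concatenate per-gene alignments into one sequence per taxon.

    alignments: {gene: {taxon: aligned_sequence}}
    gene_order: list of gene names giving the concatenation order.
    """
    taxa = set(alignments[gene_order[0]])
    for gene in gene_order:
        per_taxon = alignments[gene]
        if set(per_taxon) != taxa:
            raise ValueError(f"taxon set differs for gene {gene}")
        if len({len(seq) for seq in per_taxon.values()}) != 1:
            raise ValueError(f"sequences of gene {gene} are not aligned")
    return {
        taxon: "".join(alignments[gene][taxon] for gene in gene_order)
        for taxon in sorted(taxa)
    }


# Hypothetical toy alignments for two genes and two taxa.
toy_alignments = {
    "cox1": {"taxon_A": "ATG-CA", "taxon_B": "ATGGCA"},
    "cytb": {"taxon_A": "TTA", "taxon_B": "TTG"},
}
supermatrix = concatenate_genes(toy_alignments, ["cox1", "cytb"])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Checking that every gene alignment covers the same taxa and has a uniform length before joining is what keeps the columns of the concatenated matrix homologous across genomes.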

Conclusion
In conclusion, the complete mitochondrial genome of O. edulis is 16 320 bp in length. A common phenomenon is that mitogenomes of most bivalves contain two trnM genes and most metazoan mitochondria have a set of 22 tRNA, including two trnL and two trnS. However the tRNA gene sets of O. edulis, O. denselamellosa and C. virginica are identical in having 23 tRNA genes. Another important characteristic is that the rrnS gene is not duplicated in O. edulis, a feature shared with O. denselamellosa, S. mordax and C. virginica and which contrasts with Asian Crassostrea.
The phylogenetic analyses confirm the relationships between each family (Ostreidae, Mytilidae and Pectinidae), but also within each genus (Ostrea, Saccostrea and Crassostrea). Within the Ostreidae, phylogenetic analyses show that Ostrea are closer to Saccostrea than Crassostrea, although gene arrangement may show a closer relationship between Ostrea and Crassostrea, indicating that several types of information are needed to infer relationships between genome species as evolution is acting at different levels of the genomes. As many questions remain unanswered on the phylogeny of Ostreidae, especially between Ostrea and Saccostrea, it would be desirable to increase the resolution by adding samples of more taxa in order to extend molecular information among the major lineages of the Ostreidae and within the Pteriomorphia as a whole.

PCR amplification and DNA sequencing
Adductor muscle from three O. edulis collected in Quiberon Bay (Bretagne, France) was used in this study. Total genomic DNA was extracted using a Wizard® DNA Clean-up System (Promega). The [...] for 30 sec and 72°C for 2 min; and finally a step of 72°C for 10 min. PCR products were verified by electrophoresis (1% agarose gel) and purified using Montage® PCR Centrifugal Filter Devices (Millipore). Purified products were then used directly as templates in cycle sequencing reactions with dye-labeled terminators (Big Dye 3.1, Applied Biosystems). Specific primers were designed and used for primer walking sequencing, which was performed for both strands of each sample on an ABI 3130XL/Genetic Analyser (ABI).

Sequence analysis and gene annotation
During the processing of large fragments and those from primer walking sequencing, regular and manual examinations were used to ensure there was reliable overlapping and correct genome assembly.
Protein-coding and ribosomal RNA genes were first identified using BLAST [33] searches at GenBank, and then by alignment with previously published mt genomes from species of Crassostrea, Saccostrea and other closely-related molluscs. Amino-acid sequences of protein-coding genes were inferred with ORF Finder [34] using the invertebrate mitochondrial genetic code. Transfer RNAs were identified using DOGMA [35] (http://dogma.ccbb.utexas.edu/) and tRNAscan-SE [36] (http://selab.janelia.org/tRNAscan-SE/) using the mito/chloroplast genetic code and default search mode, or setting the cove cutoff score to 1 when necessary. Assembly of the genome and gene map of the mitochondrial genome of Ostrea edulis was performed using CLC Main Workbench (CLC bio).
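To illustrate what an ORF search under the invertebrate mitochondrial code involves, the sketch below scans the three forward reading frames for stretches that begin with a standard initiation codon (ATG, ATA or ATT) and end with TAA or TAG. It is a simplified stand-in, not the ORF Finder algorithm: the function name, the `min_codons` parameter and the toy sequence are hypothetical, only one strand is searched (all genes here lie on the same strand), and genes with an alternative start such as cytb (CTA) would not be detected without extending the start set.

```python
STARTS = {"ATG", "ATA", "ATT"}  # standard initiation codons listed in the text
STOPS = {"TAA", "TAG"}          # TGA is not a stop in the invertebrate mt code


def find_orfs(seq: str, min_codons: int = 2):
    """Return sorted (start, end) pairs of forward-strand ORFs.

    Coordinates are 0-based; end is exclusive and includes the stop codon.
    min_codons is the minimum number of codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon in STARTS:
                    start = i
            elif codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Hypothetical toy sequence: ATG AAA TGA TAA in frame 2; TGA is read through.
toy = "CCATGAAATGATAACC"
print(find_orfs(toy))  # [(2, 14)]
```

The toy example shows why the choice of genetic code matters: under the standard code the in-frame TGA would terminate the ORF one codon earlier.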

Phylogenetic analysis
To date, 20 Pteriomorphia mt genomes are available in GenBank [37] and we used 19 of these (excluding Argopecten irradians irradians, which is very close to Argopecten irradians: 99% similarity) in our phylogenetic analysis, together with the O. edulis mt genome obtained in this study (Table 2). The blacklip abalone Haliotis rubra (Gastropoda) was used as the outgroup. The nucleotide and amino-acid sequences from all 12 PCGs (protein-coding genes) were concatenated for each genome and

Additional material
Additional file 1: The potential secondary structures of 22 tRNAs of Ostrea edulis. The duplication of methionine is named M1 and M2 respectively. Codons recognized are shown for the pairs of leucine (L1 and L2) and serine (S1 and S2).
Additional file 2: Primers used for amplification of 4 large fragments in mitochondrial genome of Ostrea edulis.