Dataset of the de novo assembly and annotation of the marbled crayfish and the noble crayfish hepatopancreas transcriptomes
BMC Research Notes volume 15, Article number: 281 (2022)
Crayfish plague disease, caused by the oomycete pathogen Aphanomyces astaci, represents one of the greatest risks to the biodiversity of freshwater crayfish. This data article covers the de novo transcriptome assembly and annotation data of the noble crayfish and the marbled crayfish challenged with A. astaci. Following the controlled infection experiment (Francesconi et al. in Front Ecol Evol, 2021, https://doi.org/10.3389/fevo.2021.647037), we conducted a differential gene expression analysis, described in Boštjančić et al. (BMC Genom, 2022, https://doi.org/10.1186/s12864-022-08571-z).
In total, 25 noble crayfish and 30 marbled crayfish were selected. Hepatopancreas tissue was isolated, followed by RNA sequencing on the Illumina NovaSeq 6000 platform. Raw data were checked for quality with FastQC; adapter and quality trimming were conducted using Trimmomatic, followed by de novo assembly with Trinity. Assembly quality was assessed with BUSCO, at 93.30% and 93.98% completeness for the noble crayfish and the marbled crayfish, respectively. Transcripts were annotated using the Dammit! pipeline and assigned to KEGG pathways. The respective transcriptome assemblies and raw datasets can be reused in future expression studies, with the assemblies serving as reference transcriptomes.
Freshwater crayfish are keystone species of freshwater habitats [1,2,3]. One of the major contributors to the loss of European freshwater crayfish biodiversity is the introduction of highly competitive North American invasive crayfish species, which carry the devastating crayfish plague disease. This disease is caused by the oomycete pathogen Aphanomyces astaci. The noble crayfish, an endangered emblematic species of European freshwaters, is considered highly susceptible to the pathogen. In contrast, the marbled crayfish, a parthenogenetic species of North American origin, is a known carrier of this pathogen. In the controlled infection experiment described by Francesconi et al. (2021), the marbled crayfish was shown to be highly resistant to two A. astaci strains of differing virulence: a Haplogroup B strain (Hap B; high virulence) and a Haplogroup A strain (Hap A; low virulence). In the same experimental setup, the susceptibility of the noble crayfish, especially to the lethal Hap B strain, was confirmed. During the experiment, individuals of both species were sampled at 3 and 21 days post infection (dpi) for the analysis of gene expression patterns in infected individuals. The results of that analysis are presented in Boštjančić et al. (2022).
Here, we report a large collection of RNA sequencing data (55 samples) from the hepatopancreas of the noble crayfish and the marbled crayfish, together with their de novo assembled and annotated transcriptomes. These data can provide insight into the biology of the two species and will allow for future comparative transcriptomic analyses. The datasets presented here can also serve as reference transcriptomes for future transcriptomic studies in the marbled crayfish and the noble crayfish, and for the development of gene-specific primers and expression assays. The dataset from noble crayfish and marbled crayfish infected with A. astaci may be of interest to molecular biologists, immunologists, bioinformaticians, evolutionary biologists and others interested in the innate immunity of freshwater crayfish.
The data reported here represent an RNA sequencing dataset from A. astaci-infected noble crayfish and marbled crayfish individuals. Each sample represents a biological replicate originating from a different individual. A total of 2430.7 million and 3098.2 million 2 × 150 bp paired-end reads (read depth: 36.8 M to 68.9 M, mean: 48.59 M) were generated from the hepatopancreas of the noble crayfish and the marbled crayfish, respectively. After removal of low-quality reads, a total of 2227.6 million (91.64% of the initial raw reads) and 2926.8 million (94.46% of the initial raw reads) high-quality sequences were retained for the noble crayfish and the marbled crayfish, respectively. Raw read data are available in the NCBI Sequence Read Archive under accession number SRP318523.
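The retention percentages follow directly from the reported read totals. As a quick sanity check, the minimal Python sketch below recomputes them from the figures given in the text. The function name `retention_percent` is illustrative and not part of the original workflow. Because the totals are themselves rounded to one decimal place, the recomputed values should agree with the reported percentages only to within about 0.01 percentage points.

```python
def retention_percent(raw_millions: float, retained_millions: float) -> float:
    """Share of raw reads kept after quality processing, in percent."""
    return 100.0 * retained_millions / raw_millions


# Read totals (in millions) as reported in the text: (raw, retained).
reads = {
    "noble crayfish": (2430.7, 2227.6),
    "marbled crayfish": (3098.2, 2926.8),
}

for species, (raw, kept) in reads.items():
    print(f"{species}: {retention_percent(raw, kept):.2f}% of raw reads retained")
```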
De novo transcriptome assembly
From the pooled Trinity de novo transcriptome assembly we obtained 670,741 transcripts for the noble crayfish (44,062 ORFs) and 11,333,173 transcripts (46,953 ORFs) for the marbled crayfish. In the post-assembly processing, after filtering fragmented transcripts, 168,172 (44,062 ORFs) and 348,751 (46,953 ORFs) transcripts remained for the noble crayfish and the marbled crayfish, respectively. After redundancy reduction with CD-HIT-EST, 109,608 genes and 254,336 genes remained for the noble crayfish and the marbled crayfish, respectively. BUSCO analysis of the final assemblies against the arthropoda_odb10 database of orthologs (n = 1013) revealed a high level of completeness for both: 93.30% for the noble crayfish and 93.98% for the marbled crayfish. Comparative analysis of BUSCO scores among available freshwater crayfish transcriptomes placed the noble crayfish and the marbled crayfish assemblies as the most complete freshwater crayfish transcriptome assemblies to date. The length of assembled transcripts ranged from 401 to 32,629 bp in the noble crayfish and from 401 to 32,816 bp in the marbled crayfish, with the highest number of transcripts falling in the 401–500 bp category for both species. Simple sequence repeat (SSR) unit lengths ranged from 1 to 12 bp, with 1 bp SSRs being the most abundant in the noble crayfish assembly and 2 bp SSRs in the marbled crayfish assembly.
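A transcript length distribution of this kind can be tabulated directly from an assembly FASTA file. The sketch below is a minimal illustration, not the authors' procedure: the helper names `read_fasta_lengths` and `length_bins` are hypothetical, and the binning scheme (100 bp bins starting at 401 bp) is an assumption inferred from the "401–500 bp" category mentioned in the text. The input is toy data, not sequences from the dataset.

```python
from collections import Counter
from io import StringIO


def read_fasta_lengths(handle):
    """Yield the sequence length of each record in a FASTA stream."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length  # finish the previous record
            length = 0
        elif line and length is not None:
            length += len(line)  # sequences may span several lines
    if length is not None:
        yield length


def length_bins(lengths, bin_width=100, min_len=401):
    """Count lengths per bin of bin_width bp; the first bin starts at min_len."""
    counts = Counter()
    for n in lengths:
        if n < min_len:
            continue  # below the assumed minimum transcript length
        start = min_len + ((n - min_len) // bin_width) * bin_width
        counts[(start, start + bin_width - 1)] += 1
    return dict(sorted(counts.items()))


# Toy FASTA standing in for an assembly file (assumption: not real data).
toy = StringIO(
    ">t1\n" + "A" * 450 + "\n"
    ">t2\n" + "C" * 300 + "\n" + "G" * 380 + "\n"
    ">t3\n" + "T" * 499 + "\n"
)
for (low, high), n in length_bins(read_fasta_lengths(toy)).items():
    print(f"{low}-{high} bp: {n}")
```

For a real assembly, the `StringIO` object would be replaced by an open file handle; the generator reads one line at a time, so memory use stays small even for assemblies with hundreds of thousands of transcripts.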
Gene model building using TransDecoder predicted 67,196 and 102,871 coding regions for the noble crayfish and the marbled crayfish, respectively. In total, 46,819 (69.7%) and 74,321 (72.2%) of the transcripts with predicted coding regions were annotated within the Dammit! pipeline when combining hits of all searches for the noble crayfish and the marbled crayfish, respectively. Annotation features include putative nucleotide and protein matches in OrthoDB, Pfam, UniRef90 and Rfam, and in the reference Daphnia pulex proteome.
As an additional approach to functional annotation, transcripts were mapped to the reference canonical KEGG database. For the noble crayfish, 13,336 transcripts were mapped across 426 pathways, and for the marbled crayfish, 17,309 transcripts were mapped across 425 pathways. Among the represented pathways, the highest numbers of transcripts in both assemblies were annotated to metabolic pathways, biosynthesis of secondary metabolites, microbial metabolism in diverse environments and pathways of neurodegeneration. A detailed methodological protocol is available.
Transcriptomic data allowed us to explore the gene expression landscape and identify key genes in crayfish immunity. However, information about genomic locations and gene surroundings, which strongly influence gene expression profiles, is still not available. The quality of the transcriptomes could be improved in future work by coupling these data with long-read sequencing data to identify splice variants expressed under different experimental conditions. Furthermore, transcriptomic studies cannot address actual protein abundances, as changes in gene expression profiles are not always correlated with changes in protein abundances.
BUSCO: Benchmarking sets of Universal Single-Copy Orthologs
dpi: Days post infection
GEO: Gene Expression Omnibus
Hap A: Haplogroup A
Hap B: Haplogroup B
KEGG: Kyoto Encyclopedia of Genes and Genomes
NCBI: National Center for Biotechnology Information
ORFs: Open reading frames
Pfam: Protein family database
Rfam: RNA family database
SSRs: Simple sequence repeats
UniRef: UniProt Reference Clusters
Francesconi C, Makkonen J, Schrimpf A, Jussila J, Kokko H, Theissinger K. Controlled infection experiment with Aphanomyces astaci provides additional evidence for latent infections and resistance in freshwater crayfish. Front Ecol Evol. 2021. https://doi.org/10.3389/fevo.2021.647037.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. Host-pathogen coevolution drives innate immune response to Aphanomyces astaci infection in freshwater crayfish: transcriptomic evidence. BMC Genom. 2022. https://doi.org/10.1186/s12864-022-08571-z.
Reynolds J, Souty-Grosset C, Richardson A. Ecological roles of crayfish in freshwater and terrestrial habitats. Freshw Crayfish. 2013;19:197–218.
Holdich DM, Reynolds JD, Souty-Grosset C, Sibley PJ. A review of the ever increasing threat to European crayfish from non-indigenous crayfish species. Knowl Manag Aquat Ecosyst. 2009. https://doi.org/10.1051/kmae/2009025.
Alderman DJ. Geographical spread of bacterial and fungal diseases of crustaceans. Rev Sci Tech l’OIE. 1996;15:603–32. https://doi.org/10.20506/rst.15.2.943.
Becking T, Mrugała A, Delaunay C, Svoboda J, Raimond M, Viljamaa-Dirks S, et al. Effect of experimental exposure to differently virulent Aphanomyces astaci strains on the immune response of the noble crayfish Astacus astacus. J Invertebr Pathol. 2015;132:115–24. https://doi.org/10.1016/j.jip.2015.08.007.
Keller NS, Pfeiffer M, Roessink I, Schulz R, Schrimpf A. First evidence of crayfish plague agent in populations of the marbled crayfish (Procambarus fallax forma virginalis). Knowl Manag Aquat Ecosyst. 2014. https://doi.org/10.1051/kmae/2014032.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. RNA-seq of Astacus astacus: adult hepatopancreas and RNA-seq of Procambarus virginalis: adult hepatopancreas. 2022; NCBI Sequence Read Archive: https://identifiers.org/insdc.sra:SRP318523.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. Bostjancic_et_al_Data_set_2_Data_note. 2022; Figshare: https://doi.org/10.6084/m9.figshare.15029001.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. TSA: Astacus astacus, transcriptome shotgun assembly. 2022; NCBI TSA: https://identifiers.org/nucleotide:GJEB00000000.1.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. TSA: Procambarus virginalis, transcriptome shotgun assembly. 2022; NCBI TSA: https://identifiers.org/nucleotide:GJEC00000000.1.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. Bostjancic_et_al_Data_set_5_Data_note.tif. 2022; Figshare: https://doi.org/10.6084/m9.figshare.15028644.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. Bostjancic_et_al_Data_set_6_Data_note.tif. 2022; Figshare: https://doi.org/10.6084/m9.figshare.15031779.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. Bostjancic_et_al_Data_set_7_Data_note.tif. 2022; Figshare: https://doi.org/10.6084/m9.figshare.15031773.
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, et al. Bostjancic_et_al_Data_set_8_Data_note.tif. 2022; Figshare: https://doi.org/10.6084/m9.figshare.15031776.
We thank the BIGEst platform for informatics support.
The authors would like to express their gratitude to Dr. Clement Schneider and Alexandra Schmidt for their helpful suggestions. We would also like to acknowledge the support from Jorg Rapp in the server administration.
This work was supported by the IdEx Unistra in the framework of the “Investments for the future” program of the French government and by Institute funds from the Centre National de la Recherche Scientifique and the Université de Strasbourg. K.T. and M.B. received seed funding for RNA sequencing from the LOEWE Center for Translational Biodiversity Genomics (TBG).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.
Cite this article
Boštjančić, L.L., Francesconi, C., Rutz, C. et al. Dataset of the de novo assembly and annotation of the marbled crayfish and the noble crayfish hepatopancreas transcriptomes. BMC Res Notes 15, 281 (2022). https://doi.org/10.1186/s13104-022-06137-6