The mixed liver and kidney transcriptome dataset of Darevskia valentini rock lizard
BMC Research Notes volume 15, Article number: 345 (2022)
This study is part of a larger project on the genomics and transcriptomics of parthenogenesis in vertebrates. Among vertebrates, obligate parthenogenesis was first described in lizards of the genus Darevskia. In this genus, all parthenogenetic species found so far originated via interspecific hybridization. It remains unknown which genetic or genomic factors play a key role in the generation of parthenogenetic organisms. Comparative genomic and transcriptomic analysis of parthenogens and their parental species may elucidate this problem. Darevskia valentini is the paternal species for four of the seven parthenogens of this genus, and we therefore consider it a particularly important species for the generation of parthenogenetic forms.
Total cellular RNA was isolated from kidney and liver tissues using the standard Trizol Tissue RNA Extraction protocol. Transcriptome libraries prepared by random fragmentation of cDNA samples were sequenced on an Illumina HiSeq2500. The raw data contained 117.6 million reads with a GC content of 47%. After preprocessing, the reads were assembled with Trinity, producing 491,482 contigs.
Hybrid speciation can be considered one of the main variants of reticulate evolution. In some cases, this phenomenon results in the formation of clonal lineages and parthenogenetic species. This study is part of a larger project on the genomics and transcriptomics of parthenogenesis in vertebrates. To date, we have carried out the first whole-genome sequencing and assembly of a trio of lizard species: the parthenogenetic Darevskia unisexualis and its parental species D. valentini and D. raddei. However, these data have not been published because the genome annotations are not yet complete. Among vertebrates, obligate parthenogenesis was first described in the rock lizards of the genus Darevskia, which includes 29 bisexual and seven unisexual (parthenogenetic) species distributed in the Caucasus region, Turkey, and Iran. In this genus, as in most known instances, all parthenogenetic species found so far originated via interspecific hybridization between closely related bisexual species. Distinctive features of the Darevskia rock lizards are the high diversity of parthenogens (seven diploid forms) and ongoing hybridization in zones of sympatry between sexual and parthenogenetic species, resulting in triploid and tetraploid hybrids, which are considered an intermediate stage of reticulate evolution. The origin of Darevskia parthenogens is phylogenetically constrained. Only four parental bisexual species are involved in the origin of the seven parthenogens: D. valentini and D. portschinskii as the paternal species and D. raddei and D. mixta as the maternal species [6, 7]. It remains unknown which genetic or genomic factors play a key role in the generation of clonally reproducing parthenogenetic organisms. Comparative genomic and transcriptomic analysis of parthenogens and their parental species may elucidate this problem.
In particular, Darevskia valentini is the paternal species for four of the seven Darevskia parthenogens, and we therefore consider it a particularly important species for the generation of parthenogenetic forms.
Samples of D. valentini for transcriptome analysis were collected in Armenia in 2016, outside of protected areas. All individuals were hand-caught. A single adult male D. valentini from the gorge population near Sepasar village (41°01′39.2″N, 43°48′58.0″E) was used for surgical extraction of organs (liver and kidneys). Before dissection, the animal was euthanized with chloroform followed by decapitation. All tissue samples were stored in RNAlater® reagent at −20 °C according to the manufacturer's recommended protocol (Qiagen Inc.) until they were shipped to Macrogen Inc. (Korea) for RNA extraction and transcriptome sequencing.
Total RNA was isolated from the organs/tissues using the standard Trizol Tissue RNA Extraction protocol and was used to prepare the cDNA library. The paired-end sequencing libraries were prepared by random fragmentation of the cDNA samples into 350–500 bp fragments, followed by 5' and 3' adapter ligation using the TruSeq RNA Sample Prep Kit v2 (Illumina Inc.) according to the TruSeq RNA Sample Preparation Guide (Version 2, Part #15026495 Rev. F). The transcriptome libraries were sequenced on an Illumina HiSeq2500 with a mean read length of 101 bp. The instrument generated raw sequencing data using HiSeq Control Software v2.2 for system control and base calling through the integrated primary analysis software. The BCL (base call) binaries were converted into FASTQ format with the Illumina package bcl2fastq v1.8.4 (RRID:SCR_015058). Raw transcriptome data were trimmed with Trimmomatic v0.39 to remove adapters and deduplicated with the rmdup tool [9, 10] (Data set 1). The quality-filtered reads were assembled using Trinity v2.1.1 with the default minimum contig length of 200 and k-mer size of 25. Summary statistics of the raw samples, reads, and assembly can be accessed in Data file 1. The assembly contained 491,482 contigs with a median contig length of 923 bp (Data file 2).
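Summary figures of the kind reported above (contig count, median contig length, GC content) can be recomputed from an assembly FASTA file. The following Python sketch is illustrative only and is not the procedure used to produce Data file 1; the function names `read_fasta` and `assembly_stats` are hypothetical, the N50 value is an extra statistic not reported in the text, and the `min_len` default of 200 simply mirrors the minimum contig length stated above.

```python
from statistics import median


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(records, min_len=200):
    """Summarize contigs of at least min_len bases (hypothetical helper)."""
    lengths = []
    gc = total = 0
    for _, seq in records:
        if len(seq) < min_len:
            continue  # skip contigs below the length cutoff
        seq = seq.upper()
        lengths.append(len(seq))
        gc += seq.count("G") + seq.count("C")
        total += len(seq)
    if not lengths:
        return {"contigs": 0, "median_length": 0, "n50": 0, "gc_percent": 0.0}
    # N50: length of the contig at which the running sum of lengths,
    # taken from longest to shortest, first reaches half the total.
    n50 = running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "median_length": median(lengths),
        "n50": n50,
        "gc_percent": 100.0 * gc / total,
    }
```

For a real assembly, the records would come from an open file handle, for example `assembly_stats(read_fasta(open(path)))`, where `path` points to the Trinity output FASTA.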
Annotation was performed using the TransPi v1.1.0-rc pipeline in OnlyAnn (annotation-only) mode. This mode includes tools such as TransDecoder and Trinotate. TransDecoder was used to predict translated proteins (Data file 3). EggNOG v2.0.1 was used to cross-reference protein sequences with the Gene Ontology database. BLASTp, PFAM, and EggNOG searches identified 26,812, 6496, and 15,399 proteins, respectively (Data file 4). The most significant Gene Ontology terms were identified and visualized with Trinotate (Data file 5). In the cellular component ontology, the nucleus and cytoplasm dominated. Regulation of transcription by RNA polymerase II was the most over-represented category in biological processes. In molecular functions, most of the enriched genes were related to metal ion and ATP binding. The top BLAST hit species, full statistics of GO terms and ORF predictions, and the full Trinotate annotation are also provided (Data file 6).
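The ranking of Gene Ontology terms described above can be illustrated with a short Python sketch. It is not the Trinotate routine used for Data file 5. It assumes that each transcript carries a GO annotation string in which entries are separated by backticks and each entry has the form `GO:ID^namespace^term name`; this encoding, the `.` placeholder for unannotated transcripts, and the function name `top_go_terms` are assumptions made for the example.

```python
from collections import Counter


def top_go_terms(go_fields, n=3):
    """Count GO terms per namespace across transcripts (hypothetical helper).

    go_fields: iterable of annotation strings, one per transcript, in the
    assumed form "GO:ID^namespace^name`GO:ID^namespace^name".
    Returns {namespace: [(term name, transcript count), ...]} with the n
    most frequent terms in each namespace.
    """
    counts = {}
    for field in go_fields:
        if not field or field == ".":
            continue  # transcript without GO annotation
        seen = set()
        for entry in field.split("`"):
            parts = entry.split("^")
            if len(parts) != 3:
                continue  # skip malformed entries
            go_id, namespace, name = parts
            if go_id in seen:
                continue  # count each term once per transcript
            seen.add(go_id)
            counts.setdefault(namespace, Counter())[name] += 1
    return {ns: counter.most_common(n) for ns, counter in counts.items()}
```

Counting each term once per transcript keeps a single heavily annotated contig from inflating a category; counting per gene instead would require an additional transcript-to-gene mapping.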
Our transcriptome data can be used for annotation or verification of protein-coding genes in the genome of D. valentini and related lizard species. The main limitation is the restricted number of tissues (only liver and kidney) used to generate the mixed transcriptome.
cDNA: complementary deoxyribonucleic acid
BCL: binary base calls
ORF: open reading frame
Dobzhansky T. Genetics and the origin of species. New York: Columbia Univ. Press; 1937.
Darevskii IS. Rock lizards of the Caucasus: systematics, ecology, and phylogenesis of the polymorphic groups of Caucasian rock lizards of the subgenus Archaeolacerta. Nauka. 1967;:1–216.
Uetz P, Freed P, Hošek J, et al. THE REPTILE DATABASE. http://www.reptile-database.org/. Accessed 3 Apr 2021.
Neaves WB, Baumann P. Unisexual reproduction among vertebrates. Trends Genet. 2011;27:81–8. doi:https://doi.org/10.1016/j.tig.2010.12.002.
Danielayn F, Arakelyan M, Stepanyan I. The progress of microevolution in hybrids of rock lizards of genus Darevskia. Biol J Armen. 2008;60:147–56.
Murphy RW, Fu J, Macculloch RD, Darevsky IS, Kupriyanova LA. A fine line between sex and unisexuality: The phylogenetic constraints on parthenogenesis in lacertid lizards. Zool J Linn Soc. 2000;130:527–49.
Fu J, Murphy RW, Darevsky IS. Toward the phylogeny of caucasian rock lizards: implications from mitochondrial DNA gene sequences (Reptilia: Lacertidae). Zool J Linn Soc. 1997;120:463–77. doi:https://doi.org/10.1111/j.1096-3642.1997.tb01283.x.
bcl2fastq Conversion Software. https://support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html. Accessed 11 Apr 2022.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20. doi:https://doi.org/10.1093/bioinformatics/btu170.
aglabx/rmdup. Removes optical duplicates from raw Illumina sequence reads, GitHub. (n.d.). https://github.com/aglabx/rmdup. Accessed 11 Apr 2022.
RNA-seq of D. valentini: adult male mixed liver and kidneys. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX14421363 (2022).
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52. doi:https://doi.org/10.1038/nbt.1883.
Ryakhovsky S. Summary of raw RNA and assembly characteristics. figshare. 2022. https://doi.org/10.6084/m9.figshare.17762030.
Ryakhovsky S. De novo assembly by Trinity. NCBI Transcriptome Shotgun Assembly Sequence Database https://identifiers.org/nucleotide:GJZU00000000.1 (2022).
Rivera-Vicéns RE, Garcia-Escudero CA, Conci N, Eitel M, Wörheide G. TransPi – a comprehensive TRanscriptome ANalysiS PIpeline for de novo transcriptome assembly. bioRxiv. 2021;:2021.02.18.431773. doi:https://doi.org/10.1101/2021.02.18.431773.
Ryakhovsky SS, Dikaya VA, Korchagin VI, Vergun AA, Danilov LG, Ochkalova SD, et al. De novo transcriptome assembly and annotation of parthenogenetic lizard Darevskia unisexualis and its parental ancestors Darevskia valentini and Darevskia raddei nairensis. Data Br. 2021;39:107685.
Ryakhovsky S. TransDecoder peptides. figshare https://doi.org/10.6084/m9.figshare.17696930 (2022).
Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol Biol Evol. 2021;38:5825–9. doi:https://doi.org/10.1093/MOLBEV/MSAB293.
Ryakhovsky S. BLASTp, PFAM, EggNOG proteins. figshare https://doi.org/10.6084/m9.figshare.17696915.v2 (2022).
Ryakhovsky S. Top GO terms. figshare https://doi.org/10.6084/m9.figshare.17696939 (2022).
Ryakhovsky S. Summary of Trinotate, TransDecoder and top blasted species. figshare https://doi.org/10.6084/m9.figshare.17696951 (2022).
RNA characterization experiments were performed using the Center for Precision Genome Editing and Genetic Technologies for Biomedicine, IGB RAS.
This research was funded by the Russian Science Foundation (RSF) Research Project № 19-14-00083.
Ethics approval and consent to participate
The study was approved by the Ethics Committee of the Moscow State University (Permit Number: 24–01) and conducted strictly according to ethical principles and scientific standards. Live-animal handling procedures were approved by Yerevan State University according to the ethical guidelines. Capture permit Code 5/22.1/51043 was issued by the Ministry of Nature Protection of the Republic of Armenia for scientific studies.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ryakhovsky, S.S., Zhernakova, D.V., Korchagin, V.I. et al. The mixed liver and kidney transcriptome dataset of Darevskia valentini rock lizard. BMC Res Notes 15, 345 (2022). https://doi.org/10.1186/s13104-022-06228-4