The mixed liver and kidney transcriptome dataset of Darevskia valentini rock lizard



This study is part of a larger project on the genomics and transcriptomics of parthenogenesis in vertebrates. Among vertebrates, obligate parthenogenesis was first described in lizards of the genus Darevskia. All known parthenogenetic species in this genus originated via interspecific hybridization. It remains unknown which genetic or genomic factors play a key role in the generation of parthenogenetic organisms. Comparative genomic and transcriptomic analysis of parthenogens and their parental species may help resolve this question. Darevskia valentini is the paternal species for four of the seven parthenogens of this genus, which is why we regard it as particularly important for the generation of parthenogenetic forms.

Data description

Total cellular RNA was isolated from kidney and liver tissues using the standard Trizol Tissue RNA Extraction protocol. Transcriptome libraries, prepared by random fragmentation of cDNA samples, were sequenced on an Illumina HiSeq2500. The raw data contained 117.6 million reads with a GC content of 47%. After preprocessing, the reads were assembled with Trinity, producing 491,482 contigs.
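The read count and GC content quoted above are simple tallies over the raw FASTQ records. The following is a minimal sketch of how such figures could be computed, assuming plain four-line FASTQ records; the function name `fastq_read_stats` and the toy reads are illustrative and are not part of the authors' pipeline.

```python
from typing import Iterable, Tuple


def fastq_read_stats(lines: Iterable[str]) -> Tuple[int, float]:
    """Return (number of reads, GC fraction) for four-line FASTQ records."""
    n_reads = 0
    gc_bases = 0
    called_bases = 0
    for index, line in enumerate(lines):
        if index % 4 != 1:  # the sequence is the second line of each record
            continue
        seq = line.strip().upper()
        n_reads += 1
        gc_bases += seq.count("G") + seq.count("C")
        called_bases += len(seq) - seq.count("N")  # skip uncalled bases
    gc_fraction = gc_bases / called_bases if called_bases else 0.0
    return n_reads, gc_fraction


# Toy input (hypothetical reads, not taken from the dataset).
example = [
    "@read1", "GGCCAT", "+", "IIIIII",
    "@read2", "ATATGC", "+", "IIIIII",
]
reads, gc_fraction = fastq_read_stats(example)
print(reads, round(gc_fraction, 3))
```

For a real run, the same function could be fed an open (optionally gzip-decompressed) text file handle instead of a list.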


Hybrid speciation can be considered one of the main variants of reticulate evolution [1]. In some cases, this phenomenon results in the formation of clonal lineages and parthenogenetic species. This study is part of a larger project on the genomics and transcriptomics of parthenogenesis in vertebrates. So far, we have carried out the first whole-genome sequencing and assembly of a trio of lizard species: the parthenogenetic Darevskia unisexualis and its parental species D. valentini and D. raddei. These data have not yet been published because the genome annotations are not complete. Among vertebrates, obligate parthenogenesis was first described in the rock lizards of the genus Darevskia [2], which includes 29 bisexual and seven unisexual (parthenogenetic) species distributed in the Caucasus region, Turkey, and Iran [3]. In this genus, as in most known instances, all known parthenogenetic species originated via interspecific hybridization between closely related bisexual species [4]. Distinctive features of the Darevskia rock lizards are the high diversity of parthenogens (seven diploid forms) and ongoing hybridization in zones of sympatry between sexual and parthenogenetic species, which produces triploid and tetraploid hybrids that are considered an intermediate stage of reticulate evolution [5]. The origin of Darevskia parthenogens is phylogenetically constrained [6]. Only four parental bisexual species are involved in the origin of the seven parthenogens: D. valentini and D. portschinskii as the paternal species and D. raddei and D. mixta as the maternal species [6, 7]. It remains unknown which genetic or genomic factors play a key role in the generation of clonally reproducing parthenogenetic organisms. Comparative genomic and transcriptomic analysis of parthenogens and their parental species may help resolve this question. In particular, Darevskia valentini is the paternal species for four of the seven Darevskia parthenogens, which is why we regard it as particularly important for the generation of parthenogenetic forms.

Data description

Samples of D. valentini for transcriptome analysis were collected in Armenia in 2016, outside protected areas. All individuals were hand-caught. A single adult male D. valentini from the gorge population near the village of Sepasar (41°01’39.2"N, 43°48’58.0"E) was used for surgical extraction of organs (liver and kidneys). Before dissection, the animal was euthanized with chloroform and then decapitated. All tissue samples were stored in RNAlater® reagent at −20 °C according to the manufacturer’s recommended protocol (Qiagen Inc.) until they were shipped to Macrogen Inc. (Korea) for RNA extraction and transcriptome sequencing.

Table 1 Overview of data files/datasets

Total RNA was isolated from the organs using the standard Trizol Tissue RNA Extraction protocol and was used to prepare the cDNA library. The paired-end sequencing libraries were prepared by random fragmentation of the cDNA samples into 350–500 bp fragments, followed by 5’ and 3’ adapter ligation using the TruSeq RNA Sample Prep Kit v2 (Illumina Inc.) according to the TruSeq RNA Sample Preparation Guide (Version 2, Part #15026495 Rev. F). Sequencing of the transcriptome libraries was performed on an Illumina HiSeq2500 with a mean read length of 101 bp. The instrument generated raw sequencing data using HiSeq Control Software v2.2 for system control and base calling through integrated primary analysis software. The BCL (base call) binaries were converted into FASTQ format with the Illumina package bcl2fastq v1.8.4 [8] (RRID:SCR_015058). Raw transcriptome data were trimmed with Trimmomatic v0.39 to remove adapters and deduplicated with the rmdup tool [9, 10] (Data set 1) [11]. The quality-filtered reads were assembled using Trinity v2.1.1 [12] with the default minimum contig length and k-mer size of 200 and 25, respectively. Summary statistics of the raw samples, reads, and assembly are available in Data file 1 [13]. The assembly contained 491,482 contigs with a median contig length of 923 bp (Data file 2) [14].
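Assembly summaries such as the contig count and median contig length can be recomputed directly from the assembled FASTA file. The sketch below assumes a standard multi-line FASTA layout; the function names and the toy contigs are hypothetical, and the `min_length` default of 200 simply mirrors the minimum contig length stated above rather than reproducing Trinity's own reporting.

```python
from statistics import median
from typing import Dict, Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Collect the length of every sequence in a (multi-line) FASTA stream."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before first header
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_summary(fasta_lines: Iterable[str], min_length: int = 200) -> Dict[str, float]:
    """Summarize contigs that are at least min_length bases long."""
    kept = [n for n in contig_lengths(fasta_lines) if n >= min_length]
    return {
        "contigs": len(kept),
        "median_length": median(kept) if kept else 0,
        "total_bp": sum(kept),
    }


# Toy assembly (hypothetical contigs); min_length is lowered so the
# short example sequences are not all filtered out.
toy_fasta = [
    ">c1", "ACGTACGT",
    ">c2", "ACGT", "AC",
    ">c3", "AC",
    ">c4", "ACGTACGTACGT",
]
summary = assembly_summary(toy_fasta, min_length=3)
print(summary)
```

Passing an open file handle for the assembly FASTA in place of `toy_fasta` would give the corresponding figures for a real assembly.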

The annotation was performed using the TransPi v1.1.0-rc pipeline [15] in OnlyAnn (annotation-only) mode [16]. This mode includes tools such as TransDecoder and Trinotate. TransDecoder was used to predict translated proteins (Data file 3) [17]. EggNOG v2.0.1 [18] was used to map protein sequences to the Gene Ontology database. BLASTp, PFAM, and EggNOG searches identified 26,812, 6,496, and 15,399 proteins, respectively (Data file 4) [19]. The most significant Gene Ontology terms were identified and visualized with Trinotate (Data file 5) [20]. In the cellular component ontology, the nucleus and cytoplasm dominated. Regulation of transcription by RNA polymerase II was the most over-represented category among biological processes. Among molecular functions, most of the enriched genes were related to metal ion and ATP binding. Data on the top BLAST hit species and full statistics on GO terms, ORF prediction numbers, and the complete Trinotate annotation are also provided (Data file 6) [21].
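Ranking GO terms within each ontology, as described above, amounts to tallying term assignments per namespace. The sketch below is illustrative only: it assumes that each protein's GO assignments arrive as one string in which entries are separated by backticks and each entry has the form `GO:id^namespace^term name`, with `.` marking proteins without assignments. That layout, the function name `tally_go_terms`, and the toy records are assumptions, not a description of the authors' files.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def tally_go_terms(go_fields: Iterable[str]) -> Dict[str, Counter]:
    """Count, per GO namespace, how many proteins carry each term."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for field in go_fields:
        if not field or field == ".":  # assumed marker for "no annotation"
            continue
        seen = set()  # count each GO id at most once per protein
        for entry in field.split("`"):
            parts = entry.split("^")
            if len(parts) != 3:
                continue  # skip entries that do not match the assumed layout
            go_id, namespace, name = parts
            if go_id in seen:
                continue
            seen.add(go_id)
            counts[namespace][name] += 1
    return dict(counts)


# Toy annotation strings (hypothetical proteins).
toy_fields = [
    "GO:0005634^cellular_component^nucleus`GO:0005524^molecular_function^ATP binding",
    "GO:0005634^cellular_component^nucleus`GO:0005737^cellular_component^cytoplasm",
    ".",
]
go_counts = tally_go_terms(toy_fields)
for namespace, counter in sorted(go_counts.items()):
    print(namespace, counter.most_common(2))
```

Counting each term once per protein keeps a term from being inflated when the same assignment is reported by more than one search tool for the same sequence.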


Although our transcriptome data can be used for annotation or verification of protein-coding genes in the genome of D. valentini and related lizard species, they are limited by the small number of tissues (only liver and kidney) used to generate the mixed transcriptome.

Availability of data and materials

The raw data described in this Data note can be freely and openly accessed on the NCBI SRA database under accession ID SRX14421363. Please see Table 1 for details and links to the rest of the data [11, 13, 14, 17, 19, 20, 21].



cDNA: complementary deoxyribonucleic acid
RNA: ribonucleic acid
BCL: binary base calls
bp: base pair
GO: Gene Ontology
ORF: open reading frame
ATP: adenosine triphosphate


  1. Dobzhansky T. Genetics and the origin of species. New York: Columbia Univ. Press; 1937.


  2. Darevskii IS. Rock lizards of the Caucasus: systematics, ecology, and phylogenesis of the polymorphic groups of Caucasian rock lizards of the subgenus Archaeolacerta. Nauka. 1967:1–216.

  3. Uetz P, Freed P, Hošek J, et al. THE REPTILE DATABASE. Accessed 3 Apr 2021.

  4. Neaves WB, Baumann P. Unisexual reproduction among vertebrates. Trends Genet. 2011;27:81–8. doi:


  5. Danielayn F, Arakelyan M, Stepanyan I. The progress of microevolution in hybrids of rock lizards of genus Darevskia. Biol J Armen. 2008;60:147–56.


  6. Murphy RW, Fu J, Macculloch RD, Darevsky IS, Kupriyanova LA. A fine line between sex and unisexuality: The phylogenetic constraints on parthenogenesis in lacertid lizards. Zool J Linn Soc. 2000;130:527–49.


  7. Fu J, Murphy RW, Darevsky IS. Toward the phylogeny of caucasian rock lizards: implications from mitochondrial DNA gene sequences (Reptilia: Lacertidae). Zool J Linn Soc. 1997;120:463–77. doi:


  8. bcl2fastq Conversion Software. Accessed 11 Apr 2022.

  9. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20. doi:


  10. aglabx/rmdup. Removes optical duplicates from raw Illumina sequence reads, GitHub. (n.d.). Accessed 11 Apr 2022.

  11. RNA-seq of D. valentini: adult male mixed liver and kidneys. NCBI Sequence Read Archive (2022).

  12. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52. doi:


  13. Ryakhovsky S. Summary of raw RNA and assembly characteristics. figshare. 2022.

  14. Ryakhovsky S. De novo assembly by Trinity. NCBI Transcriptome Shotgun Assembly Sequence Database (2022).

  15. Rivera-Vicéns RE, Garcia-Escudero CA, Conci N, Eitel M, Wörheide G. TransPi – a comprehensive TRanscriptome ANalysiS PIpeline for de novo transcriptome assembly. bioRxiv. 2021:2021.02.18.431773. doi:

  16. Ryakhovsky SS, Dikaya VA, Korchagin VI, Vergun AA, Danilov LG, Ochkalova SD, et al. De novo transcriptome assembly and annotation of parthenogenetic lizard Darevskia unisexualis and its parental ancestors Darevskia valentini and Darevskia raddei nairensis. Data Br. 2021;39:107685.


  17. Ryakhovsky S. TransDecoder peptides. figshare (2022).

  18. Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol Biol Evol. 2021;38:5825–9. doi:


  19. Ryakhovsky S. BLASTp, PFAM, EggNOG proteins. figshare (2022).

  20. Ryakhovsky S. Top GO terms. figshare (2022).

  21. Ryakhovsky S. Summary of Trinotate, TransDecoder and top blasted species. figshare (2022).


RNA characterization experiments were performed using the Center for Precision Genome Editing and Genetic Technologies for Biomedicine, IGB RAS.


This research was funded by the Russian Science Foundation (RSF) Research Project № 19-14-00083.

Author information


SR, DZ, VD performed the assembly, analysis, and interpretation of the raw sequenced data. VK, AG, AV designed the sampling methods. SR, AV, AR wrote the manuscript. MA collected the samples. AR, AK designed the study. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Sergei S. Ryakhovsky.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the Ethics Committee of Moscow State University (Permit Number: 24–01) and conducted strictly according to ethical principles and scientific standards. Live-animal handling procedures were approved by Yerevan State University according to the ethical guidelines. Capture permit Code 5/22.1/51043 was issued by the Ministry of Nature Protection of the Republic of Armenia for scientific studies.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Cite this article

Ryakhovsky, S.S., Zhernakova, D.V., Korchagin, V.I. et al. The mixed liver and kidney transcriptome dataset of Darevskia valentini rock lizard. BMC Res Notes 15, 345 (2022).


  • Caucasian rock lizards
  • genus Darevskia
  • interspecific hybridization
  • parthenogenesis
  • Darevskia valentini
  • transcriptome assembly