Skip to main content
  • Research note
  • Open access
  • Published:

RNA sequencing of kidney and liver transcriptome obtained from wild cynomolgus macaque (Macaca fascicularis) originating from Peninsular Malaysia



Using high-throughput RNA sequencing technology, this study aimed to sequence the transcriptome of kidney and liver tissues harvested from Peninsular Malaysia cynomolgus macaque (Macaca fascicularis). M. fascicularis are significant nonhuman primate models in the biomedical field, owing to the macaque’s biological similarities with humans. The additional transcriptomic dataset will supplement the previously described Peninsular Malaysia M. fascicularis transcriptomes obtained in a past endeavour.


A total of 75,350,240 sequence reads were obtained via Hi-seq 2500 sequencing technology. A total of 5473 significant differentially expressed genes were called. Gene ontology functional categorisation showed that cellular process, catalytic activity, and cell part categories had the highest number of expressed genes, while the metabolic pathways category possessed the highest number of expressed genes in the KEGG pathway analysis. The additional sequence dataset will further enrich existing M. fascicularis transcriptome assemblies, and provide a dataset for further downstream studies.


Cynomolgus macaques (Macaca fascicularis) are nonhuman primate (NHP) models significant to biomedicine due to their close evolutionary relationship with humans. The cynomolgus macaque’s recapitulation of human physiology, genetics, and behaviour is advantageous as translational models for various studies in the biomedical field, including drug development and safety testing [1, 2]. Cynomolgus macaque individuals from different geographical locations have been shown to exhibit genetic characteristics that varies between geographical origins [3]. Therefore, it is vital that the genomes and transcriptomes of cynomolgus macaque NHP models originating from different geographical locations are sequenced as a reference for future biomedical research design and implementation.

This study describes the RNA sequencing (RNA-seq) of two tissues—kidney and liver—harvested from wild Peninsular Malaysian cynomolgus macaques, and is an extension of a previous study [4] whereby the lymph node, spleen, and thymus transcriptomes of wild Peninsular Malaysian M. fascicularis were also sequenced with RNA-seq technology. An additional 75,350,240 sequence reads were obtained from this study, supplementing the previous wild Peninsular Malaysian M. fascicularis dataset for downstream applications. Furthermore, additional Malaysian cynomolgus macaque RNA-seq datasets will further furnish the cynomolgus macaque genome and transcriptome annotations and also provide a valuable asset for biomedical studies involving cynomolgus macaques.

Main text

Materials and methods


Two male conflict macaques that appeared to belong in the same family group were captured by the Department of Wildlife and National Parks (DWNP) from the state of Selangor. Visual examination showed that the macaques were free of disease. Euthanisation and harvesting of organs were also performed by authorised and qualified veterinarians of DWNP. In brief, the macaques were sedated intramuscularly using general anaesthesia (combination of ketamine, 5–10 mg/kg and xylazine, 0.2–0.4 mg/kg) before a lethal dosage of Dolethal® were given intravenously. Harvested kidney and liver tissues were stored in RNAlater RNA Stabilization Agent (Qiagen). Total RNA extraction was carried out with RNeasy Mini Kit (Qiagen). A modified protocol was employed to remove genomic DNA contamination by incorporating Epicentre’s DNase I solution. RNA extract integrity was determined using 2100 Bioanalyzer (Agilent Technologies, Inc., United States of America) via RNA Pico chip. Samples with RIN > 7.0 were selected for library construction.

High-throughput sequencing, data analysis, and validation

Preparation of M. fascicularis kidney and liver RNA-seq libraries followed methods previously described in Ee Uli et al. [4]. Raw sequencing reads were submitted to NCBI Short Read Archive under accessions SRX2499144-SRX2499147. For data analyses, we used the same bioinformatic workflow and tools as depicted in our earlier publication [4]. Briefly, the assembly of raw sequence reads, normalisation, empirical analysis of differentially expressed genes (EDGE test), and gene ontology (GO) and KEGG pathway annotations were performed in CLC Genomics Workbench (CLCGW) version 7.5.1 (Qiagen Denmark), while functional classification of expressed genes in the kidney and liver transcriptomes was performed using Panther Database version 11.1 released on 2016-10-24 ( and DAVID Bioinformatics Resources version 6.8 ( To better present the tissue-specific genes, we combined gene expression data from this study and our previous study [4] to create a Venn diagram representation of tissue-specific genes using the web tool Venn Diagrams ( Validation of the RNA-seq dataset was performed with NanoString nCounter Elements XT (NanoString Technologies Inc., Seattle, WA, USA) gene expression assay. The methodology was the same as previously described in Ee Uli et al. [4]. Additional file 1 lists the genes utilised for the RNA-seq validation and their respective probe pair sequences.


Sequence reads trimming and RNA-seq mapping

The number of raw kidney and liver sequence reads obtained were 44,452,083 and 30,898,157 respectively, each read possessing a sequence length of 70 bp. After trimming, the number of kidney and liver sequence reads were 41,854,334 and 28,943,585 singly. Summing up to a total of 70,788,919 good quality reads retained for assembly, with the average lengths of the reads ranging from 58.8 to 60.1 bp. Post-trimming, 93.94% of the reads were retained and were considered suitable for RNA-seq mapping. The overall mapping percentage of the sequence reads for each RNA-seq library ranged from 65.00 to 68.59%. Additional file 2 summarises the statistics of the sequence reads trimming and mapping.

Normalisation and differential gene expression analysis

A total of 5473 significant differentially expressed genes were called. Additional file 3 lists the differentially expressed genes identified from the EDGE test with their respective annotated gene names and descriptions. Within the Kidney vs. Liver tissue comparison, the top three most upregulated genes in liver were PLA2G2A, SAA1, and ORM1, while the three most downregulated genes include UMOD, TINAG, and DHDH.

Tissue-specific genes

Using gene expression data obtained from our previous study [4] and this study, a Venn diagram representation of tissue-specific genes in kidney, liver, lymph node, spleen, and thymus tissues was generated (Fig. 1). Lymph node showed the highest number of tissue-specific genes (2156 genes) among the five tissues compared; while kidney and liver had 310 and 154 tissue-specific genes respectively. The three most highly expressed tissue-specific genes in the kidney include MIOX, AQP2, and CDH16, while the top three tissue-specific genes expressed in the liver include APCS, APOA2, and LOC102121341. A complete list of tissue-specific genes can be found in Additional file 4.

Fig. 1
figure 1

Venn diagram of the number of expressed genes (normalised expression value > 1) in their respective tissues

Functional annotation and classification

In the kidney, 14,001, 7046, and 5635 expressed genes were categorised to biological process, molecular function, and cellular component domains respectively, while 12,004, 6059, and 4935 expressed genes in the liver were categorised to BP, MF, and CC domains respectively. The distribution of the expressed genes into different level 2 GO categories are shown in Table 1. The BP category with the highest number of expressed genes for both tissues is cellular process (kidney, 4039; liver, 3496). For the MF domain, the catalytic activity category contains the highest number of expressed genes for both tissues (kidney, 2709; liver, 2419). As for CC, both tissues show the highest number of expressed genes in the cell part category (kidney, 2185; liver, 1954).

Table 1 Categorisation of expressed genes to gene ontology categories in kidney and liver tissues

In the kidney and liver, a total of 4406 and 3960 genes were assigned to KEGG pathway categories respectively. The metabolic pathways category contains the highest number of expressed genes for both kidney and liver tissues—903 and 839 genes respectively. The distribution of the expressed genes into different KEGG pathway categories are shown in Table 2.

Table 2 Categorisation of expressed genes to KEGG pathway categories in kidney and liver tissues

Validation of RNA-seq differential gene expression

Thirteen genes were selected to validate the RNA-seq differential gene expression dataset via NanoString gene expression assay. Normalised fold change values between the RNA-seq and NanoString platforms show concordance in magnitude directionality (Additional file 5). Our previous study also employed the NanoString gene expression assay to validate the RNA-seq dataset, and has shown similar concordance between the two platforms [4].


In our previous endeavour [4], the lymph node, spleen, and thymus transcriptomes of the Malaysian cynomolgus macaque were sequenced via Illumina Hi-seq 2500 technology. The dataset presented in this study provides additional transcriptomic sequences from the kidney and liver of the Malaysian cynomolgus macaque, with a sum of 75,350,240 sequence reads obtained from four tissue libraries. In Ee Uli et al. [4], the transcriptome sequences were derived from immune-related tissues, whereas the kidney and liver transcriptomes generated in this study were obtained from metabolism-related tissues, and to a certain extent, these tissues also pertain to immune functions as well. Cynomolgus macaques are utilised as NHP models for liver-expressed drug metabolising enzyme studies, for example, the pharmacokinetics of the cytochrome P450 family of enzymes have been extensively studied in cynomolgus macaques [2, 5, 6].

It was observed that the percentages of the processed sequence reads that mapped back to the reference genome for all four libraries were below 70%. To rule out sequence contamination, we performed a BLAST alignment of the unmapped reads to NCBI’s nucleotide (nt) database and classified the hits into their specific taxonomic classification. The majority of the unmapped reads were assigned to Cercopithecidae and other primate family taxa, while the remaining reads were unassigned or did not have any BLASTN hits (Additional file 6). This suggests the absence of contaminating sequences, and also suggests the presence of novel transcripts in the kidney and liver dataset of the Malaysian cynomolgus macaque, which are yet to be described and curated in the M. fascicularis reference genome.

Comparisons were made between the kidney and liver tissues due to the similarities in metabolic function between the two tissues. However, the specific metabolic functions of both kidney and liver tissues are ultimately different. For instance, the KEGG pathway enrichment analysis highlights 58 liver genes that were specifically enriched in the biosynthesis of amino acids pathway, while 55 kidney genes were involved in the Inositol phosphate metabolism pathway.

Based on the differential gene expression analysis of kidney and liver tissues, the top three most upregulated genes in liver were PLA2G2A, SAA1, and ORM1, while the three most downregulated genes in kidney include UMOD, TINAG, and DHDH. The PLA2G2A gene codes for phospholipase A2 group IIA, and is suggested to be involved in the metabolism of phospholipids, inflammatory responses, and antimicrobial defence [7]. SAA1 codes for the acute-phase serum amyloid A1 protein, and is involved in inflammation response [8]. ORM1 codes for orosomucoid 1, which is suggested to induce the conversion of monocytes to M2b macrophages in response to type 2 cytokines in cancer patients [9]. UMOD codes for uromodulin, a urinary protein which inhibits the crystallisation of renal fluids that causes renal stone formation, and also expedites transepithelial migration of neutrophils in inflammation responses [10]. Tubulointerstitial nephritis antigen is a glycoprotein coded by TINAG, which is a nephritogenic antigen implicated in tubular homeostasis and cell survival in the kidney [11]. DHDH codes for the enzyme dihydrodiol dehydrogenase, and is suggested to be involved in detoxification processes in the kidney and liver [12].

In the GO analysis, both kidney and liver transcriptomes were observed to have the majority of genes represented by the same categories in their respective GO domains. In the BP domain, the highly represented category was cellular process, while the category with the highest gene representation in the MF domain was catalytic activity. As for CC domain, the highest number of expressed genes was represented in the cell part category. The representation of functional GO categories in the M. fascicularis transcriptome were observed to be similar to that of other chordates, including that of the silver carp [13], pig [14], black grouse [15], and pufferfish [16], and suggest a fair representation of the M. fascicularis transcriptome with that of other chordates.

The Malaysian cynomolgus macaque, M. f. fascicularis, is a unique subspecies that is suggested to have high nucleotide diversity and is genetically unique compared to other South East Asian M. fascicularis populations [17,18,19]. Genetic heterogeneity is suggested to contribute to varied responses in cynomolgus macaques towards drugs and pathogens [20], making research into the Malaysian population of cynomolgus macaque invaluable to the biomedicine field. The transcriptome dataset of the Peninsular Malaysian cynomolgus macaque obtained from this study can be compared with cynomolgus macaques originating from other geographical locations including Mauritius, the Philippines, Indochina, Indonesia, and China. Such comparisons can be utilised for the detection of single nucleotide polymorphisms in the macaque transcriptome, to infer phylogeny utilising a larger set of genic sequences, and also to determine the nucleotide diversity of the Peninsular Malaysian cynomolgus macaque. Furthermore, the transcriptomic data obtained from this study may be utilised to design genic molecular markers for rapid assessment of wild populations of cynomolgus macaques in Peninsular Malaysia in a relatively shorter period of time and at a lower cost.


Via the Illumina Hi-seq 2500 platform, 75,350,240 sequence reads in total were obtained from kidney and liver transcriptomes of the wild Peninsular Malaysian cynomolgus macaque. Additional sequence reads from the Malaysian cynomolgus macaque individuals will potentially further supplement the present genomic and transcriptomic data of this significant NHP primate model.


  • Only two biological replicates were available for library construction and data analyses.

  • RNA sequencing of additional tissues will provide more robust transcriptomic dataset of wild Peninsular Malaysia M. fascicularis.



nonhuman primate


Department of Wildlife and National Parks


RNA sequencing


CLC Genomics Workbench


transcripts per million


empirical analysis of differentially expressed genes


gene ontology


biological process


molecular function


cellular component


  1. Shen H, Yang Z, Mintier G, Han YH, Chen C, Balimane P, et al. Cynomolgus monkey as a potential model to assess drug interactions involving hepatic organic anion transporting polypeptides: in vitro, in vivo, and in vitro-to-in vivo extrapolation. J Pharmacol Exp Ther. 2013.

    Article  PubMed  PubMed Central  Google Scholar 

  2. Uno Y, Matsushita A, Shukuya M, Matsumoto Y, Murayama N, Yamazaki H. CYP2C19 polymorphisms account for inter-individual variability of drug metabolism in cynomolgus macaques. Biochem Pharmacol. 2014.

    Article  PubMed  Google Scholar 

  3. Berry NJ, Marzetta F, Towers GJ, Rose NJ. Diversity of TRIM5α and TRIMCyp sequences in cynomolgus macaques from different geographical origins. Immunogenetics. 2012.

    Article  PubMed  PubMed Central  Google Scholar 

  4. Ee Uli J, Yong CSY, Yeap SK, Rovie-Ryan JJ, Isa NM, Tan SG, et al. RNA sequencing (RNA-Seq) of lymph node, spleen, and thymus transcriptome from wild Peninsular Malaysian cynomolgus macaque (Macaca fascicularis). PeerJ. 2017.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Iwasaki K, Uno Y. Cynomolgus monkey CYPs: a comparison with human CYPs. Xenobiotica. 2009;39:578–81.

    Article  CAS  Google Scholar 

  6. Emoto C, Yoda N, Uno Y, Iwasaki K, Umehara K, Kashiyama E, et al. Comparison of p450 enzymes between cynomolgus monkeys and humans: p450 identities, protein contents, kinetic parameters, and potential for inhibitory profiles. Curr Drug Metab. 2013;14:239–52.

    CAS  PubMed  Google Scholar 

  7. Ganesan K, Ivanova T, Wu Y, Rajasegaran V, Wu J, Lee MH, et al. Inhibition of gastric cancer invasion and metastasis by PLA2G2A, a novel β-catenin/TCF target gene. Cancer Res. 2008;68:4277–86.

    Article  CAS  Google Scholar 

  8. Sun L, Ye RD. Serum amyloid A1: structure, function and gene polymorphism. Gene. 2016;583:48–57.

    Article  CAS  Google Scholar 

  9. Nakamura K, Ito I, Kobayashi M, Herndon DN, Suzuki F. Orosomucoid 1 drives opportunistic infections through the polarization of monocytes to the M2b phenotype. Cytokine. 2015;73:8–15.

    Article  CAS  Google Scholar 

  10. Schmid M, Prajczer S, Gruber LN, Bertocchi C, Gandini R, Pfaller W, et al. Uromodulin facilitates neutrophil migration across renal epithelial monolayers. Cell Physiol Biochem. 2010;26:311–8.

    Article  CAS  Google Scholar 

  11. Xie P, Kondeti VK, Lin S, Haruna Y, Raparia K, Kanwar YS. Role of extracellular matrix renal tubulo-interstitial nephritis antigen (TINag) in cell survival utilizing integrin αvβ3/focal adhesion kinase (FAK)/phosphatidylinositol 3-kinase (PI3 K)/protein kinase Bserine/threonine kinase (AKT) signaling pathway. J Biol Chem. 2011;286:34131–46.

    Article  CAS  Google Scholar 

  12. Carbone V, Hara A, El-Kabbani O. Structural and functional features of dimeric dihydrodiol dehydrogenase. Cell Mol Life Sci. 2008;65:1464–74.

    Article  CAS  Google Scholar 

  13. Fu B, He S. Transcriptome analysis of silver carp (Hypophthalmichthys molitrix) by paired-end RNA sequencing. DNA Res. 2012;19:131–42.

    Article  CAS  Google Scholar 

  14. Nie Q, Fang M, Jia X, Zhang W, Zhou X, He X, et al. Analysis of muscle and ovary transcriptome of Sus scrofa: assembly, annotation and marker discovery. DNA Res. 2011;18:343–51.

    Article  CAS  Google Scholar 

  15. Wang B, Ekblom R, Castoe TA, Jones EP, Kozma R, Bongcam-Rudloff E, et al. Transcriptome sequencing of black grouse (Tetrao tetrix) for immune gene discovery and microsatellite development. Open Biol. 2012.

    Article  PubMed  PubMed Central  Google Scholar 

  16. Cui J, Liu S, Zhang B, Wang H, Sun H, Song S, et al. Transcriptome analysis of the gill and swimbladder of Takifugu rubripes by RNA-Seq. PLoS ONE. 2014.

    Article  PubMed  PubMed Central  Google Scholar 

  17. Abdul-Latiff MAB, Ruslin F, Faiq H, Hairul MS, Rovie-Ryan JJ, Abdul-Patah P, et al. Continental monophyly and molecular divergence of Peninsular Malaysia’s Macaca fascicularis fascicularis. Biomed Res Int. 2014;2014:1–18.

    Article  Google Scholar 

  18. Osada N, Hettiarachchi N, Babarinde IA, Saitou N, Blancher A. Whole-genome sequencing of six Mauritian cynomolgus macaques (Macaca fascicularis) reveals a genome-wide pattern of polymorphisms under extreme population bottleneck. Genome Biol Evol. 2015;7:821–30.

    Article  CAS  Google Scholar 

  19. Liedigk R, Kolleck J, Böker KO, Meijaard E, Md-Zain BM, Abdul-Latiff MAB, et al. Mitogenomic phylogeny of the common long-tailed macaque (Macaca fascicularis fascicularis). BMC Genomics. 2015.

    Article  PubMed  PubMed Central  Google Scholar 

  20. Menninger K, Wieczorek G, Riesen S, Kunkler A, Audet M, Blancher A, et al. The origin of cynomolgus monkey affects the outcome of kidney allografts under Neoral immunosuppression. Transplant Proc. 2002;34:2887–8.

    Article  CAS  Google Scholar 

Download references

Authors’ contributions

JEU, CS-YY, SKY, NBA, and SGT conceptualised and designed the study. JJR-R provided cynomolgus macaque samples. JEU performed the lab work. JEU, CS-YY, SKY, and NMI analysed the dataset. Manuscript was prepared by JEU. All authors read and approved the final manuscript.


The authors would like to thank the Department of Wildlife and National Parks, Malaysia, for their permission to conduct this study. Additional thanks goes out to Dr. Ng Wei Lun (School of Life Sciences, Sun Yat-sen University), Nancy Liew Woan Charn (Laboratory of Vaccines and Immunotherapeutics, Institute of Bioscience, Universiti Putra Malaysia), and Mok Shao Feng for their invaluable guidance and assistance throughout the preparation of this manuscript.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

Raw sequencing reads were submitted to NCBI Short Read Archive database under accessions SRX2499144-SRX2499147.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Permission to obtain kidney and liver tissue samples of macaque individuals for research purposes by the Genetics Lab at the Department of Biology was granted by the Department of Wildlife and National Parks [Reference Number: JPHL&TN(IP): 80-4/2]. Sampling and euthanasia of conflict M. fascicularis were performed by the Department of Wildlife and National Parks according to the guidelines set by the Institutional Animal Care and Use Committee (IACUC), University of California, Davis, United States of America as adopted by the PREDICT Project in Malaysia.


This research was supported by Research University Grant Scheme (Project No. GPIPB/2013/9413602) provided by Universiti Putra Malaysia, Malaysia. The funder had no role in study design, conducting the study, decision to publish, and preparation of the manuscript.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Christina Seok-Yien Yong.

Additional files

Additional file 1.

Selected genes for NanoString nCounter XT Gene Expression assay validation of RNA-Seq gene expression profiles. Table lists selected genes utilised for the validation of RNA-Seq gene expression profiles via NanoString nCounter XT Gene Expression assay and their respective probes nucleotide sequences.

Additional file 2.

Macaca fascicularis kidney and liver sequence reads trimming and reference guided assembly summary statistics. Summary of Macaca fascicularis kidney and liver sequence reads quality trimming and reference guided assembly.

Additional file 3.

List of differentially expressed genes from kidney vs. liver comparison. Differentially expressed genes in kidney and liver tissues called from the empirical differential gene expression analysis in CLC Genomics Workbench. Listed are the genes with their corresponding fold change values, together with their gene ontology and pathway annotations.

Additional file 4.

Tissue-specific genes (sorted from highest to lowest Normalised mean expression value). A list of tissue-specific genes generated from a Venn diagram comparison of expressed genes (normalised expression value > 1) in kidney, liver, lymph node, spleen, and thymus tissues.

Additional file 5.

Validation of RNA-seq differential gene expression results. Log2 transformed fold change values were obtained from RNA-seq and NanoString nCounter XT platforms. Black bar represents fold change values obtained from RNA-Seq platform, while white bar represents fold change value obtained from NanoString nCounter XT platform.

Additional file 6.

BLAST alignment of unmapped Macaca fascicularis reads to NCBI nucleotide database. Graphical representation of unmapped M. fascicularis sequence reads assigned to taxonomic ranks based on BLAST alignment with NCBI nucleotide (nt) database. Intensity of the colour green represents the number of unmapped reads mapped to a respective taxonomic rank—the more intense the green, the higher the number of reads assigned to a particular taxon.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ee Uli, J., Yong, C.SY., Yeap, S.K. et al. RNA sequencing of kidney and liver transcriptome obtained from wild cynomolgus macaque (Macaca fascicularis) originating from Peninsular Malaysia. BMC Res Notes 11, 923 (2018).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: