Comparative analysis of miRNAs and their targets across four plant species
© Walther et al; licensee BioMed Central Ltd. 2010
Received: 5 July 2011
Accepted: 8 November 2011
Published: 8 November 2011
MicroRNA (miRNA)-mediated regulation of gene expression has been recognized as a major posttranscriptional regulatory mechanism in plants as well. We performed a comparative analysis of miRNAs and their respective gene targets across four plant species: Arabidopsis thaliana (Ath), Medicago truncatula (Mtr), Brassica napus (Bna), and Chlamydomonas reinhardtii (Cre).
miRNAs were obtained from miRBase, with 218 miRNAs annotated for Ath, 375 for Mtr, 46 for Bna, and 73 for Cre. miRNA targets were obtained from available database annotations, from bioinformatic predictions using RNAhybrid, and from an analysis of mRNA degradation products (degradome sequencing) aimed at identifying miRNA cleavage products. On average, and considering experimental and bioinformatic predictions together, every miRNA was associated with about 46 unique gene transcripts, with considerable variation across species. We observed a positive and linear correlation between the number of miRNAs and the total number of transcripts across different plant species, suggesting that the repertoire of miRNAs correlates with the size of the transcriptome of an organism. Conserved miRNA-target pairs were found to be associated with developmental processes and transcriptional regulation, while species-specific (in particular, Ath) pairs are involved in signal transduction and response to stress processes. Conserved miRNAs have more targets and higher expression values than non-conserved miRNAs. We found evidence for conservation not only of the sequence of miRNAs, but of their expression levels as well.
Our results support the notion of a high birth and death rate of miRNAs and indicate that miRNAs serve many species-specific functions, while conserved miRNAs are related mainly to developmental processes and transcriptional regulation, with conservation operating at both the sequence and the expression level.
The discovery of microRNAs (miRNAs) in different kingdoms and in many species prompted comparative analyses to identify those miRNAs that are more strongly conserved than others and to understand whether their main functional role is associated with species-specific processes, universally occurring processes, or both. Animal miRNAs have been reported to be involved in developmental timing, cell death, cell proliferation, haematopoiesis, and patterning of the nervous system, i.e. primarily developmental processes. MicroRNAs involved in these processes were also found to be conserved across species. Genes involved in basic cellular maintenance functions are less often miRNA targets. Many miRNA families have also been conserved across different plant lineages including mosses, gymnosperms, monocots, and dicots [4–6]. However, with modern sequencing technologies that allow miRNAs to be identified at increased breadth, it was also noted that the number of species-specific miRNAs is greater than the number of conserved miRNAs [7, 8]. Thus, a high birth and death rate of miRNAs has been postulated. The highly dynamic nature of miRNA evolution was also confirmed recently in a comparative analysis of the closely related Arabidopsis species A. thaliana and A. lyrata. A substantial number of miRNAs was found to be species-specific, despite the only recent separation of the two species. No case of miRNA conservation between plants and animals has yet been found. As miRNA processing and targeting also differ substantially, it has been concluded that the miRNA mechanism evolved separately in animals and plants from a common ancestral siRNA machinery.
In this study, we carried out a comparative analysis of miRNAs and their targets across four plant species: Arabidopsis thaliana (Ath) and Brassica napus (Bna), both members of the Brassicaceae family; Medicago truncatula (Mtr), a legume; and Chlamydomonas reinhardtii (Cre), a single-celled alga. The choice of plant species was motivated by several research projects conducted at the Max Planck Institute for Molecular Plant Physiology. The unifying goal of these studies was to identify functional miRNAs, to profile known miRNAs with regard to their abundance, and potentially to discover novel miRNAs by applying the Solexa/Illumina Next Generation Sequencing (NGS) technology to RNA extractions from the different plant systems exposed to different conditions. More specifics and results regarding these studies can be found in  and , and in the Methods section. Recently, so-called degradome sequencing was established as a powerful experimental approach to detect miRNA targets, and corresponding bioinformatic data processing pipelines were introduced. In this approach, the cleavage products generated upon miRNA-induced cleavage of target mRNAs are specifically identified. It therefore identifies those miRNA-target pairs for which cleavage is the mode of action, but does not detect targets that are under translational repression. Degradome data have also been used in the current study.
For the four investigated plant species, we obtained mature miRNA sequences and stem-loop sequences associated with miRNA precursors from miRBase release 15 (http://www.mirbase.org), yielding 218 miRNAs for Ath, 375 for Mtr, 46 for Bna, and 73 for Cre. No miRNA-star sequences were considered for analysis. For Ath, cDNA sequence information was obtained from The Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org), genome release 9. For Bna, assembled contigs were retrieved from PlantGDB (http://www.plantgdb.org/). For Mtr, sequence and annotation information was obtained from The Medicago Genome Sequence Consortium (MGSC, http://www.medicago.org); this release is referred to here as "Mt3.0". Sequences for Cre were downloaded from the DOE Joint Genome Institute (http://genome.jgi-psf.org) using genome assembly v4.0 and Augustus v5.0 gene models.
We used small RNA sequencing data obtained and published for Ath under eight [11, 17] and for Mtr under two different experimental conditions. The conditions for Mtr were: a) treatment with the symbiotic fungus mycorrhiza ("Myc") and b) treatment without the fungus ("N-Myc"). The eight conditions for Ath were: full nutrition ("FN"), phosphate starvation ("P"), phosphate starvation after three hours phosphate re-addition ("P+3 h"), nitrogen starvation ("N"), nitrogen starvation after three hours nitrogen re-addition ("N+3 h") (all from ), and FN from root cells ("root+p"), phosphate starvation from root cells ("root-p"), and phosphate starvation from shoot cells ("shoot-p") (from ). In total, 15.8 million small RNA reads were sequenced for Ath and 13.6 million reads for Mtr (two conditions: "Myc" and "N-Myc"). To identify miRNA targets experimentally by detecting miRNA-induced cleavage products, we used degradome data from four conditions in Ath ("FN", "P-12 h", "P-48 h" and "N-48 h") and two conditions in Mtr ("Myc", "N-Myc"). For experimental details see [11, 12].
Expression values were normalized per condition to adjust for variable sequencing depth between samples. Sequencing reads mapping to annotated miRNAs were normalized to reads per million (RPM) per experimental condition: RPM = (number of reads per gene / total number of reads) * 1E6.
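The RPM normalization can be illustrated with a minimal Python sketch. The function name, the miRNA labels, and all counts below are hypothetical and chosen only for illustration; they are not data from this study.

```python
def reads_per_million(read_counts, total_reads):
    """Convert raw read counts for one condition to reads per million (RPM).

    read_counts: dict mapping a miRNA name to its raw read count.
    total_reads: total number of reads sequenced in that condition.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # RPM = reads mapped to the miRNA / total reads * 1E6
    return {name: count * 1e6 / total_reads for name, count in read_counts.items()}


# Hypothetical counts for a single condition (illustration only).
example_counts = {"miRNA_A": 500, "miRNA_B": 1500}
rpm = reads_per_million(example_counts, total_reads=2_000_000)
print(rpm)
```

Because each condition is divided by its own read total, values become comparable across samples sequenced at different depths.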
To analyze the conservation of miRNA families across species, we performed pairwise global sequence alignments of all single mature miRNA sequences with the program Align0. Sequence pairs were considered conserved if the sequence identity was greater than 75%, if the seed sequences (6 nt, positions 2-7) matched perfectly, and if the two respective identifiers of the pair were classified by miRBase as belonging to the same MIRNA family.
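The three conservation criteria can be expressed as a single predicate. The following Python sketch assumes that the percent identity has already been computed by an external global alignment (Align0 in this study) and is passed in as a number; the function names, sequences, and family labels are hypothetical.

```python
def seed(mature_seq, start=2, end=7):
    """Return the seed region (1-based positions start..end, inclusive)."""
    return mature_seq[start - 1:end].upper().replace("T", "U")


def is_conserved_pair(seq_a, seq_b, family_a, family_b, identity_pct):
    """Apply the three criteria: identity > 75%, identical seed, same family.

    identity_pct is assumed to come from an external global alignment.
    """
    return (
        identity_pct > 75.0
        and seed(seq_a) == seed(seq_b)
        and family_a == family_b
    )


# Toy sequences and family labels (illustration only).
seq_1 = "UGACAGAAGAGAGUGAGCAC"
seq_2 = "UGACAGAAGAUAGAGAGCAC"  # same seed as seq_1
seq_3 = "UGGCAGAAGAGAGUGAGCAC"  # seed differs at position 3

print(is_conserved_pair(seq_1, seq_2, "FAM_X", "FAM_X", identity_pct=90.0))
print(is_conserved_pair(seq_1, seq_3, "FAM_X", "FAM_X", identity_pct=95.0))
```

In this sketch a pair with high overall identity is still rejected when its seed differs, which reflects the requirement of a perfect seed match.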
Verified miRNA-target relationships were extracted from several sources: Supplementary Data of  (500 targets), 530 targets in total from the Arabidopsis Small RNA Project, "ASRP" (http://asrp.cgrb.oregonstate.edu, ), experimental data reported in Supplementary Tables 2 and 3 of , referred to here as "degradomeG" (60 targets), and degradome sequencing data for Ath and Mtr from in-house experiments, called "degradome" (1,154 targets) [11, 12]. To identify miRNA-target relationships from degradome data, the CleaveLand algorithm was used [14, 21]. miRNA targets were further predicted using the program RNAhybrid. The mature miRNA sequence data from miRBase and, on the potential target side, the downloaded cDNAs or assembled ESTs mentioned above were used as input. We used the parameter settings described in . We required the minimum free energy (mfe) of hybridization to be greater than 70% of that of perfect-match hybridization, i.e. in concordance with the initial threshold used in . Note that for the final set, the authors in  used a stricter 75% mfe cutoff. In Arabidopsis and using a 75% mfe threshold level, we obtained 2,967 unique target transcripts for 218 miRNAs. All RNAhybrid predictions with additional score and mfe information for all four plant species are provided in tabular format as supplementary material (Additional File 1).
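The relative mfe criterion can be sketched as follows. This Python example assumes that the two hybridization energies (miRNA with candidate target, and miRNA with its perfect complement) have already been computed by the prediction program and are supplied as negative values in kcal/mol; the function name and the energies are hypothetical.

```python
def passes_mfe_ratio(mfe_target, mfe_perfect, min_ratio=0.70):
    """Return True if the target duplex reaches more than min_ratio of the
    hybridization energy of the perfect-match duplex.

    Both energies are expected to be negative (kcal/mol).
    """
    if mfe_perfect >= 0:
        raise ValueError("perfect-match mfe is expected to be negative")
    return mfe_target / mfe_perfect > min_ratio


# Hypothetical energies (illustration only).
mfe_perfect = -40.0
print(passes_mfe_ratio(-32.0, mfe_perfect))                  # ratio 0.800
print(passes_mfe_ratio(-29.0, mfe_perfect))                  # ratio 0.725
print(passes_mfe_ratio(-29.0, mfe_perfect, min_ratio=0.75))  # stricter cutoff
```

The third call shows how a candidate accepted at the 70% level can be rejected at the stricter 75% level.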
Gene Ontology (GO) annotation files were downloaded for Ath from TAIR, for Cre from the DOE Joint Genome Institute (http://genome.jgi-psf.org/Chlre4/Chlre4.download.ftp.html), and for Mtr from http://www.medicago.org/genome/downloads/Mt3/. Annotations for Bna were assigned by copying the GO slim term from TAIR for the best hit from a BLAST run against Ath. Over-representation of GO terms was calculated using Fisher's exact test for count data, and the p-values for Molecular Functions and Biological Processes were adjusted for multiple testing using the Benjamini-Hochberg method.
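The enrichment calculation can be sketched with the Python standard library alone. This is a minimal illustration, not the implementation used in this study: it assumes a one-sided (over-representation) Fisher's exact test, which the text does not specify, and all counts and p-values are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the target set, k: target-set genes annotated with the term.
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 5 of 5 target genes carry a term found in 5 of 10 genes.
p = fisher_enrichment_p(k=5, n=5, K=5, N=10)
print(p)
# Hypothetical raw p-values for four GO terms.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In practice one p-value would be computed per GO term and the whole list passed to the adjustment step.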
[Table; values not recovered. Recoverable column headers: Genome size (Mbp); average targets per (header truncated).]
As reported for plant miRNA target action earlier , most miRNA target sites were found to fall within the coding regions (86% in Ath), whereas the 5' and 3'UTR regions are targeted by approximately 7% (in Ath).
Conserved miRNA families were found to target on average more gene transcripts than their non-conserved counterparts: the average number of targets, summed across the three species Ath, Bna, and Mtr, amounted to 161.4 for conserved families versus 106.2 for non-conserved families, p = 0.073 (Mann-Whitney test). Based on the available quantitative data of miRNA expression via normalized read counts (see Methods), conserved miRNAs were found to be expressed at higher levels than non-conserved miRNAs. In Ath, the average log-2 expression value for conserved miRNAs was 9.25, significantly higher than the corresponding value for non-conserved miRNAs (3.92, p = 2.2e-5). A similar trend was observed in Mtr, with an average log-2 expression level of 7.39 for conserved vs. 4.94 for non-conserved miRNAs, although significance could not be established (p = 0.21).
Biological process involvement of conserved/non-conserved miRNAs.
[Table; values not recovered. Recoverable column headers: GO Process Term; Species specific targets. Recoverable GO process terms: Unknown biological processes; Other metabolic processes; Response to stress; Other cellular processes; Electron transport or energy pathway.]
We performed a comparative analysis of miRNAs in four different plant species (Ath, Bna, Mtr, and Cre). Our results confirm previous findings that miRNA evolution appears to be rapid, suggesting a significant participation of miRNAs in species-specific processes [8, 9]. The observation that species-specific miRNAs and their targets appear to be involved in processes involving interactions with the environment, such as signal transduction and stress response (Table 2), supports the notion that miRNAs are an important level of regulation at the speciation level, as every species has its own environment to cope with. The observation that genes involved in "unknown biological processes" were also found overrepresented in the set of target genes of non-conserved miRNAs may suggest either that there are still many species-specific genes not yet properly characterized, or that those miRNA-target associations are spurious in the sense that the annotation of the genes and/or the identification of the miRNA may have been incorrect.
Small RNA sequencing data were analyzed to assess conservation not only at the sequence level, but also at the expression level, with the conclusion that miRNA expression is conserved as well. Therefore, it may be worthwhile to compare the respective cis-regulatory regions associated with miRNA genes across different species and to investigate evolutionary differences and conservation patterns.
Further improvements also seem possible on the bioinformatic target prediction side. In silico methods may well yield more predictions than there are experimentally detected miRNA-target pairs, because experimental detection depends on the miRNA actually being expressed. Ideally, however, all of the experimentally found miRNA-target pairs would also be found by in silico methods.
Gene expression regulation via miRNAs in plants appears to scale with genome size and to play a predominant role in species-specific adaptation processes. Where miRNAs are conserved, not only their sequence but also their expression is conserved, and the targeted processes are associated with general developmental programs.
We wish to thank our colleagues from the Max Planck Institute for Molecular Plant Physiology, Potsdam-Golm, for providing the experimental data used in this study, in particular Wolf-Rüdiger Scheible (Ath, Bna data), Franziska Krajinski (Mtr data) and their teams.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.