Performance evaluation of commercial miRNA expression array platforms
© Irizarry et al; licensee BioMed Central Ltd. 2010
Received: 26 August 2009
Accepted: 18 March 2010
Published: 18 March 2010
MicroRNAs (miRNAs) are short, endogenous transcripts that negatively regulate the expression of specific mRNA targets. The relative abundance of miRNAs is linked to function in vivo, and miRNA expression patterns are potentially useful signatures for the development of diagnostic, prognostic and therapeutic biomarkers.
We compared the performance characteristics of four commercial miRNA array technologies and found that all platforms performed well in separate measures of performance.
The Ambion and Agilent platforms were more accurate, whereas the Illumina and Exiqon platforms were more specific. Furthermore, the data analysis approach had a large impact on the performance, predominantly by improving precision.
MicroRNAs (miRNAs) are endogenous, non-coding transcripts that regulate a diverse range of functions, including development, differentiation, growth, apoptosis and metabolism. These 17-24 nucleotide RNA molecules confer specific recognition of target mRNAs and modulate gene expression by acting in conjunction with a set of effector proteins of the RNA interference pathway [1, 2]. Through this interaction, miRNAs negatively regulate expression of specific target mRNAs by inhibiting translation, by sequestering transcripts in P-bodies [3], or by accelerating mRNA decay as a consequence of rapid deadenylation. Moreover, miRNAs have recently been proposed to activate translation of mRNAs under certain conditions [5].
The relative abundance of miRNAs in cells is thought to be important for miRNAs to exert their regulatory function. For example, titrated expression of both genomic copies of mouse miR-1 is required for normal heart formation and function during embryogenesis [6]. Aberrant miRNA expression contributes to malignancies, tumor progression and metastasis (reviewed in [7]), and miRNA expression profiles can be correlated with disease pathogenesis and prognosis [8, 9]. Thus, the performance characteristics of technologies that measure the relative abundance of miRNAs are important for effectively deciphering their functional roles and their potential utility as diagnostic biomarkers.
Microarray technology permits simultaneous expression measurements for hundreds of miRNAs. This technology is already widely used and promises to become a standard tool in the near future. However, a careful assessment of the technology has not yet been performed. This motivated us to evaluate performance attributes of four commercial array platforms for miRNA expression profiling. The miRNA platforms evaluated were Ambion (miRChip; a custom Affymetrix array provided as the DiscovArray™ service through Asuragen), Agilent (Human miRNA Microarray, v 1.0, GEO accession GPL9081), Exiqon (miRCURY™ LNA Array, v 9.2, GEO accession GPL7724), and Illumina (MicroRNA Expression Profiling Panels, v 1, GEO accession GPL8178). In all cases the sample processing was performed by experienced operators working under standard operating procedures. Samples for three of the four platforms were processed by companies that provide research services on the platform. The study was administered by BIOO Scientific Corporation (Austin, TX) to ensure that the sample identities and the purpose of the experiment were blinded. With the exception of the Illumina platform, the laboratory personnel did not know the experiment was part of a performance evaluation. Illumina's Sentrix® Universal-16 BeadChip arrays were used for this study instead of the Sentrix® Array Matrix, which is the manufacturer's supported platform for miRNA analysis of the version 1 bead pool.
Seven synthetic miRNAs [Additional file 1] were spiked into a background of 100 ng human placenta total RNA at known input masses ranging from 1 amol to 316 amol in serial 3.16-fold increments. Seven pools of synthetic miRNAs were formulated for spiking according to a 7 × 7 Latin Square design, such that each transcript is spiked in at each concentration (including a zero mass negative control). Endogenous levels of the seven synthetic miRNAs were below the detection threshold when placenta RNA was screened on the Ambion platform. The 100 ng input of total RNA was within the vendors' recommended ranges of inputs. There were substantial differences between platforms in the coverage of miRNAs represented. To eliminate potential probe-content biases in the assessment of precision, we restricted the analysis to 330 human miRNAs represented on all four platforms, representing 45% of the 733 mature human miRNAs registered in the Sanger 10.1 sequence database [10].
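The Latin Square layout described above can be sketched in a few lines of Python. This is only an illustration: the text does not state which transcript received which mass in which pool, so the cyclic arrangement and the transcript names (`spike_1` to `spike_7`) are assumptions, and the mass levels are the nominal values implied by the 1 to 316 amol range.

```python
# Nominal spike-in masses in amol: a zero-mass negative control plus six
# levels from 1 to 316 amol in roughly 3.16-fold steps (values from the text).
LEVELS_AMOL = [0.0, 1.0, 3.16, 10.0, 31.6, 100.0, 316.0]

# Hypothetical placeholder names; the real sequences are in Additional file 1.
TRANSCRIPTS = [f"spike_{k + 1}" for k in range(7)]


def cyclic_latin_square(levels):
    """Return an n x n Latin square: row = pool, column = transcript.

    The cyclic shift is one simple construction (an assumption here); it
    guarantees each level appears once per row and once per column.
    """
    n = len(levels)
    return [[levels[(pool + t) % n] for t in range(n)] for pool in range(n)]


design = cyclic_latin_square(LEVELS_AMOL)

for pool_index, row in enumerate(design, start=1):
    assignments = ", ".join(
        f"{name}={mass:g}" for name, mass in zip(TRANSCRIPTS, row)
    )
    print(f"pool {pool_index}: {assignments}")
```

In such a layout every pool contains all seven mass levels, and every transcript is observed once at each level, including the zero-mass control.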
Each company provided processed data as part of the standard service, using statistical methods developed in house. We refer to these as the default data sets. They are available for download through the NCBI Gene Expression Omnibus (GEO) repository under accession GSE19248. The Exiqon default data reported the value "NA" (missing value) for 51% of the measurements associated with the spiked-in miRNAs, and 59.1% of the 330 common miRNAs. We were, therefore, unable to analyze the Exiqon default data by the methods described, and it was not included in this report. For gene expression microarrays, various academic groups have demonstrated that alternative statistical methodology can substantially improve the accuracy and precision of expression measurements relative to ad-hoc procedures developed by the manufacturers of the technology. We therefore also used the raw probe-level data from all companies, with the exception of Agilent. The Agilent miRNA platform typically interrogates repeated measurements of two probes per miRNA that are summarized using a proprietary algorithm. Therefore, Agilent does not recommend using raw probe-level data for data analysis or normalization. We compared the default background correction with two alternatives: no background correction and exponential-normal convolution. We also compared quantile normalization [12] to the default normalization method for each platform. We refer to the processed data (in log2 scale) as expression values. We found that no background correction combined with quantile normalization clearly outperformed the other approaches, so we used these methods to compare platform performance. For Agilent we used the default data set, according to the vendor's recommendations. Figures using the default data set for all platforms are included as Additional files 2, 3, 4.
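To make the quantile normalization step concrete, the following is a minimal pure-Python sketch of the general technique, not the code used in this study. It assumes each array is given as a list of log2 intensities of equal length with no missing values; the function name and the example values are hypothetical, and ties are broken by position rather than averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of arrays (one list of values per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank, so all arrays end up with the same distribution.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of features")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(sc[rank] for sc in sorted_cols) / len(columns) for rank in range(n)
    ]

    normalized = []
    for col in columns:
        # Indices of this array's features, ordered from lowest to highest.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, feature_index in enumerate(order):
            out[feature_index] = reference[rank]
        normalized.append(out)
    return normalized


# Illustrative values only (not data from this study).
example = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.0, 2.0], [3.0, 4.0, 6.0, 8.0]]
for array in quantile_normalize(example):
    print(array)
```

Because the procedure forces identical distributions on every array, it is best suited to designs such as this one, in which the background RNA is the same in every hybridization.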
We assessed specificity and sensitivity in a way that can be easily related to practical performance. The use of the same placental total RNA as background material in each hybridization permitted us to assess specificity. Spike-in experiments have been used extensively to assess gene expression technologies as they provide a sensible way of measuring sensitivity [13, 14]. However, misleading conclusions can be drawn from experiments with unusually high expression measurements for the spike-in concentrations that presumably do not represent the nominal concentrations of the background RNA. For this reason, we carefully calibrated our spike-in material to assure that the distribution of observed expression for the spike-in transcripts reflects the distributions seen in typical experiments. Additional file 2 shows the typical distribution of expression values for the background RNA for the four studied data sets. The tick marks on the x-axis represent the average expression at each reported spike-in level. This figure illustrates that the spike-in transcripts resulted in expression measurements similar to the background RNA transcripts.
Note that in Figure 2, many dark blue dots were observed on each platform. This was expected given the documented problem of cross-hybridization. Because a platform with a larger SD and small outliers might be preferable to one with a smaller SD but large outliers, we included the 99th percentile of the null distribution as a second summary assessment of specificity. Note that for this analysis, 3.3 is the expected number of miRNAs in the top 1% of the 330 human mature miRNAs common to all platforms. However, the number of array features will certainly increase in the near future, and the number of false positives (in the top 1%) will increase proportionally.
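The two specificity summaries and the 3.3 figure can be illustrated with a short sketch using only the Python standard library. It assumes the null distribution is available as a list of log-ratios for miRNAs with no true change; the function names and the evenly spaced example values are hypothetical and are not data from this study.

```python
import statistics


def null_specificity_summary(null_log_ratios):
    """Return (SD, 99th percentile) of log-ratios expected to be zero."""
    sd = statistics.stdev(null_log_ratios)
    # quantiles(..., n=100) returns the 99 cut points between percentiles;
    # index 98 is the 99th percentile.
    p99 = statistics.quantiles(null_log_ratios, n=100)[98]
    return sd, p99


def expected_top_count(n_features, fraction=0.01):
    """Expected number of features in the top `fraction` of a ranked list."""
    return n_features * fraction


# Illustrative null log-ratios only: evenly spaced values around zero.
illustrative_null = [i / 100 for i in range(-100, 101)]
sd, p99 = null_specificity_summary(illustrative_null)
print(f"SD = {sd:.3f}, 99th percentile = {p99:.3f}")
print(f"expected top-1% count for 330 miRNAs: {expected_top_count(330):.1f}")
```

The SD describes the typical spread of the null log-ratios, while the 99th percentile captures how extreme the largest false signals are, which is why the two can rank platforms differently.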
Precision and accuracy assessments, considered independently, have limited practical use. However, the summary statistics described above can be easily combined to answer many practical questions when posed in a statistical context. As an example, we computed the chance that, when comparing two samples, a gene with a true log2 fold change of Δ = 1 will appear in a list of the top 1% (highest log-ratios). This summary statistic, as well as the accuracy and precision summaries described above, is shown in Table 1. Note that Table 1 includes results for all the data analysis approaches we considered.
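One way such a probability could be computed is sketched below. The model is an assumption made for illustration and is not taken from the text: the observed log-ratio is treated as normally distributed with mean equal to an accuracy slope times Δ and with standard deviation equal to the precision SD, and the top 1% cutoff is taken to be the 99th percentile of the null distribution. The function name, parameter names and example values are all hypothetical.

```python
from statistics import NormalDist


def prob_in_top(delta, slope, sd, null_p99):
    """P(observed log-ratio exceeds the null 99th percentile).

    Assumed model: observed ~ Normal(slope * delta, sd), where `slope`
    summarizes accuracy (1.0 = unbiased), `sd` summarizes precision, and
    `null_p99` is the cutoff for entering the top 1% of log-ratios.
    """
    observed = NormalDist(mu=slope * delta, sigma=sd)
    return 1.0 - observed.cdf(null_p99)


# Illustrative parameter values only (not results from Table 1).
for slope, sd, cutoff in [(0.9, 0.3, 0.8), (0.6, 0.15, 0.4)]:
    p = prob_in_top(delta=1.0, slope=slope, sd=sd, null_p99=cutoff)
    print(f"slope={slope}, SD={sd}, cutoff={cutoff}: P = {p:.2f}")
```

Under this kind of model, a less accurate platform (smaller slope) can still detect a true change reliably if its null distribution is tight enough, which is why accuracy and precision need to be considered together.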
We have described an assessment procedure for miRNA microarray data based on a carefully designed spike-in experiment. Strengths and weaknesses were revealed for each platform. Ambion and Agilent were more accurate, while Illumina and Exiqon were more specific. Strikingly, the data processing methods had a more profound impact on performance than the differences between platforms. Applying background correction to the raw data was detrimental to specificity, suggesting that background correction was the likely cause of the lower performance of the three default data sets. The practical implication is that false positive fold changes are most likely to be detected at lower expression signals in default data, and may be reduced by omitting background correction of the raw data.
We considered quantile normalization to be the best approach among multiple options for this study design because the distribution of the background RNA is identical across the project. For projects where the miRNA fraction of total RNA may be variable across different samples in the project, another normalization method may be more appropriate.
The experimental design did not include measurements of day-to-day or site-to-site variability to evaluate platform robustness, so we were not able to draw direct conclusions as to whether these platforms might have performed differently under different circumstances. Reproducibility testing of the Agilent, Ambion and Illumina platforms beyond the scope of this study suggested that the performance reported here is within the expected day-to-day variability (not shown).
Both Ambion and Agilent demonstrated good accuracy across the range tested but with less precision than the other two platforms. Agilent performed the best when only the default data set was evaluated for each platform. Considering that we adhered to Agilent's guidance to use the default data, further analysis is required to determine whether excluding the background adjustment or including a global normalization method can improve the performance of the Agilent array.
We thank Lance Ford of BIOO Scientific for administering the blinded study. The work of Rafael A. Irizarry is partially funded by 1R01GM083084-01 and 1R01RR021967-01A2. The work of Mathew McCall is partially funded by T32GM074906.
Trademarks: Asuragen and DiscovArray are trademarks of Asuragen, Inc.; miRCURY™ is a trademark of Exiqon, Inc.; Illumina is a trademark and Sentrix is a registered trademark of Illumina, Inc.; Exiqon, miRCURY and LNA are registered trademarks of Exiqon A/S.
- Bartel DP: MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004, 116: 281-297. 10.1016/S0092-8674(04)00045-5.
- Gregory R, Chendrimada T, Cooch N, Shiekhattar R: Human RISC couples microRNA biogenesis and posttranscriptional gene silencing. Cell. 2005, 123 (4): 631-40. 10.1016/j.cell.2005.10.022.
- Liu J, Valencia-Sanchez MA, Hannon GJ, Parker R: MicroRNA-dependent localization of targeted mRNAs to mammalian P-bodies. Nat Cell Biol. 2005, 7: 719-23. 10.1038/ncb1274.
- Wu L, Fan J, Belasco JG: MicroRNAs direct rapid deadenylation of mRNA. Proc Natl Acad Sci USA. 2006, 103: 4034-9. 10.1073/pnas.0510928103.
- Vasudevan S, Tong Y, Steitz JA: Switching from Repression to Activation: MicroRNAs Can Up-Regulate Translation. Science. 2007, 318: 1931-1934. 10.1126/science.1149460.
- Zhao Y: Dysregulation of cardiogenesis, cardiac conduction, and cell cycle in mice lacking miRNA-1-2. Cell. 2007, 129: 303-17. 10.1016/j.cell.2007.03.030.
- Esquela-Kerscher A, Slack FJ: Oncomirs - microRNAs with a role in cancer. Nat Rev Cancer. 2006, 6: 259-269. 10.1038/nrc1840.
- Schetter AJ: MicroRNA expression profiles associated with prognosis and therapeutic outcome in colon adenocarcinoma. JAMA. 2008, 299: 425-36. 10.1001/jama.299.4.425.
- Yu SL: MicroRNA signature predicts survival and relapse in lung cancer. Cancer Cell. 2008, 13: 48-57. 10.1016/j.ccr.2007.12.008.
- Griffiths-Jones S: miRBase: the microRNA sequence database. Methods Mol Biol. 2006, 342: 129-38.
- Yang YH: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002, 30: e15. 10.1093/nar/30.4.e15.
- Bolstad BM, Irizarry RA, Astrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003, 19: 185-93. 10.1093/bioinformatics/19.2.185.
- Lockhart DJ: Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 1996, 14: 1675-80. 10.1038/nbt1296-1675.
- Hughes TR: Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol. 2001, 19: 342-7. 10.1038/86730.
- Irizarry RA, Cope LM, Wu Z: Feature-level exploration of a published Affymetrix GeneChip control dataset. Genome Biol. 2006, 7: 404. 10.1186/gb-2006-7-8-404.
- Cope LM, Irizarry RA, Jaffee HA, Wu Z, Speed TP: A benchmark for Affymetrix GeneChip expression measures. Bioinformatics. 2004, 20: 323-31. 10.1093/bioinformatics/btg410.
- Irizarry RA, Wu Z, Jaffee HA: Comparison of Affymetrix GeneChip expression measures. Bioinformatics. 2006, 22: 789-94. 10.1093/bioinformatics/btk046.