Analytical variables influencing the performance of a miRNA based laboratory assay for prediction of relapse in stage I non-small cell lung cancer (NSCLC)

Dahlgaard, Jesper; Mazin, Wiktor; Jensen, Thomas; Pøhl, Mette; Bshara, Wiam; Hansen, Anker; Kanisto, Eric; Hamilton-Dutoit, Stephen Jacques; Hansen, Olfred; Hager, Henrik; Ditzel, Henrik J; Yendamuri, Sai; Knudsen, Steen

doi:10.1186/1756-0500-4-424

Research article
Open access
Published: 19 October 2011

BMC Research Notes volume 4, Article number: 424 (2011)

Abstract

Background

Laboratory assays are needed for early stage non-small cell lung cancer (NSCLC) that can link molecular and clinical heterogeneity to predict relapse after surgical resection. We technically validated two miRNA assays for prediction of relapse in NSCLC. Total RNA from seventy-five formalin-fixed and paraffin-embedded (FFPE) specimens was extracted, labeled and hybridized to Affymetrix miRNA arrays using different RNA input amounts, ATP-mix dilutions, array lots, and RNA extraction and labeling methods in a total of 166 hybridizations. Two combinations of RNA extraction and labeling methods (assays I and II) were applied to a cohort of 68 early stage NSCLC patients.

Results

RNA input amount and RNA extraction and labeling methods affected signal intensity and the number of detected probes and probe sets, and caused large variation, whereas different ATP-mix dilutions and array lots did not. Leave-one-out accuracies for prediction of relapse were 63% and 73% for the two assays. Prognosticator calls ("no recurrence" or "recurrence") were consistent, independent of RNA amount, ATP-mix dilution, array lot and RNA extraction method. The calls were not robust to changes in labeling method.

Conclusions

In this study, we demonstrate that some analytical conditions, such as RNA extraction and labeling methods, are important for the variation in assay performance whereas others are not. Thus, careful optimization that addresses all analytical steps and variables can improve the accuracy of prediction and facilitate the introduction of microRNA arrays in the clinic for prediction of relapse in stage I non-small cell lung cancer (NSCLC).

Background

Early stage non-small cell lung cancer (NSCLC) is characterized by both clinical and molecular genetic heterogeneity, with five-year recurrence and survival rates of 50% and 73%, respectively [1]. Although several randomized studies have been performed, the use of adjuvant chemotherapy for stage I NSCLC is still controversial [2], and surgical resection remains the primary treatment for this disease.

However, in spite of tumor heterogeneity, new techniques in molecular profiling [3–5] can supplement clinical and pathologic observations and help to identify patients with a particularly poor prognosis. This can be useful both for intensified follow-up and for administering therapy specifically to patients at a high risk of recurrence [6].

In this study, we performed global microarray expression profiling targeting several small non-coding RNA species including microRNAs (miRNAs). MicroRNAs are small non-coding RNAs of approximately 18-25 nucleotides in length that regulate gene expression at the post-transcriptional level by base pairing with mRNA, leading to either translational repression [7] or mRNA degradation [8–10]. MicroRNAs have been estimated to regulate up to 30% of all human genes [11], and frequently reside in cancer-associated genomic regions [12]. Deregulation of miRNA expression plays a direct role in oncogenesis, and in differentiation and progression in cancer, in part because deregulation can change the expression of oncogenes and tumour suppressor genes [13]. Strong deregulation of miRNA expression has been seen in several forms of cancer, including lung carcinoma [4], and several studies have suggested that miRNA profiling can be used for prognostication in lung cancer [3–6].

The enhanced stability of microRNAs compared with mRNAs allows expression profiling in routinely stored formalin-fixed and paraffin-embedded (FFPE) specimens, including samples that are more than ten years old [14]. Large FFPE archives exist in diagnostic pathology departments throughout the world. When linked to clinical data, they represent an invaluable biobank resource for exploring the association between molecular changes in tumors and clinical endpoints such as relapse or survival after surgery. Furthermore, in the case of early stage NSCLC, FFPE specimens will be available for most patients. Therefore, it is realistic to use miRNAs and non-coding RNAs as biomarkers for prognosis in stage I NSCLC, once a prognostic signature has been clinically validated.

In order to reach this goal, carefully conducted studies are needed [15, 16], incorporating well defined experimental procedures that may eventually lead to the development of clinically validated applications allowing for individual treatment strategies in early stage NSCLC. Previously, the Microarray Quality Control (MAQC) study [17] focused on the entire process from sample handling, through laboratory and assay conditions, to data normalization and bioinformatics. This demonstrated the scope and significant potential of microarray technology for the clinic [18] when performed under careful and well-defined experimental conditions.

In this study, we compared two laboratory assays for prognostication in stage I NSCLC based on miRNA profiling in FFPE tissue specimens. To perform an objective evaluation [16] of the different reagents, array products and protocols, we examined several analytical conditions (figure 1) including: i) seven different RNA input amounts using one RNA preparation of a single tumor specimen, ii) three different ATP-mix dilutions using two RNA preparations of two tumor specimens, iii) two different array lot numbers using one RNA preparation of a single tumor specimen, iv) two different RNA extraction kits using eight RNA preparations of four tumor specimens, and v) two different RNA labeling kits using four RNA preparations of four tumor specimens in eight labeling reactions. In addition, RNA was extracted twice from the same specimens in a cohort of more than 60 NSCLC patients in a direct comparison of the two assays. Thus, 139 RNA extractions and 166 hybridizations were performed from a total of 75 NSCLC specimens. To assess the impact of any variation in the assay-specific analytical conditions, principal component analysis (PCA) was performed. In addition, prognosticator calls (i.e. "recurrence" or "no recurrence") were examined after varying the analytical conditions for selected samples.

Results

RNA input amount

A linear regression model showed that the amount of purified small RNA used for hybridization significantly affected mean signal intensity, the number of detected probes, and the number of detected probe sets (b_signal = 0.03 ± 0.01, t = 2.6, p < 0.05, R² = 0.58; b_probes = 3.64 ± 0.60, t = 6.0, p < 0.01, R² = 0.88; b_{probe sets} = 0.94 ± 0.12, t = 7.6, p < 0.001, R² = 0.88; figures 2, 3 and 4). The amount of total RNA used for hybridization also affected the number of detected probes (mean_{100 ng} = 3674 vs. mean_{600 ng} = 4627, t = -2.04, df = 18, p = 0.05; figure 5) and the number of detected probe sets (mean_{100 ng} = 803 vs. mean_{600 ng} = 1134, t = -2.41, df = 18, p < 0.05; figure 6). Self-self correlations in probe signal intensities between arrays hybridized to different amounts of the same RNA preparation varied across different combinations of RNA input amount (table 1). In addition, a linear regression analysis revealed that self-self correlations in probe signal intensities decreased as the ratio in RNA input amounts between pairs increased, considering all pairwise combinations (b_DevRNAinput = -0.0042 ± 0.0004, t = -10.0, p < 0.001, R² = 0.95; figure 7).
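
The regression statistics reported above (slope b with its standard error, t and R²) can be illustrated with a minimal sketch of ordinary least squares with a single predictor. This is not the authors' analysis code: the function name `simple_linear_regression` and the dictionary keys are assumptions made for illustration, and the p-value is left out because it needs a t-distribution that the Python standard library does not provide.

```python
import math


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    x: predictor values (e.g. RNA input amounts), y: responses
    (e.g. number of detected probes). Names are illustrative only.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares and the share of variance explained.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / syy

    # Standard error of the slope, with n - 2 residual degrees of freedom.
    slope_se = math.sqrt(ss_res / (n - 2) / sxx)
    t = slope / slope_se if slope_se > 0 else math.inf

    return {
        "slope": slope,
        "intercept": intercept,
        "slope_se": slope_se,
        "t": t,
        "r_squared": r_squared,
        "df": n - 2,
    }
```

Called with the RNA input amounts as `x` and, for example, the number of detected probes per array as `y`, the returned `slope`, `slope_se`, `t` and `r_squared` correspond to the b ± se, t and R² values quoted in the text.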

Table 1 Correlations in signal intensities across probes from different arrays that were hybridized to four different amounts of RNA (ATP-mix dilution, 1:50) from a single preparation of a T2 NSCLC tumor.

ATP-mix dilution

The effect of ATP-mix dilution was not significant in a linear regression model (results not shown) when analyzing six hybridizations with RNA from two NSCLC specimens, each labeled using three different ATP-mix dilutions. Thus, mean signal intensity, background intensity, the number of detected probes, and the number of detected probe sets were stable across the tested range (table 2). Self-self correlation coefficients in probe signal intensities between arrays with RNA labeled at different ATP-mix dilutions were invariant across the tested range (table 3). Thus, there was no association between self-self correlations and the ratio of ATP-mix dilutions among arrays hybridized to RNA labeled at different ATP-mix dilutions, considering all pairwise combinations (results not shown).

Table 2 Mean (± SEM) signal intensity, background intensity, number of detected probes and number of detected probe sets in hybridizations using 600 ng RNA from two NSCLC specimens (duplicates), each labeled using three different ATP-mix dilutions (1:50, 1:150 and 1:500).

Table 3 Self-self correlations (± SD) in signal intensities across probes in six hybridizations using 600 ng RNA from two NSCLC specimens with each specimen labeled with three different ATP-mix dilutions (1:50; 1:150 and 1:500).

Different chip lot numbers

Hybridizations (in triplicate) with labeled RNA from a single T2 NSCLC tumor revealed that signal intensity and the number of detected probes and probe sets were not significantly different across different lot numbers (results not shown). In addition, the observed self-self correlation coefficient across probe signal intensities within and between lots did not vary (table 4). In particular, the self-self correlation coefficient across probe signal intensities within one lot of arrays (cc = 0.973) was similar to the estimated average correlation between two different lots of arrays (cc = 0.965; 95% C.I. = 0.92-1.01).
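
The within-lot versus between-lot comparison can be sketched as follows. This is a minimal illustration that assumes the self-self correlation is a Pearson coefficient computed across probes for every pair of arrays; the text does not state which coefficient or which intensity scale was used. The function names and the `arrays`/`lots` dictionaries are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    saa = sum((v - mean_a) ** 2 for v in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        raise ValueError("each vector must vary")
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def within_between_correlations(arrays, lots):
    """Split all pairwise correlations into within-lot and between-lot.

    arrays: dict mapping array name -> list of probe intensities
    lots:   dict mapping array name -> lot identifier
    """
    within, between = [], []
    for first, second in combinations(sorted(arrays), 2):
        r = pearson(arrays[first], arrays[second])
        if lots[first] == lots[second]:
            within.append(r)
        else:
            between.append(r)
    return within, between
```

Averaging each returned list gives one within-lot and one between-lot value of the kind compared in the text; a confidence interval for the between-lot mean would need a further step that is not shown here.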

Table 4 Correlations in signal intensities across probes from two different lots of arrays that were hybridized (in triplicate) to 100 ng of labeled RNA (ATP-mix dilution 1:50) from a single RNA preparation of a T2 NSCLC tumor.

Comparisons of two different RNA extraction kits

Mean intensity ± se (x_RecoverAll = 247.7 ± 26.9 vs. x_HighPure = 190.1 ± 8.2), the number of detected probes ± se (x_RecoverAll = 9407.8 ± 98.0 vs. x_HighPure = 7733.5 ± 671.1) and the number of detected probe sets ± se (x_RecoverAll = 2328.8 ± 27.9 vs. x_HighPure = 2088.5 ± 150.0) in hybridizations with total RNA extracted using the RecoverAll (RA) kit all exceeded those for the HighPure (HP) kit (figures 8 and 9), although the difference was significant only for the number of detected probes (ANOVA; F_1,6 = 6.09, P < 0.05). Background intensity was not significantly different between the kits (x_RecoverAll = 56.3 ± 2.5 vs. x_HighPure = 51.7 ± 2.0; figure 8). PCA, considering the expression of all human ncRNAs as well as that of a specific miRNA signature for prognostication, demonstrated that a major proportion of the variance could be assigned to the two RNA extraction methods (i.e. between-kit variance), as revealed by the first principal component (PC1; figures 10 and 11).
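
How a first principal component can expose between-kit variance is sketched below. This is a rough illustration, not the software used in the study: it finds PC1 by power iteration on the sample-by-sample Gram matrix of column-centred data, which is convenient when there are far more probe sets than arrays. The function name and the input layout (one row per array, one column per probe set) are assumptions.

```python
import math


def first_principal_component(samples, iterations=500):
    """Return (PC1 scores per sample, fraction of variance explained).

    samples: list of equal-length lists, one row per array.
    """
    n = len(samples)
    p = len(samples[0])
    col_means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - col_means[j] for j in range(p)] for row in samples]

    # Gram matrix (n x n); it shares its non-zero eigenvalues with the
    # probe-space scatter matrix of the centred data.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    total = sum(gram[i][i] for i in range(n))
    if total == 0:
        raise ValueError("data have no variance")

    # Power iteration from a fixed, non-constant start vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            raise ValueError("start vector lies in the null space")
        vec = [v / norm for v in nxt]

    eigenvalue = sum(
        vec[i] * sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)
    )
    scores = [math.sqrt(eigenvalue) * v for v in vec]
    return scores, eigenvalue / total
```

If the arrays from one extraction kit receive PC1 scores of one sign and the arrays from the other kit scores of the opposite sign, and the explained fraction is large, most of the variance is attributable to the kit, which is the pattern the text describes.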

Comparisons of two different RNA labeling kits

For hybridizations with RNA labeled using the FlashTag Biotin HSR labeling kit, mean intensity ± se (x_{FlashTag HSR} = 215.1 ± 18.1 vs. x_FlashTag = 68.3 ± 2.9; ANOVA; F_1,6 = 64.4, P < 0.001), the number of detected probes ± se (x_{FlashTag HSR} = 8153 ± 302 vs. x_FlashTag = 6079 ± 323; ANOVA; F_1,6 = 21.9, P < 0.01) and the number of detected probe sets ± se (x_{FlashTag HSR} = 2139 ± 87 vs. x_FlashTag = 1527 ± 61; ANOVA; F_1,6 = 33.5, P < 0.01) all exceeded those in hybridizations with RNA labeled using the FlashTag Biotin labeling kit (figures 12 and 13). Background intensity was also significantly different between the kits (x_{FlashTag HSR} = 77.6 ± 8.0 vs. x_FlashTag = 34.8 ± 0.9; ANOVA; F_1,6 = 28.1, P < 0.01). A major proportion of the variance could be assigned to the different labeling methods (between-kit variance), as revealed by PC1 in the PCA (figures 14 and 15). PCA also revealed that the variance for samples labeled with the old labeling methods was very small (compressed).
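
The F statistics quoted with 1 and 6 degrees of freedom are consistent with a one-way ANOVA on two groups of four hybridizations each. A minimal sketch of that computation follows; the function name is an assumption, and the p-value is omitted because it requires an F-distribution that the Python standard library does not provide.

```python
def one_way_anova(groups):
    """One-way ANOVA F statistic.

    groups: list of lists, one list of measurements per group
    (e.g. mean intensities per labeling kit).
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need >= 2 non-empty groups and spare observations")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        sum((v - m) ** 2 for v in g) for g, m in zip(groups, group_means)
    )
    if ss_within == 0:
        raise ValueError("no within-group variation")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With two kits and four arrays per kit, the function returns degrees of freedom (1, 6), matching the F_1,6 notation in the text.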

Assay I and II for prognostication in stage I NSCLC samples

By performing 1000 Monte Carlo simulations we obtained a prognostic accuracy of 60.0% (95% C.I.: 59.5% - 60.5%) for assay I and 62.6% (95% C.I.: 61.9% - 63.2%) for assay II (p = 9.82e-10 for the hypothesis that the accuracy is similar for the two assays). Nested leave-one-out cross-validation (LOOCV) that optimized the number of selected non-coding RNAs in a separate loop resulted in an LOOCV accuracy of 63% for assay I and 73% for assay II. A multivariate analysis examined the effects of the miRNA chip based prognosis (i.e. "recurrence" or "no recurrence"), age, smoking status, stage (Ia or Ib) and histology (squamous or adeno) on recurrence after surgery. Only the miRNA based prognosticator was significant (P_{miRNA Prognosis} = 0.009; P_Age = 0.656, P_Smoking = 0.146, P_Stage = 0.921, P_Histology = 0.732). Figure 16 shows the predictions against a Kaplan-Meier time-to-recurrence plot (LOOCV accuracy of 73%, p < 0.001). The two miRNA lists obtained did not overlap, and the list obtained from one assay could not predict the other assay.
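
The idea of nested LOOCV, in which the number of selected features is tuned in an inner loop that never sees the held-out sample, can be sketched as follows. The classifier here is an assumption: a nearest-centroid rule on the k features with the largest difference in class means stands in for whatever model the authors used, which this section does not specify. All function names are hypothetical, and labels are coded 0 ("no recurrence") and 1 ("recurrence").

```python
def class_means(rows, labels, label):
    """Per-feature mean over the rows that carry the given label."""
    members = [r for r, l in zip(rows, labels) if l == label]
    return [sum(col) / len(members) for col in zip(*members)]


def train(rows, labels, k):
    """Pick the k features with the largest class-mean difference."""
    m0 = class_means(rows, labels, 0)
    m1 = class_means(rows, labels, 1)
    ranked = sorted(range(len(m0)), key=lambda j: abs(m1[j] - m0[j]),
                    reverse=True)
    return ranked[:k], m0, m1


def predict(model, row):
    """Assign the class whose centroid is closer on the selected features."""
    feats, m0, m1 = model
    d0 = sum((row[j] - m0[j]) ** 2 for j in feats)
    d1 = sum((row[j] - m1[j]) ** 2 for j in feats)
    return 1 if d1 < d0 else 0


def loocv_accuracy(rows, labels, k):
    """Plain LOOCV; feature selection is redone inside every fold."""
    hits = 0
    for i in range(len(rows)):
        tr = rows[:i] + rows[i + 1:]
        tl = labels[:i] + labels[i + 1:]
        hits += predict(train(tr, tl, k), rows[i]) == labels[i]
    return hits / len(rows)


def nested_loocv_accuracy(rows, labels, k_values):
    """Outer LOOCV estimates accuracy; inner LOOCV chooses k."""
    hits = 0
    for i in range(len(rows)):
        tr = rows[:i] + rows[i + 1:]
        tl = labels[:i] + labels[i + 1:]
        best_k = max(k_values, key=lambda k: loocv_accuracy(tr, tl, k))
        hits += predict(train(tr, tl, best_k), rows[i]) == labels[i]
    return hits / len(rows)
```

Each class needs at least three samples so that both classes remain represented after the outer and inner hold-outs. Keeping both feature selection and the choice of k inside the training folds is what prevents the held-out sample from influencing the model that predicts it.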

Impact of the analytical conditions on the robustness of the prognosticator

Prognosticator calls (i.e. "recurrence" or "no recurrence") were consistent independent of the RNA amount, ATP-mix dilution, chip lot number and RNA extraction method being used. In contrast, the calls were not robust to changes in labeling method (table 5).

Table 5 Prognosticator calls (0 = "no recurrence" or 1 = "recurrence") were examined for varying RNA input amounts (a single T2 NSCLC specimen), ATP-mix dilutions (two NSCLC specimens), chip lot numbers (a single T2 NSCLC specimen), RNA extraction kits (four NSCLC specimens; HP = HighPure, Roche; RA = RecoverAll, Ambion) and RNA labeling kits (four NSCLC specimens; F = FlashTag and FH = FlashTag HSR, Genisphere) for selected samples using the prognostic profile of assay II.

Discussion

Validation of a microarray based laboratory assay poses two technical challenges: first, ensuring that data are acquired with the best laboratory proficiency; and second, ensuring that data are analyzed appropriately. For a chip based prognostic assay to be practically useful and accurate for prognostication in NSCLC, attention must be paid to the concordance of expression measurements and the impact of variation across analytical conditions. Here we assessed the impact of variation in several analytical conditions, including RNA input amount, ATP-mix dilution, chip lot number, RNA extraction kit and RNA labeling kit.