E2F1 and KIAA0191 expression predicts breast cancer patient survival
© Hassell et al; licensee BioMed Central Ltd. 2011
Received: 26 October 2010
Accepted: 31 March 2011
Published: 31 March 2011
Gene expression profiling of human breast tumors has uncovered several molecular signatures that can divide breast cancer patients into good and poor outcome groups. However, these signatures typically comprise many genes (~50-100), and the prognostic tests associated with identifying these signatures in patient tumor specimens require complicated methods, which are not routinely available in most hospital pathology laboratories, thus limiting their use. Hence, there is a need for more practical methods to predict patient survival.
We modified a feature selection algorithm and used survival analysis to derive a 2-gene signature that accurately predicts breast cancer patient survival.
We developed a tree-based decision method that segregated patients into various risk groups using KIAA0191 expression in the context of E2F1 expression levels. This approach led to highly accurate survival predictions in a large cohort of breast cancer patients using only a 2-gene signature.
Our observations suggest a possible relationship between E2F1 and KIAA0191 expression that is relevant to the pathogenesis of breast cancer. Furthermore, our findings raise the prospect that the practicality of patient prognosis methods may be improved by reducing the number of genes required for analysis. Indeed, our E2F1/KIAA0191 2-gene signature would be highly amenable to an immunohistochemistry-based test, a type of test that is commonly used in hospital laboratories.
Traditionally, a variety of clinical and histopathological characteristics have been employed to make predictions regarding the potential clinical outcomes of breast cancer patients. However, the advent of gene expression profiling technologies has enabled the use of molecular signatures to provide improved predictions of clinical outcome over traditional methods [1–5]. These signatures typically comprise many genes and require profiling their expression by measuring the abundance of their respective mRNA transcripts [3–5]. A major issue concerning the use of molecular signatures to provide prognostic information for cancer patients is that transcript profiling tests require personnel with specialized training, as well as expensive reagents and equipment. These platforms are not routinely available in hospital pathology laboratories, which necessitates shipping tumor samples to an appropriately equipped laboratory, thereby increasing the time and cost of carrying out these tests. We hypothesize that identifying gene signatures that comprise 2-3 genes would enable the development of highly practical immunohistochemistry-based tests, which are commonly used in hospital-based pathology laboratories.
Because the expression of proliferation-associated genes has been shown to group breast cancer patients into good and poor risk groups, we sought to identify genes whose expression could increase the predictive accuracy of the proliferation gene, E2F1. E2F1 encodes a transcription factor that regulates the expression of target genes whose products participate in numerous processes such as DNA replication, the mitotic checkpoint, mitosis, DNA damage checkpoints, and DNA repair [6–8]. Generally, E2F1 is bound to and functionally inactivated by pRB; however, proliferative signals induce the phosphorylation of pRB by cyclin D/CDK4/6 complexes, leading to the dissociation of pRB from E2F1 and the subsequent activation of E2F1 target genes. In line with these observations, over-expression of E2F1 or various other members of the E2F gene family forces the re-entry of quiescent cells into S phase.
Using an algorithm we published recently, we found that the expression of KIAA0191 transcripts can be used in conjunction with that of E2F1 to predict breast cancer patient survival more accurately than E2F1 expression alone. KIAA0191, commonly known as TUT4 or ZCCHC11, encodes a canonical poly(A) polymerase, whose function involves the polyadenylation of pre-mRNA in the nucleus. KIAA0191 has been shown to work in concert with Lin28 to suppress microRNA biogenesis through uridylation of pre-microRNA. Importantly, KIAA0191 function has not been previously linked to E2F1. Here we demonstrate that the expression of KIAA0191 transcripts alone is not related to breast cancer patient survival. However, in the context of average to high E2F1 transcript levels, high KIAA0191 expression was linked to poor breast cancer patient prognosis, whereas low KIAA0191 expression was linked to good outcome for these patients. Our study thus identified a potentially novel functional relationship between E2F1 and KIAA0191, which may be of clinical relevance to breast cancer patients.
Microarray and clinical data
We used data from the Stanford microarray repository (downloaded from http://microarray-pubs.stanford.edu/wound_NKI/explore.html) for our analyses. From the same location, we also downloaded a matrix containing clinical data for the patients who provided the samples for the microarray profiles used in the present study. We created a master data matrix by combining the gene expression profiles with indices for survival and metastasis for each patient. Patients included within this cohort had either stage I or II breast cancer and were less than 53 years of age. Lymph-node-positive and lymph-node-negative disease each accounted for approximately 50% of cases.
Identification of genes that enhance the predictive power of E2F1
To discover genes that might improve the capacity of E2F1 transcript levels to predict the prognosis of human breast cancer patients, we first ranked the level of gene expression for each gene in every patient's breast tumor as described previously. We then adapted the approach we used previously: instead of searching for genes whose expression was related to patient survival, we modified the algorithm to search for genes whose expression was predictive of patient survival in combination with that of E2F1. We then ranked all the genes present in the expression profiles using a scoring technique published previously.
Survival and statistical analysis
Unless otherwise indicated, all survival analyses and associated statistical tests were completed using GraphPad Prism 5™ software. Harrell's concordance index (C-index) was calculated using the Hmisc package in R.
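For readers unfamiliar with the C-index, the following is a minimal Python sketch of the general idea behind Harrell's statistic for right-censored survival data. It is not the Hmisc implementation used in this study, and the function name `harrell_c_index`, its arguments, and the decision to skip pairs with tied survival times are illustrative assumptions. A pair of patients is usable when the patient with the shorter follow-up time experienced the event; the pair is concordant when that patient also has the higher predicted risk.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Sketch of Harrell's C-index (hypothetical helper, not the Hmisc code).

    times       -- follow-up time for each patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- predicted risk; a higher score means a worse expected outcome
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable
```

A value of 0.5 corresponds to predictions no better than chance, and 1.0 to a perfect ordering of patients by risk.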
Selection of random genes
Randomly selected genes were obtained by using a random number generator (http://www.random.org).
E2F1 expression accurately groups patients into good and poor outcome groups
We sought to improve the capacity of a small number of genes to correctly divide breast cancer patients into good and poor prognosis groups. We started with a candidate gene approach, a methodology used in previous studies. We chose to begin with E2F1, as its transcript levels are reportedly prognostic in human breast cancer, and because the E2F1 protein stimulates tumor cell proliferation, a process that is inversely correlated with breast cancer patient survival [6, 8, 14–16]. We also reasoned that a search for genes whose expression enhanced the prognostic power of E2F1 transcript levels might uncover genes whose products interact directly or indirectly with E2F1.
Because E2F1 transcript abundance alone was not completely accurate at classifying patients into good and poor prognosis groups, we sought to identify other genes whose expression could augment the predictive power of E2F1 transcript levels. We first defined high, average, and low E2F1 expression based on expression above, within, or below the 95% confidence interval for E2F1 expression among all 295 patients. We then modified the approach that we developed previously to find genes that were i) generally highly expressed in tumors where high to average E2F1 expression was indicative of poor patient survival, and ii) generally expressed at low levels in tumors where high to average E2F1 expression was not associated with poor patient survival. The most highly ranked candidate in the 295-patient cohort was KIAA0191, which is also commonly known as TUT4 or ZCCHC11.
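The three-way split of E2F1 expression described above can be sketched in Python as follows. The text does not specify how the 95% confidence interval was computed, so this sketch assumes a confidence interval of the cohort mean under a normal approximation (mean ± 1.96 standard errors); the function name `expression_classes` and the parameter `z` are hypothetical.

```python
import math
import statistics


def expression_classes(values, z=1.96):
    """Label each expression value as 'high', 'average', or 'low'.

    Assumption: the interval is a normal-approximation 95% confidence
    interval of the cohort mean (mean +/- z * standard error).
    """
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    lower, upper = mean - z * sem, mean + z * sem
    labels = []
    for value in values:
        if value > upper:
            labels.append("high")      # above the interval
        elif value < lower:
            labels.append("low")       # below the interval
        else:
            labels.append("average")   # within the interval
    return labels, (lower, upper)
```

Because the standard error shrinks as the cohort grows, an interval built this way becomes narrow in a large cohort, so relatively few tumors would fall into the "average" class; a different interval definition would shift these boundaries.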
Integration of KIAA0191 into E2F1 expression based prognosis decision-making
Prognostic tests that identify high- and low-risk cases of breast cancer help identify patients who can be spared unnecessary chemotherapy. For example, several clinical trials, including the National Surgical Adjuvant Breast and Bowel Project trials B-14 and B-20, have shown that adding chemotherapy to tamoxifen treatment increases survival in node-negative, estrogen-receptor-positive breast cancer patients [21–23]. However, the 10-year recurrence rate with tamoxifen treatment alone is only 15%; therefore, if all patients were to receive additional chemotherapy, 85% of patients would derive little benefit from it while still suffering its deleterious side effects.
In an attempt to spare patients unnecessary chemotherapy, treatment decisions have traditionally been based primarily on classical histopathological and immunohistochemical techniques. However, within the last several years, many genomic-based molecular signatures have been derived that correlate gene expression in tumor tissue with breast cancer recurrence [2–5]. Importantly, many of these gene signatures assign risk to breast cancer patients more accurately than conventional criteria. However, a practical limitation of these signatures is that assays of transcript abundance require relatively intact RNA, as well as expensive equipment and technical expertise, which are unavailable in most hospital pathology laboratories. Hence, tumor specimens are commonly shipped to specialized clinical laboratories, thereby increasing the turn-around time and cost of these tests. For these reasons, we sought to determine whether we could generate relatively small gene signatures (2-3 genes) that might yield accurate prognostic information. Indeed, a signature comprising 2-3 genes might be developed into an immunohistochemistry assay, which could be carried out in hospital-based pathology laboratories, thereby saving both time and cost.
We began our experiments by choosing a single gene using a candidate gene approach. Because tumor cell proliferation is linked to poor survival in breast cancer patients, we first tested whether the expression of the single "proliferation" gene, E2F1, was also linked to survival in breast cancer patients [5, 6, 8, 9, 15, 16, 18]. The observation that high expression of E2F1 transcripts indicated poor overall patient survival in the dataset used for this study is unsurprising, given that tumor cell proliferation is associated with poor patient survival in other large breast cancer patient datasets, and low E2F1 transcript levels have previously been linked to good patient survival [13, 18].
We next sought to identify additional genes whose expression might augment the predictive accuracy of E2F1 expression such that a highly accurate 2-gene signature might be developed. Such genes would be useful for increasing the accuracy of genomic-based clinical outcome predictors, as well as for understanding E2F1-based proliferation programs in breast cancer cells. Our analyses revealed that KIAA0191 transcript abundance could be used in the context of average to high levels of E2F1 transcripts to more precisely predict breast cancer patient survival. However, in the context of low E2F1 transcript levels, KIAA0191 expression was not linked to patient outcome. These results suggest that there is a relationship between E2F1 and KIAA0191 expression that is predictive of patient outcome, and that both genes likely have complementary involvement in breast cancer progression. Importantly, the observation that the expression of other proliferation genes, such as AURKA [18, 20] and BUB1 [17, 19, 24], could be used to replace E2F1 suggests that the relationship of KIAA0191 expression to patient survival is linked to cell proliferation. These observations point to a potentially novel functional relationship between cell proliferation and KIAA0191. This relationship appears to be important for the pathogenesis of breast cancer and warrants further investigation.
Using the data available for this study, it was not possible to measure the exact predictive accuracy of our 2-gene signature in an unbiased manner. In our initial analyses, the predictive power of the E2F1/KIAA0191 2-gene signature looks promising (High vs Low, HR: 10.2 [5.5-18.9], Harrell's C-index: 0.75; Medium vs Low, HR: 4.3 [2.2-8.4], Harrell's C-index: 0.71); however, future studies will need to replicate these findings using independent gene expression data sets.
An advantage of our 2-gene signature over currently available prognostic signatures is that it may be suitable for development as an immunohistochemistry-based test. As mentioned previously, immunohistochemistry-based tests are faster, cheaper, and more widely available to patients than the currently available mRNA-based tests. Furthermore, antibodies that recognize E2F1 and KIAA0191 are commercially available, and several protocols exist for the quantification of protein expression using immunohistochemistry. However, there are significant differences between the technology platforms used for gene and protein expression assays (differences in dynamic range and in linearity of the relationship to clinical outcome), and therefore genes that perform well using mRNA-based expression profiling technology may or may not perform as well in a protein expression-based immunohistochemical test. Beyond this issue, the exact correlation between mRNA and protein expression remains poorly studied, although some initial work suggests that the correlation is significant. As a result, it is important to note that this aspect of our study remains largely theoretical, as it is unclear how well such an immunohistochemical test would work for patient prognosis. To this end, validation of the 2-gene signature using immunohistochemistry is a major focus of our current studies.
A major implication of this study is that it is important to understand the context in which a gene's expression is most highly related to patient survival. For example, we observed that high E2F1 expression was most related to poor patient outcome when that patient's tumor also expressed high levels of KIAA0191. When KIAA0191 was not expressed at high levels, the relationship between high E2F1 transcript levels and poor outcome was significantly reduced. In line with these observations, average levels of E2F1 expression were associated with poor patient outcome when KIAA0191 was highly expressed, and good patient outcome when KIAA0191 was expressed at low levels. Indeed, we took advantage of this relationship to generate a 2-gene based decision tree, which made highly accurate predictions about patient outcome, while only taking into account the expression of 2 genes.
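The logic of such a 2-gene decision tree can be illustrated with the short Python sketch below. It is a simplified illustration, not the decision tree used in this study: the function name `two_gene_outcome`, the class labels, the dichotomization of KIAA0191 into "high" and "low", and the assignment of low-E2F1 tumors to the good-outcome branch are all assumptions made for the example.

```python
def two_gene_outcome(e2f1_class, kiaa0191_class):
    """Illustrative 2-gene decision rule (hypothetical helper).

    e2f1_class     -- 'low', 'average', or 'high' E2F1 expression
    kiaa0191_class -- 'low' or 'high' KIAA0191 expression (assumed two-way split)
    Returns 'good' or 'poor' expected outcome.
    """
    if e2f1_class not in ("low", "average", "high"):
        raise ValueError(f"unknown E2F1 class: {e2f1_class!r}")
    if kiaa0191_class not in ("low", "high"):
        raise ValueError(f"unknown KIAA0191 class: {kiaa0191_class!r}")

    if e2f1_class == "low":
        # KIAA0191 was not linked to outcome when E2F1 was low, so it is
        # not consulted here. Assumption: low E2F1 maps to a good outcome.
        return "good"
    # Average to high E2F1: KIAA0191 expression decides the branch.
    return "poor" if kiaa0191_class == "high" else "good"
```

For example, `two_gene_outcome("average", "high")` follows the poor-outcome branch, whereas `two_gene_outcome("average", "low")` follows the good-outcome branch. The three risk groups (High, Medium, Low) reported for the signature are not reproduced by this two-outcome sketch.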
We envision that the identification of gene signatures, which are highly predictive, but consist of relatively few genes (2-3 genes), would allow the use of immunohistochemical or immunofluorescent based assays that are commonly used in hospital-based pathology laboratories to readily guide the use of chemotherapeutics in breast cancer patients. Importantly, immunohistochemical or immunofluorescent testing does not require long distance transfer of tumor samples to molecular profiling facilities (as is the case for MammaPrint™ and Oncotype DX) and thus would provide a less time-consuming and less costly means of providing prognostic information to breast cancer patients.
This work was generously supported by a grant from the Canadian Stem Cell Network. We graciously thank Dr. Anna Dvorkin for helpful statistical analysis.
- Sotiriou C, Pusztai LC: Gene-expression signatures in breast cancer. N Engl J Med. 2009, 360 (8): 790-800. 10.1056/NEJMra0801289.
- van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, et al: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002, 347 (25): 1999-2009. 10.1056/NEJMoa021967.
- van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415 (6871): 530-536.
- Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, et al: A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004, 351 (27): 2817-2826. 10.1056/NEJMoa041588.
- Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, et al: Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006, 98 (4): 262-272. 10.1093/jnci/djj052.
- DeGregori J, Leone G, Miron A, Jakoi L, Nevins JR: Distinct roles for E2F proteins in cell growth control and apoptosis. Proc Natl Acad Sci USA. 1997, 94 (14): 7245-7250. 10.1073/pnas.94.14.7245.
- Polager S, Ginsberg D: E2F - at the crossroads of life and death. Trends Cell Biol. 2008, 18 (11): 528-535. 10.1016/j.tcb.2008.08.003.
- DeGregori J, Kowalik T, Nevins JR: Cellular targets for activation by the E2F1 transcription factor include DNA synthesis- and G1/S-regulatory genes. Mol Cell Biol. 1995, 15 (8): 4215-4224.
- DeGregori J, Johnson DG: Distinct and Overlapping Roles for E2F Family Members in Transcription, Proliferation and Apoptosis. Curr Mol Med. 2006, 6 (7): 739-748.
- Hallett RM, Dvorkin A, Gabardo CM, Hassell JA: An algorithm to discover gene signatures with predictive potential. J Exp Clin Cancer Res. 29 (1): 120. 10.1186/1756-9966-29-120.
- Heo I, Joo C, Kim YK, Ha M, Yoon MJ, Cho J, Yeom KH, Han J, Kim VN: TUT4 in concert with Lin28 suppresses microRNA biogenesis through pre-microRNA uridylation. Cell. 2009, 138 (4): 696-708. 10.1016/j.cell.2009.08.002.
- Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA: Evaluating the yield of medical tests. JAMA. 1982, 247 (18): 2543-2546. 10.1001/jama.247.18.2543.
- Vuaroqueaux V, Urban P, Labuhn M, Delorenzi M, Wirapati P, Benz CC, Flury R, Dieterich H, Spyratos F, Eppenberger U, et al: Low E2F1 transcript levels are a strong determinant of favorable breast cancer outcome. Breast Cancer Res. 2007, 9 (3): R33. 10.1186/bcr1681.
- Ivshina AV, George J, Senko O, Mow B, Putti TC, Smeds J, Lindahl T, Pawitan Y, Hall P, Nordgren H, et al: Genetic reclassification of histologic grade delineates new clinical subtypes of breast cancer. Cancer Res. 2006, 66 (21): 10292-10301. 10.1158/0008-5472.CAN-05-4414.
- Slansky JE, Farnham PJ: Introduction to the E2F family: protein structure and gene regulation. Curr Top Microbiol Immunol. 1996, 208: 1-30.
- Dai H, van't Veer L, Lamb J, He YD, Mao M, Fine BM, Bernards R, van de Vijver M, Deutsch P, Sachs A, et al: A cell proliferation signature is a marker of extremely poor outcome in a subpopulation of breast cancer patients. Cancer Res. 2005, 65 (10): 4059-4066. 10.1158/0008-5472.CAN-04-3953.
- Klebig C, Korinth D, Meraldi P: Bub1 regulates chromosome segregation in a kinetochore-independent manner. J Cell Biol. 2009, 185 (5): 841-858. 10.1083/jcb.200902128.
- Haibe-Kains B, Desmedt C, Sotiriou C, Bontempi G: A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all?. Bioinformatics. 2008, 24 (19): 2200-2208. 10.1093/bioinformatics/btn374.
- Williams GL, Roberts TM, Gjoerup OV: Bub1: escapades in a cellular world. Cell Cycle. 2007, 6 (14): 1699-1704. 10.4161/cc.6.14.4493.
- Zhou H, Kuang J, Zhong L, Kuo WL, Gray JW, Sahin A, Brinkley BR, Sen S: Tumour amplified kinase STK15/BTAK induces centrosome amplification, aneuploidy and transformation. Nat Genet. 1998, 20 (2): 189-193. 10.1038/2496.
- Fisher B, Costantino J, Redmond C, Poisson R, Bowman D, Couture J, Dimitrov NV, Wolmark N, Wickerham DL, Fisher ER, et al: A randomized clinical trial evaluating tamoxifen in the treatment of patients with node-negative breast cancer who have estrogen-receptor-positive tumors. N Engl J Med. 1989, 320 (8): 479-484. 10.1056/NEJM198902233200802.
- Fisher B, Dignam J, Wolmark N, DeCillis A, Emir B, Wickerham DL, Bryant J, Dimitrov NV, Abramson N, Atkins JN, et al: Tamoxifen and chemotherapy for lymph node-negative, estrogen receptor-positive breast cancer. J Natl Cancer Inst. 1997, 89 (22): 1673-1682. 10.1093/jnci/89.22.1673.
- Fisher B, Jeong JH, Bryant J, Anderson S, Dignam J, Fisher ER, Wolmark N: Treatment of lymph-node-negative, oestrogen-receptor-positive breast cancer: long-term findings from National Surgical Adjuvant Breast and Bowel Project randomised clinical trials. Lancet. 2004, 364 (9437): 858-868. 10.1016/S0140-6736(04)16981-X.
- Boyarchuk Y, Salic A, Dasso M, Arnaoutov A: Bub1 is essential for assembly of the functional inner centromere. J Cell Biol. 2007, 176 (7): 919-928. 10.1083/jcb.200609044.
- Allred DC, Harvey JM, Berardo M, Clark GM: Prognostic and predictive factors in breast cancer by immunohistochemical analysis. Mod Pathol. 1998, 11 (2): 155-168.
- Kim C, Paik S: Gene-expression-based prognostic assays for breast cancer. Nat Rev Clin Oncol. 7 (6): 340-347. 10.1038/nrclinonc.2010.61.
- Guo Y, Xiao P, Lei S, Deng F, Xiao GG, Liu Y, Chen X, Li L, Wu S, Chen Y, et al: How is mRNA expression predictive for protein expression? A correlation study on human circulating monocytes. Acta Biochim Biophys Sin (Shanghai). 2008, 40 (5): 426-436. 10.1111/j.1745-7270.2008.00418.x.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.