- Short Report
- Open Access
Advanced spot quality analysis in two-colour microarray experiments
© Friederich et al; licensee BioMed Central Ltd. 2008
- Received: 20 June 2008
- Accepted: 17 September 2008
- Published: 17 September 2008
Image analysis of microarrays and, in particular, spot quantification and spot quality control, is one of the most important steps in the statistical analysis of microarray data. Current methods of spot quality control are still at an early stage of development, often leading to underestimation of true positive microarray features and, consequently, to loss of important biological information. Improving and standardizing the statistical approaches to spot quality control is therefore essential to facilitate the overall analysis of microarray data and the subsequent extraction of biological information.
We evaluated the performance of two image analysis packages, MAIA and GenePix (GP), using two complementary experimental approaches with a focus on the statistical analysis of spot quality factors. First, we developed control microarrays with a priori known fluorescence ratios to verify the accuracy and precision of the ratio estimation of signal intensities. Next, we developed advanced semi-automatic protocols of spot quality evaluation in MAIA and GP and compared their performance with the spot quantitative filtering facilities available in GP. We evaluated these algorithms for standardised spot quality analysis in a whole-genome microarray experiment assessing well-characterised transcriptional modifications induced by the transcription regulator SNAI1. Using a set of microarray data validated by RT-PCR or qRT-PCR, we found that the semi-automatic protocol of spot quality control we developed with MAIA recovered approximately 13% more spots and 38% more differentially expressed genes (at FDR = 5%) than GP with default spot filtering conditions.
Careful control of spot quality characteristics with advanced spot quality evaluation can significantly increase the amount of confident and accurate data resulting in more meaningful biological conclusions.
- Gene Ontology
- Differentially Express
- Log2 Ratio
- Differentially Express Gene
- Spot Quality
Microarray technology provides novel insights into different biological phenotypes by revealing genome-wide differences in gene expression profiles [1, 2]. Many efforts have been made to standardize microarray data analysis pipelines [3, 4]. Several initiatives such as the MAQC project showed that standardising data analysis procedures improved the performance of microarray platforms. A critical component of the microarray data analysis pipeline is image analysis. Any error made at this stage may propagate throughout the pipeline and invalidate final biological conclusions such as differential expression or gene network establishment. Among the various approaches aiming at improving microarray analysis, one of the most important and least formalized is the evaluation of the quality of the spots obtained in microarray experiments. Overly stringent spot quality requirements can filter out relevant spots and lose useful biological information. Conversely, overly lenient filtering conditions retain bad spots, leading to wrong predictions. This situation arises mainly when analysing weak or contaminated spots, which might nevertheless contain important biological information. Numerous studies aim at improving the control of microarray spot quality, including spot quality assessment and filtering, evaluation of normalisation procedures, missing value imputation, and comparison of different spot quality-assessing algorithms. However, there is still a lack of consistent and standardized methodology for microarray image analysis using advanced algorithms for automated spot quality evaluation.
Several software tools based on standardized, semi-automated strategies for microarray image analysis, such as AMIA, Matarray, MASQOT-GUI, Tiger Spotfinder, and MAIA, are currently available to academic users. GenePix Pro (Molecular Devices, Sunnyvale, CA, USA) is a representative commercial package that is routinely used. Among the reported software [for a brief overview see Additional file 1], GenePix (GP) and MAIA have distinct advantages. GP provides automated and user-friendly tools for microarray gridding, feature alignment, data management, and graphical representation of the results. GP also has functionality for spot quality analysis, in particular a filter system for flagging spots as "good" (Flag = 100) or "bad" (Flag = 0) based on a user-defined set of conditions on the GP parameters. However, this facility is not automated, and spot qualification in GP is highly dependent on user decisions. We chose MAIA as a representative example of automated spot quality treatment that retains spots containing useful biological information. MAIA implements a compact set of statistical algorithms for microarray image analysis, including algorithms for spot quality analysis at the pixel level. MAIA assigns to each ratio estimate a quality score ranging from 0 to 1. This score is calculated from 10 main quality characteristics reflecting different spot properties within the microarray.
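The GP flagging scheme described above can be illustrated with a minimal sketch. Only the flag values (100 for "good", 0 for "bad") come from the text. The `Spot` container, the choice of three parameters, and the threshold arguments are hypothetical and serve only to show how a user-defined set of conditions maps onto a flag; this is not GP's own implementation.

```python
from dataclasses import dataclass

GOOD_FLAG = 100  # GP flag value for a "good" spot, as stated in the text
BAD_FLAG = 0     # GP flag value for a "bad" spot, as stated in the text


@dataclass
class Spot:
    """Hypothetical container for a few per-spot GP measurements."""
    pct_above_b635_2sd: float  # % of pixels above B635 + 2SD
    pct_above_b532_2sd: float  # % of pixels above B532 + 2SD
    rgn_r2: float              # Rgn R2 (635/532)


def flag_spot(spot: Spot, min_pct: float, min_r2: float) -> int:
    """Return GOOD_FLAG only if the spot meets every user-defined limit.

    The limits are illustrative assumptions, not GP defaults.
    """
    passes = (
        spot.pct_above_b635_2sd >= min_pct
        and spot.pct_above_b532_2sd >= min_pct
        and spot.rgn_r2 >= min_r2
    )
    return GOOD_FLAG if passes else BAD_FLAG
```

Because every condition must hold for a spot to be flagged "good", a single poorly chosen limit is enough to discard a spot, which is why the outcome depends so strongly on user decisions.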
Here, we developed advanced spot quality evaluation methodologies for MAIA and GP. These approaches were evaluated experimentally and compared to the default parameter filtering settings provided in GP. The precision and accuracy of the spot quantification procedures were verified using microarrays with a priori known ratios of Cy5 to Cy3 intensities, and biological relevance was assessed by comparing differentially expressed genes and significantly over-represented gene ontology (GO) categories in a whole-genome transcriptomic microarray experiment. Our results show that the advanced spot quality evaluation methodologies developed in MAIA give slightly more accurate and precise Log2 ratios of signal intensities, allowing recovery of more useful spots and differentially expressed genes than the default spot filtering procedure in GP.
Methodology for spot quality evaluation
Semi-automatic pipeline in MAIA
Spot filtering in GP
Table 1. The GP parameters and limits for different filtering conditions in GP. Parameters: % > B635+2SD; % > B532+2SD; Rgn Ratio (635/532); Rgn R2 (635/532); F635 Median - B635; F532 Median - B532.
Microarray analysis pipeline
A brief overview of the steps of our microarray analysis pipeline that are common to MAIA and GP (normalization, preprocessing, and so on) is given below.
Experimental evaluation of standardized approaches for spot quality assessment using control microarrays with a priori known fluorescence ratios
Table. Mean Log2 ratio obtained from analysis of five control slides by MAIA and GP (column header: Calibrated Log2 ratio).
Collectively, our data show that MAIA generally gives slightly more accurate and precise Log2 ratios. An additional processing step or a proper calibration of the data is therefore needed in GP to smooth out the differences in Log2 ratios between the programs. This should allow conserving more informative spots in the follow-up analysis.
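One way to make the comparison against known ratios concrete is sketched below. It assumes that accuracy is summarised as the mean deviation of the observed Log2 ratios from the known value and precision as their standard deviation; the source does not state these definitions, and the function names are hypothetical. The background-subtracted ratio uses the 635 nm (Cy5) and 532 nm (Cy3) channel names that appear among the GP parameters.

```python
import math
from statistics import mean, stdev


def log2_ratio(f635, b635, f532, b532):
    """Background-subtracted log2(Cy5/Cy3) ratio for one spot.

    Returns None when either background-corrected signal is not
    positive, because the logarithm is then undefined.
    """
    red = f635 - b635
    green = f532 - b532
    if red <= 0 or green <= 0:
        return None
    return math.log2(red / green)


def accuracy_and_precision(observed, expected):
    """Return (bias, spread) of observed Log2 ratios.

    bias   = mean(observed) - expected  (assumed accuracy measure)
    spread = sample standard deviation  (assumed precision measure)
    """
    bias = mean(observed) - expected
    spread = stdev(observed)
    return bias, spread
```

A program with a smaller absolute bias and a smaller spread on the control slides would count as more accurate and more precise under these assumed definitions.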
Evaluation of the performances of the image analysis methodologies in a comparative gene expression study using whole-genome arrays
Although accuracy and precision of the ratio estimates are important for reliable follow-up analysis, the major problem in microarray studies comes from deficient spots that, when improperly treated, may obscure the final conclusions. However, while stringent filtering conditions eliminate such bad spots, they may also lead to the loss of good, informative spots.
Our artificial microarrays with known ratios are of very good overall quality and are therefore not appropriate for evaluating algorithms for quantitative characterization of various spot deficiencies or systematic distortions. To evaluate the developed filtering procedures, we used oligonucleotide microarrays measuring genes that are differentially expressed in human MCF-7 epithelial breast carcinoma cells after induction of the transcription regulator SNAI1 [see Additional file 2]. SNAI1 directly represses the expression of a set of genes, thereby triggering a well-described transcriptomic program that leads to the transition of epithelial cells to a mesenchymal phenotype. Because the functional categories of genes that are up- or down-regulated during this process are well characterized, we considered this experimental model suitable to further evaluate the performance of the image analysis procedures. We used cells transfected with the human SNAI1 cDNA cloned in a tetOff conditional expression system. Expression profiles before and after SNAI1 induction (time points 0 and 96 hours; the sample at time point 0 served as the reference) were analyzed using oligonucleotide two-color microarrays purchased from the "University Medical Center of Utrecht" (UMCU, The Netherlands). The microarray images were analysed either by MAIA or GP as described in Figure 2. We applied the semi-automatic approach for spot quality assessment in MAIA and four filtering conditions (standard, weak, medium, and strong, defined in Table 1) for spot quality assessment in GP. For the analysis in GP, we arbitrarily selected one microarray out of the 9 in the series and performed automatic gridding and spot quantification procedures. Weak, medium and strong/stringent filtering conditions were defined as described in Additional file 1 by considering 1, 2 and 3 STD borders in the distributions of the GP parameters (see Table 1). We used the default GP filtering parameters as standard.
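The idea of deriving filtering limits from STD borders of a parameter distribution can be sketched as follows. This is an illustration under stated assumptions: the exact procedure is given in Additional file 1 and is not reproduced here, so the two-sided interval, the use of the sample standard deviation, and the function names are all hypothetical. Which multiplier (1, 2 or 3) corresponds to the weak, medium and strong condition is likewise left to the caller rather than assumed.

```python
from statistics import mean, stdev


def sd_border_limits(values, k):
    """Return (lower, upper) limits at mean -/+ k sample standard deviations.

    `values` are the observed values of one GP parameter on the chosen
    microarray; `k` is the number of standard deviations (1, 2 or 3 in the text).
    """
    centre = mean(values)
    spread = stdev(values)
    return centre - k * spread, centre + k * spread


def within_limits(value, limits):
    """True if a spot's parameter value lies inside the given limits."""
    lower, upper = limits
    return lower <= value <= upper
```

A smaller multiplier gives a narrower interval and therefore rejects more spots, so the choice of multiplier directly sets how stringent the resulting filter is.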
Table. p-values of selected GO categories resulting from the different conditions of analysis. GO categories listed:
- Regulation of cell cycle
- Hormone receptor binding
- Wnt receptor activity
- Transcriptional repressor activity
- Transcriptional activator activity
- Vitamin D receptor binding
- Regulation of cell growth
Table. Comparison of MAIA and GP spot filtering approaches on a set of 24 selected genes confirmed by RT-PCR or qRT-PCR. Gene descriptions listed:
- Krueppel-like factor 5
- Tight junction protein ZO-3
- Keratin, type I cytoskeletal 12
- B-box and SPRY domain containing
- Signal-transducing adaptor protein 2
- Protein phosphatase 1 regulatory subunit 16A
- Keratin, type I cytoskeletal 18
- Tribbles homolog 3
- Thioredoxin interacting protein
- Homeobox protein MSX-1
- GULP, engulfment adaptor PTB domain containing 1
- Dual specificity protein phosphatase 2
- DNA-binding protein inhibitor ID-3
- Heparan sulfate 6-O-sulfotransferase 2
- Transforming growth factor-beta-induced protein ig-h3 precursor
- S 100A 10
- Calpactin I light chain
- Collagen-binding protein 2 precursor
- Zinc finger protein SLUG
- Collagen alpha 1(V) chain precursor
Altogether, our data indicate that MAIA is a robust microarray image analysis program allowing more accurate spot quantification and an improved collection of significant and relevant DE genes compared to GP. When considering GO categories potentially playing an important role in SNAI1 activity and the EMT process, statistically enriched categories obtained by a GoMiner analysis had slightly lower p-values with the MAIA dataset than with the GP dataset. Due to a larger number of significant DE genes, MAIA ensures a net increase in enriched GO categories. This could be very helpful when looking for subtle contributions of some biological processes. More generally, this study showed that careful control of spot quality characteristics with advanced spot quality evaluation can significantly increase the amount of meaningful data, yielding more confident and accurate biological conclusions.
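To illustrate how a GO enrichment p-value depends on the number of DE genes found in a category, the sketch below computes a one-sided hypergeometric tail probability. This is a common choice for enrichment testing and is used here only as a stand-in: the text does not describe how GoMiner computes its p-values, and the function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, category_genes, de_genes, de_in_category):
    """One-sided hypergeometric P(X >= de_in_category).

    total_genes     genes on the array with a GO annotation
    category_genes  genes annotated to the GO category of interest
    de_genes        differentially expressed genes overall
    de_in_category  DE genes that fall in the category
    """
    max_overlap = min(category_genes, de_genes)
    denominator = comb(total_genes, de_genes)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(category_genes, k) * comb(total_genes - category_genes, de_genes - k)
        for k in range(de_in_category, max_overlap + 1)
    )
    return tail / denominator
```

With the array and category sizes fixed, the p-value changes with both the total number of DE genes and the number falling in the category, so a filtering strategy that alters the DE gene list can shift the enrichment results.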
This work was supported by the Luxembourg National Science Foundation, Project BIOSAN FNR/01/04/09, and the Centre National de la Recherche Scientifique (CNRS), France. The authors thank Dr. André Mehlen and Mr. François Bernardin for their assistance in preparing microarrays. We thank the Institute for Genomic Research for providing the A. thaliana control spiking cRNA vector set.
- Hoheisel JD: Microarray technology: beyond transcript profiling and genotype analysis. Nat Rev Genet. 2006, 7 (3): 200-210. 10.1038/nrg1809.
- Muller J, Mehlen A, Vetter G, Yatskou M, Muller A, Chalmel F, Poch O, Friederich E, Vallar L: Design and evaluation of Actichip, a thematic microarray for the study of the actin cytoskeleton. BMC Genomics. 2007, 8: 294. 10.1186/1471-2164-8-294.
- Pelizzola M, Pavelka N, Foti M, Ricciardi-Castagnoli P: AMDA: an R package for the automated microarray data analysis. BMC Bioinformatics. 2006, 7: 335. 10.1186/1471-2105-7-335.
- Demeter J, Beauheim C, Gollub J, Hernandez-Boussard T, Jin H, Maier D, Matese JC, Nitzberg M, Wymore F, Zachariah ZK: The Stanford Microarray Database: implementation of new analysis tools and open source release of software. Nucleic Acids Res. 2007, 35 (Database issue): D766-770. 10.1093/nar/gkl1019.
- Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, et al.: The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol. 2006, 24 (9): 1151-1161. 10.1038/nbt1239.
- Novikov E, Barillot E: An algorithm for automatic evaluation of the spot quality in two-color DNA microarray experiments. BMC Bioinformatics. 2005, 6: 293. 10.1186/1471-2105-6-293.
- Raffelsberger W, Dembele D, Neubauer MG, Gottardis MM, Gronemeyer H: Quality indicators increase the reliability of microarray data. Genomics. 2002, 80 (4): 385-394. 10.1006/geno.2002.6848.
- Tran PH, Peiffer DA, Shin Y, Meek LM, Brody JP, Cho KW: Microarray optimizations: increasing spot accuracy and automated identification of true microarray signals. Nucleic Acids Res. 2002, 30 (12): e54. 10.1093/nar/gnf053.
- Wang D, Zhang CH, Soares MB, Huang J: Systematic approaches for incorporating control spots and data quality information to improve normalization of cDNA microarray data. J Biopharm Stat. 2007, 17 (3): 415-431. 10.1080/10543400701199544.
- Johansson P, Hakkinen J: Improving missing value imputation of microarray data by using spot quality weights. BMC Bioinformatics. 2006, 7: 306. 10.1186/1471-2105-7-306.
- Sauer U, Preininger C, Hany-Schmatzberger R: Quick and simple: quality control of microarray data. Bioinformatics. 2005, 21 (8): 1572-1578. 10.1093/bioinformatics/bti238.
- White AM, Daly DS, Willse AR, Protic M, Chandler DP: Automated Microarray Image Analysis Toolbox for MATLAB. Bioinformatics. 2005, 21 (17): 3578-3579. 10.1093/bioinformatics/bti576.
- Wang X, Ghosh S, Guo SW: Quantitative quality control in microarray image processing and data acquisition. Nucleic Acids Res. 2001, 29 (15): E75. 10.1093/nar/29.15.e75.
- Bylesjo M, Sjodin A, Eriksson D, Antti H, Moritz T, Jansson S, Trygg J: MASQOT-GUI: spot quality assessment for the two-channel microarray platform. Bioinformatics. 2006, 22 (20): 2554-2555. 10.1093/bioinformatics/btl434.
- Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, et al.: TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003, 34 (2): 374-378.
- Novikov E, Barillot E: Software package for automatic microarray image analysis (MAIA). Bioinformatics. 2007, 23 (5): 639-640. 10.1093/bioinformatics/btl644.
- Verdnik D: Guide to microarray analysis. Application Note. 2004, Axon Instruments Inc.
- Wang HY, Malek RL, Kwitek AE, Greene AS, Luu TV, Behbahani B, Frank B, Quackenbush J, Lee NH: Assessing unmodified 70-mer oligonucleotide probe performance on glass-slide microarrays. Genome Biol. 2003, 4 (1): R5. 10.1186/gb-2003-4-1-r5.
- Jia H, Lu L, Hng SC, Li J: Simulation study of ratio calculation formulae of two-colour cDNA microarray data. Appl Bioinformatics. 2006, 5 (4): 255-266. 10.2165/00822942-200605040-00008.
- Novikov E, Barillot E: A robust algorithm for ratio estimation in two-color microarray experiments. J Bioinform Comput Biol. 2005, 3 (6): 1411-1428. 10.1142/S0219720005001624.
- Cano A, Perez-Moreno MA, Rodrigo I, Locascio A, Blanco MJ, del Barrio MG, Portillo F, Nieto MA: The transcription factor snail controls epithelial-mesenchymal transitions by repressing E-cadherin expression. Nat Cell Biol. 2000, 2 (2): 76-83. 10.1038/35000025.
- Peinado H, Olmeda D, Cano A: Snail, Zeb and bHLH factors in tumour progression: an alliance against the epithelial phenotype?. Nat Rev Cancer. 2007, 7 (6): 415-428. 10.1038/nrc2131.
- Roepman P, Wessels LF, Kettelarij N, Kemmeren P, Miles AJ, Lijnzaad P, Tilanus MG, Koole R, Hordijk GJ, Vliet van der PC, et al.: An expression profile for diagnosis of lymph node metastases from primary head and neck squamous cell carcinomas. Nat Genet. 2005, 37 (2): 182-186. 10.1038/ng1502.
- Chu G, Narasimhan B, Tibshirani R, Tusher V: Significance analysis of microarrays (sam) software. The Internet. 2003, [http://www-stat.stanford.edu/~tibs/SAM/]
- Zeeberg BR, Qin H, Narasimhan S, Sunshine M, Cao H, Kane DW, Reimers M, Stephens RM, Bryant D, Burt SK, et al.: High-Throughput GoMiner, an 'industrial-strength' integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of Common Variable Immune Deficiency (CVID). BMC Bioinformatics. 2005, 6: 168. 10.1186/1471-2105-6-168.
- Vega S, Morales AV, Ocana OH, Valdes F, Fabregat I, Nieto MA: Snail blocks the cell cycle and confers resistance to cell death. Genes Dev. 2004, 18 (10): 1131-1143. 10.1101/gad.294104.
- Yook JI, Li XY, Ota I, Fearon ER, Weiss SJ: Wnt-dependent regulation of the E-cadherin repressor snail. J Biol Chem. 2005, 280 (12): 11740-11748. 10.1074/jbc.M413878200.
- Palmer HG, Larriba MJ, Garcia JM, Ordonez-Moran P, Pena C, Peiro S, Puig I, Rodriguez R, de la Fuente R, Bernad A, et al.: The transcription factor SNAIL represses vitamin D receptor expression and responsiveness in human colon cancer. Nat Med. 2004, 10 (9): 917-919. 10.1038/nm1095.
- De Craene B, Gilbert B, Stove C, Bruyneel E, van Roy F, Berx G: The transcription factor snail induces tumor cell invasion through modulation of the epithelial cell differentiation program. Cancer Res. 2005, 65 (14): 6237-6244. 10.1158/0008-5472.CAN-04-3545.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.