FARO server: Meta-analysis of gene expression by matching gene expression signatures to a compendium of public gene expression data
© Nielsen et al; licensee BioMed Central Ltd. 2011
Received: 20 November 2010
Accepted: 11 June 2011
Published: 11 June 2011
Although systematic analysis of gene annotation is a powerful tool for interpreting gene expression data, it is sometimes blurred by incomplete gene annotation, missing expression responses of key genes, and secondary gene expression responses. These shortcomings may be partially circumvented by instead matching gene expression signatures to signatures of other experiments.
To facilitate this, we present the Functional Association by Response Overlap (FARO) server, which matches input signatures to a compendium of 242 gene expression signatures extracted from more than 1700 Arabidopsis microarray experiments.
We present a publicly available tool for robust characterization of Arabidopsis gene expression experiments, which can point to similar experimental factors in other experiments. The server is available at http://www.cbs.dtu.dk/services/faro/.
Gene expression studies often identify more differentially expressed genes than can readily be functionally analyzed in follow-up experiments. Fortunately, some of these genes are typically annotated, either directly or through sequence similarity to other annotated genes, which helps the scientist to interpret the observed transcriptional response. In many cases the transcripts can even be annotated with controlled vocabularies such as the Gene Ontology or the Kyoto Encyclopedia of Genes and Genomes, facilitating systematic annotation analysis. Numerous successful examples of this type of analysis are found in the literature. However, this type of analysis depends on a high coverage of annotated genes that respond transcriptionally to the stimulus. Alternatively, meta-analysis of gene expression data can identify experimental conditions that result in similar transcriptional responses. This type of analysis has been done in a series of organisms, for example in yeast, in human cell lines and in Arabidopsis thaliana. A key use of this type of analysis is in the characterization and matching of mutants, diseases and drugs.
Here we present a web-based implementation of the Functional Association by Response Overlap (FARO) approach, which allows a user-provided gene expression signature to be compared against a pre-compiled compendium. The approach matches transcriptional responses based on the identity and the response direction (over- or under-expressed) of the differentially expressed genes, ignoring the magnitude of the response. Previously, we demonstrated that this simple approach largely overcomes experimental biases and allows reliable comparison between experiments conducted under varying conditions in different laboratories, while remaining simple enough to allow human interpretation of the results. The approach gains most of its robustness from avoiding direct comparison between the measurements in different experiments and instead comparing the outcomes of contrasts made within each experimental design. Hence, the similarity measure between two experiments is the number of genes shared by their lists of differentially expressed genes. In addition, the congruence of the gene expression response direction adds important insight into the nature of the signature comparison.
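The overlap and congruence measures described above can be illustrated with a minimal sketch. This is not the server's Perl implementation; the function name `faro_overlap`, the encoding of response direction as +1 or -1, and the example identifiers are assumptions made for illustration only.

```python
def faro_overlap(query, reference):
    """Compare two signatures given as dicts mapping gene identifier -> +1 or -1.

    Returns the number of shared genes (the overlap) and how many of them
    respond in the same direction (congruent) or in opposite directions.
    The magnitude of the response is deliberately not used.
    """
    shared = set(query) & set(reference)
    congruent = sum(1 for gene in shared if query[gene] == reference[gene])
    return {
        "overlap": len(shared),
        "congruent": congruent,
        "dissimilar": len(shared) - congruent,
    }


# Hypothetical signatures: +1 = over-expressed, -1 = under-expressed.
query = {"AT1G01010": 1, "AT1G01020": -1, "AT1G01030": 1}
reference = {"AT1G01010": 1, "AT1G01020": 1, "AT1G01040": -1}
result = faro_overlap(query, reference)
```

In this example two genes are shared between the lists; one responds in the same direction in both signatures and one in opposite directions.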
Since the web server implementation of the FARO approach is based on the script prepared and tested in the original study, testing was restricted to the processing of data submitted by a user via the web site. The server was tested with data files containing mixed Affymetrix ATH1 probe identifiers and AGI locus identifiers, as well as unknown identifiers and empty lines. These tests showed that the implementation handles all of the mentioned cases correctly.
The FARO server allows the user to compare an expression signature against a compendium of signatures. The compendium consists of 242 experimental signatures, each defined by the top 1209 differentially expressed genes; 1209 is the median number of genes that are significant across the compendium at a significance level of 0.05. The experimental factors were extracted from more than 1700 public microarray experiments and represent various conditions and perturbations, which are described in detail on the server webpage. The server accepts a table containing at least 50 identifiers, given as either Affymetrix ATH1 probe set identifiers or AGI locus identifiers, and compares the query gene list to the FARO compendium at the probe set level. For full functionality the input table must contain two columns, holding the identifiers and the response direction, respectively; the response direction is indicated by a signed number (possibly the log fold change). Optionally, the response direction may be indicated by "+" or "-", or left out entirely. In the latter case the congruence analysis is omitted.
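The input conventions described above can be sketched as a small parser. This is an illustrative sketch only, not the server's code: the function names, the regular expressions used to recognize AGI locus and ATH1 probe set identifiers, and the whitespace-separated column layout are assumptions, and the mapping of AGI loci to probe sets is not shown.

```python
import re

# Assumed identifier patterns (not taken from the server):
# AGI loci such as AT1G01010, ATH1 probe sets such as 261585_at or 245000_s_at.
AGI_RE = re.compile(r"^AT[1-5CM]G\d{5}$")
PROBE_RE = re.compile(r"^\d+(_[a-z])?_at$")
MIN_IDENTIFIERS = 50  # minimum query size stated in the text


def parse_direction(token):
    """Turn '+', '-' or a non-zero signed number into +1 or -1."""
    if token == "+":
        return 1
    if token == "-":
        return -1
    value = float(token)  # raises ValueError for non-numeric tokens
    if value == 0:
        raise ValueError("a response of zero has no direction")
    return 1 if value > 0 else -1


def parse_signature(text):
    """Return (signature, unknown): signature maps identifier -> +1, -1 or None."""
    signature, unknown = {}, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        identifier = fields[0]
        if not (AGI_RE.match(identifier) or PROBE_RE.match(identifier)):
            unknown.append(identifier)  # report unrecognized identifiers
            continue
        # A missing second column means the direction is unknown (None).
        signature[identifier] = parse_direction(fields[1]) if len(fields) > 1 else None
    return signature, unknown


def is_large_enough(signature):
    """Check the minimum number of identifiers required for a query."""
    return len(signature) >= MIN_IDENTIFIERS


def supports_congruence(signature):
    """Congruence analysis needs a direction for every identifier."""
    return bool(signature) and all(d is not None for d in signature.values())
```

A query in which any direction is missing would still allow the overlap to be computed, but under these assumptions `supports_congruence` returns False and the congruence analysis would be skipped.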
The comparison returns a list of associated experimental factors that are filtered according to user specified thresholds. Two options are available for setting the threshold:
1. The Overlap percentage threshold returns the associated factors whose overlap with the query list equals or exceeds the indicated percentage. The percentage is expressed relative to the length of the query list.
2. The Rank threshold returns the r factors with the strongest overlap to the query list, where the rank threshold (r) is specified by the user.
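The two threshold options above can be sketched as follows. This is a hedged illustration rather than the server's implementation: the function names, the representation of results as a dict of overlap counts per factor, the tie-breaking by factor name, and the example factor names are all assumptions.

```python
def filter_by_overlap_percentage(overlaps, query_length, min_percent):
    """Keep factors whose overlap is at least min_percent of the query length.

    overlaps maps factor name -> number of genes shared with the query.
    """
    kept = [
        (factor, count)
        for factor, count in overlaps.items()
        if 100.0 * count / query_length >= min_percent
    ]
    # Strongest overlap first; ties are broken by factor name (an assumption).
    return sorted(kept, key=lambda item: (-item[1], item[0]))


def filter_by_rank(overlaps, r):
    """Keep the r factors with the strongest overlap to the query list."""
    ranked = sorted(overlaps.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:r]


# Hypothetical overlap counts for a query list of 100 genes.
overlaps = {"cold": 40, "drought": 25, "heat": 5}
by_percent = filter_by_overlap_percentage(overlaps, query_length=100, min_percent=25)
by_rank = filter_by_rank(overlaps, r=1)
```

With these example numbers, a 25% overlap threshold keeps "cold" and "drought", while a rank threshold of 1 keeps only "cold".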
Availability and Requirements
Project name: FARO server
Project home page: http://www.cbs.dtu.dk/services/faro/
Operating system(s): Platform independent
Programming language: Perl
Other requirements: None
License: GNU GPL.
Any restrictions to use by non-academics: none
This work is supported by a grant from The Danish Agricultural and Veterinary Research Council (Multistress, SJVF).
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics. 2000, 25 (1): 25-29. 10.1038/75556.
- Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M: The KEGG resource for deciphering the genome. Nucleic Acids Research. 2002, 32 (Database issue): D277-280.
- Rhee SY, Wood V, Dolinski K, Draghici S: Use and misuse of the gene ontology annotations. Nature Reviews Genetics. 2008, 9 (7): 509-515.
- Hughes TR, Marton MJ, Jones AR, Roberts CJ, Stoughton R, Armour CD, Bennett HA, Coffey E, Dai H, He YD, Kidd MJ, King AM, Meyer MR, Slade D, Lum PY, Stepaniants SB, Shoemaker DD, Gachotte D, Chakraburtty K, Simon J, Bard M, Friend SH: Functional discovery via a compendium of expression profiles. Cell. 2000, 102 (1): 109-126. 10.1016/S0092-8674(00)00015-5.
- Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander ES, Golub TR: The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science (New York, N.Y.). 2006, 313 (5795): 1929-1935. 10.1126/science.1132939.
- Nielsen HB, Mundy J, Willenbrock H: Functional Associations by Response Overlap (FARO), a functional genomics approach matching gene expression phenotypes. PLoS One. 2007, 2 (7): e676.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.