RankProdIt is a web interface developed in haXe [9] that calls Perl CGI scripts to upload the data file, generate the R [5, 6] commands and execute R on the server in slave mode. For both the Rank Products and Rank Sum analyses, all user-selected parameters are passed to the R package RankProd [7]. Note that RankProdIt retains the RankProd default of 100 permutations, which are used to estimate the probability of observing a given Rank Product or Rank Sum by chance, for both types of analysis.
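The role of the permutations can be illustrated with a minimal Python sketch. It is not the code of the RankProd package, and the function names, the ranking direction (rank 1 for the smallest value) and the pooled estimate of the chance probability are assumptions made for illustration only. Each replicate column is ranked, the Rank Product of a row is taken as the geometric mean of its ranks, and the columns are then shuffled independently to see how often a Rank Product at least as small arises by chance.

```python
import math
import random


def ranks(values):
    """Rank values in ascending order (1 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rank_products(columns):
    """Geometric mean of each row's ranks across the replicate columns."""
    per_column_ranks = [ranks(col) for col in columns]
    n_rows = len(columns[0])
    k = len(columns)
    return [
        math.exp(sum(math.log(r[row]) for r in per_column_ranks) / k)
        for row in range(n_rows)
    ]


def permutation_p_values(columns, n_perm=100, seed=0):
    """Estimate, per row, the chance of a Rank Product at least as small.

    Assumption: values are shuffled independently within each column, and the
    permuted Rank Products of all rows are pooled to form the null distribution.
    """
    rng = random.Random(seed)
    observed = rank_products(columns)
    n_rows = len(observed)
    counts = [0] * n_rows
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        permuted = rank_products(shuffled)
        for i, obs in enumerate(observed):
            counts[i] += sum(1 for p in permuted if p <= obs)
    return [c / (n_perm * n_rows) for c in counts]


# Three replicate columns of (hypothetical) ratio data for four rows.
example_columns = [
    [0.1, 1.0, 2.0, 3.0],
    [0.2, 1.5, 2.5, 3.5],
    [0.05, 1.2, 2.2, 3.2],
]
example_rp = rank_products(example_columns)
example_p = permutation_p_values(example_columns, n_perm=100)
```

In this toy example the first row is ranked first in every replicate, so it receives the smallest Rank Product and the smallest estimated chance probability. With only 100 permutations such estimates are coarse, which is worth bearing in mind when interpreting values close to a chosen cut-off.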
RankProdIt is a generic tool, able to accept any data set containing replicated samples for at least two conditions. Thus, whilst this manuscript documents RankProdIt for microarray data analysis, it can be applied to other high-throughput data sets such as next-generation sequencing, proteomic and metabolomic data. Input measurements can be either in the form of absolute levels, where row-element k has measurements in multiple columns for each of the conditions i and j, such as those obtained from single-colour microarray experiments, or in the form of ratios, where each column of row-element k is a ratio of the two conditions (i/j), as obtained from two-colour microarray experiments.
To process data using RankProdIt, a user submits a tab-delimited text file that contains a row-identifier column (typically holding gene or probe identifiers) and several columns containing data; missing data are represented by NA or NaN. The columns of the input file are not required to be in any particular order, and columns containing data that are not to be used in the analysis can also be included. A header row is optional, but if one is included there must be only one. An example input file is given in Additional file 1.
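The input format described above can be read with a short Python sketch such as the one below. It is an illustration rather than the checking performed by RankProdIt: the function names and the header heuristic (a first row with no numeric cells, followed by rows that contain numbers) are assumptions.

```python
import csv
import io
import math

MISSING = {"NA", "NaN"}  # missing-value markers named in the text


def parse_cell(text):
    """Return a float for numeric cells, NaN for missing markers, else the raw string."""
    text = text.strip()
    if text in MISSING:
        return math.nan
    try:
        return float(text)
    except ValueError:
        return text


def read_table(handle):
    """Read a tab-delimited table and return (header or None, parsed rows)."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        raise ValueError("input file is empty")
    width = len(rows[0])
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(f"row {number} has {len(row)} columns, expected {width}")
    parsed = [[parse_cell(cell) for cell in row] for row in rows]
    # Assumed heuristic: a header row contains text only, while later rows hold numbers.
    first_is_text = all(isinstance(cell, str) for cell in parsed[0])
    later_has_numbers = any(
        isinstance(cell, float) for row in parsed[1:] for cell in row
    )
    if first_is_text and later_has_numbers:
        return rows[0], parsed[1:]
    return None, parsed


# Usage with a small in-memory file (hypothetical identifiers and values).
sample_text = "gene\tc1_a\tc1_b\nA\t1.5\tNA\nB\tNaN\t2.0\n"
header, data = read_table(io.StringIO(sample_text))
```

Rows of unequal width are rejected here because a ragged table usually indicates a file that is not cleanly tab-delimited.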
Once the input file has been checked and uploaded, during which progress feedback is given continuously, a form containing a select box for each column in the file is produced; each select box denotes the classification of the column contents and how the column is to be handled in subsequent analysis. To aid the user, RankProdIt attempts to predict the contents of each column, and the initial selection of each select box reflects this prediction. The user can nevertheless redefine how each column in the input file is to be handled (see Figure 1 for an example form given the input file in Additional file 1). Each column is readily identifiable in the form through its column number (the order in which it appears in the input file) and associated information about that column (whether it contains text or numbers, and the first element in the column). A column can be selected to be one of: a gene (row) identifier; ignored; a condition 1 or condition 2 sample (for absolute-level data); or a condition1/condition2 or condition2/condition1 sample (for ratio-based data). For successful submission and correct execution of Rank Products or Rank Sum analysis a user must select only one column as a gene identifier and either:
or
If the correct selections are not made, an error message is given following submission. Note that whilst it is possible to perform Rank Products and/or Rank Sum analysis with as few as two biological replicates for each condition, it is recommended that a greater number of replicates be provided for greater confidence in data reliability.
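The per-column information shown in the form (column number, whether the column holds text or numbers, and its first element) can be derived along the lines of the following Python sketch. The way RankProdIt predicts column contents is not described here, so the guessing rule below (the first text column becomes the identifier, numeric columns become samples, any other text column is ignored) is purely a hypothetical heuristic, as are the function names.

```python
def is_number(text):
    """True for numeric cells and for the missing-value markers NA and NaN."""
    text = text.strip()
    if text in {"NA", "NaN"}:
        return True
    try:
        float(text)
        return True
    except ValueError:
        return False


def describe_columns(rows):
    """Summarise each column of a table given as a list of rows of strings (no header)."""
    summaries = []
    identifier_assigned = False
    for index in range(len(rows[0])):
        cells = [row[index] for row in rows]
        numeric = all(is_number(cell) for cell in cells)
        if numeric:
            guess = "sample"
        elif not identifier_assigned:
            # Hypothetical rule: the first text column is taken as the row identifier.
            guess = "identifier"
            identifier_assigned = True
        else:
            guess = "ignored"
        summaries.append(
            {
                "number": index + 1,  # position of the column in the input file
                "kind": "numbers" if numeric else "text",
                "first": cells[0],
                "guess": guess,
            }
        )
    return summaries


# Usage with two rows of hypothetical data.
table = [
    ["g1", "1.2", "NA", "note"],
    ["g2", "0.8", "1.1", "x"],
]
columns = describe_columns(table)
```

A sketch like this can only suggest a starting point; which numeric columns belong to which condition still has to be chosen by the user.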
The scale of the input data and the presence of a column header row are automatically selected by RankProdIt. Prior to submission the user can select whether to perform Rank Products or Rank Sum analysis; by default Rank Products analysis is selected.
Upon successful submission, the data selected by the user are imported into R and Rank Products or Rank Sum analysis is performed by the RankProd package [7]. Whilst the analysis is being conducted an indication of processing is given, alerting the user that the analysis has not finished. If the data and selections made by the user do not cause an error within the RankProd package [7], a link to the output file is provided so that the user can download the results.
An example of an output (results) file is given in Additional file 2, and a brief description of the columns within an output file is provided in Additional file 3. The tab-delimited text file output by RankProdIt can be opened with any spreadsheet software for data interpretation and/or further analysis (e.g. the enhanced distribution calculations for Rank Products, which can easily be calculated in Excel [10]).