RShell is based on a client-server architecture. It requires a local or remote installation of the R-interpreter with the Rserve library [4] installed. The Rserve library turns the R-interpreter into a server, which enables other applications to communicate with the R-interpreter by means of a socket connection. From here on, the R-interpreter with the Rserve library installed will be denoted as the R server. RShell uses the Java library named REngine to establish and maintain connection between Taverna and R server. To execute R scripts, the RShell processor sends the script and the input data via the RShell session manager to the R server, which delegates the script to the R-interpreter and sends the results back to RShell (Figure 1). This new version is fully compatible up to and including R 2.9.
Configuring the RShell processor
The RShell processor is highly configurable: the user can define the script to be executed, the input ports to feed the relevant data, and the output ports to extract the created data. The user can configure the RShell processor by invocating the RShell processor in Taverna's "advanced model explorer". This will show a configuration dialog containing four tabs for the script, the input ports, the output ports and the connection settings.
Syntax highlighting is available in the scripting tab, to help the user write correct R code. The keywords are highlighted in blue; the inputs and outputs of the RShell are highlighted in pink.
RShell supports multiple input and output ports. Each port has a name, corresponding to the variable name it represents, as well as a type, corresponding to the type of data it holds. RShell uses the data type to determine how to exchange data with Taverna. RShell supports booleans, numeric values (both integers and floating point numbers), strings, and, vector of numerics and strings. Inputs and outputs of these data types can directly be used as variables in the R script. RShell provides the text-file data type to handle large data. Ports of this type can be read and written as normal R tables. The PNG and, since version 1.2, the PDF data type are available for graphical output. PNG can be used bitmap graphics; PDF for vector graphics. PNG and PDF outputs are handled as graphics devices in the standard R fashion. Input ports and output ports can be defined using the input ports and output ports tab.
RShell can execute any R script as long as the required libraries are installed in the R server. By default, the RShell processor is configured to use a locally installed R server serving at address localhost, port 6311. It can be configured to use a remote installation of R instead, using the connections tab. This can be useful when, the user is not able to install R, a central installation of R with a specific set of installed libraries is used, or R is installed in a grid environment. Multiple R servers can be accessed within the same workflow.
Persistent sessions
RShell supports persistent sessions to prevent unnecessary data transfer. The user can enable persistent sessions in the connection tab. When these are enabled, all input data, output data and script variables will be kept in memory of the R-interpreter until the whole workflow execution is finished. Multiple RShell processors in the same workflow are able to use the data provided by previously used processors, without requiring data links between these processors, as long as they use the same R server, Although persistent sessions can be very useful, they have some limitations: i) sessions can only be used among RShell processors, ii) RShell processors involved in a single persistent session have to access the same R server (but RShells devoted to a different session can access other R servers), and iii) it takes extra effort to keep a provenance log of data kept at the R interpreter. Taverna manages all data consumed and produced by processors. By using persistent sessions, Taverna is not aware of the data generated by one RShell processor and used by another RShell processor. If the user wants to record data generated by an RShell processor, he/she can do that by defining an output port for that variable.