Question: Transform A Pipeline Of R Scripts Into A Web Application
20
7.3 years ago by
London, UK
Giovanni M Dall'Olio25k wrote:

I have a pipeline of scripts that together produce different plots to describe some properties of genes involved in the same pathway.

For the moment, I execute them locally through a Makefile. The only thing that I have to do is to fill a list of genes in a file, and then call the main function, and it will automatically generate some graphs and tables on them.

I wonder: how can I turn this pipeline into a web application? I already have the scripts; I just need a way to create a web page where a user can upload a list of files and then download the results.

I have some experience with Django and, earlier, Plone, but so much time has passed that I have forgotten how to use them. How would you implement it? By the way, is there any special framework for bioinformatics-related stuff? Or are there any special rules or standards that I should follow in order to integrate a web application with other bioinformatics-related services?

webservice web galaxy pipeline • 8.1k views
ADD COMMENTlink modified 2.7 years ago by YOT20 • written 7.3 years ago by Giovanni M Dall'Olio25k
19
7.3 years ago by
Boston, MA
Brad Chapman9.1k wrote:

Galaxy is excellent for this type of script integration and workflow development:

http://galaxy.psu.edu/

It's written in Python and easy to get running:

http://bitbucket.org/galaxy/galaxy-central/wiki/GetGalaxy

You add in your custom scripts with a simple XML based language:

http://bitbucket.org/galaxy/galaxy-central/wiki/AddToolTutorial

ADD COMMENTlink written 7.3 years ago by Brad Chapman9.1k

That's how I managed all of my various tools. Now every script becomes a Galaxy tool from the start. This makes it easier to join projects and share data. It also makes it easier to work with collaborators because I can just point them to a saved history on my own instance. I can also check the logs to see if they actually looked at it before complaining ;)

ADD REPLYlink written 7.3 years ago by Will4.4k

Thank you very much: in fact, I was looking for something in the style of Galaxy. Let's see what other options come up in this thread.

ADD REPLYlink written 7.3 years ago by Giovanni M Dall'Olio25k
9
7.3 years ago by
Manhattan, NY
Khader Shameer17k wrote:

This is very much possible. You need to write a wrapper script around the R scripts to run the pipeline and pass the results / plots to your HTML.

Read the gene names submitted from the web page on the server side using CGI

Write the names to a file in a temporary directory on the web server

Write your R commands to a file such as tempRscript.R

Run the command R --no-save < tempdir/tempRscript.R using `` (backticks) or system inside your server-side script

This will generate the results and the plot

Once these files are ready, you can print them to HTML using your server-side script.
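
The steps above could be sketched in Python roughly as follows. This is a minimal sketch, not the poster's actual code: the script name `pipeline.R`, the function names, and the convention that the R script takes the gene file and an output directory as command-line arguments are all assumptions. It calls `Rscript` with an argument list instead of a shell string, so the gene names are never parsed by a shell.

```python
import subprocess
import tempfile
from pathlib import Path


def write_gene_list(genes, workdir):
    """Write one gene name per line to genes.txt inside workdir."""
    path = Path(workdir) / "genes.txt"
    path.write_text("\n".join(genes) + "\n")
    return path


def build_r_command(r_script, gene_file, out_dir):
    """Build the argument list for Rscript (no shell involved)."""
    return ["Rscript", "--vanilla", str(r_script), str(gene_file), str(out_dir)]


def run_pipeline(genes, r_script="pipeline.R"):  # hypothetical script name
    """Run the R script in a fresh temp dir and return the files it produced."""
    workdir = Path(tempfile.mkdtemp(prefix="genes_"))
    gene_file = write_gene_list(genes, workdir)
    cmd = build_r_command(r_script, gene_file, workdir)
    # check=True raises CalledProcessError if R exits with an error status
    subprocess.run(cmd, check=True, capture_output=True, text=True, timeout=600)
    return sorted(p for p in workdir.iterdir() if p != gene_file)
```

The server-side script would then link to, or embed, the files returned by `run_pipeline` in the HTML it prints.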

You may also check RSPerl, Rcgi, RSPython, RPy or RSOAP; I am not sure whether they are still active projects. As Pierre suggested, Pise/Mobyle is another option. I tried to use Pise back in 2005, but the experience was not so smooth, and in the end I used `` and system commands to implement a web server based on C code.

ADD COMMENTlink modified 7.3 years ago • written 7.3 years ago by Khader Shameer17k
6

rpy2 is definitely alive (and kicking). I have made a few instant web applications using Python frameworks with it.

ADD REPLYlink written 7.3 years ago by Laurent Gautier800
3

Thanks for also mentioning RSPerl. I have made a patched version of RSPerl which is available here: http://www.cebitec.uni-bielefeld.de/groups/brf/software/wiki/HowToInstallRSPerl . That will also install with the most recent R version (>2.11). I would not recommend the 'official' version though.

ADD REPLYlink written 7.3 years ago by Michael Dondrup43k
1

@Khader: which is precisely a known security issue.

See point 3 at Q6: I'm developing custom CGI scripts. What unsafe practices should I avoid? http://www.w3.org/Security/Faq/wwwsf4.html

ADD REPLYlink written 7.3 years ago by Laurent Gautier800

Excellent details Michael. Thanks for sharing this.

ADD REPLYlink written 7.3 years ago by Khader Shameer17k

Thanks for the answer. However, I would like to avoid calling system commands through backticks, since that can cause very serious security problems when passing parameters to the script, and I am not expert enough in web security to know how to sanitize the inputs myself. Moreover, I would have to design the web interface myself, and I would prefer something in the style of Plone or another CMS, where the look of the site can be customized easily.

ADD REPLYlink written 7.3 years ago by Giovanni M Dall'Olio25k

AFAIK, using backticks is equivalent to using the system function of a language. If you are concerned about security issues, you may use the equivalent of the Perl system function (system("command arg1 arg2 arg3");) to run your R scripts.
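
What matters for the security concern is whether a shell parses the command line. As an illustration in Python (Perl's list form, system("command", "arg1", "arg2"), behaves analogously), the sketch below passes a hostile-looking argument to a child process as a list. The child is the Python interpreter itself, standing in for R, and the function name `echo_argument` is made up for this example.

```python
import subprocess
import sys


def echo_argument(arg):
    """Run a child process with an argument list and return what it received."""
    # A list (with the default shell=False) means no shell ever parses `arg`.
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, arg],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip("\n")


hostile = "BRCA1; echo pwned"  # would run a second command inside a shell string
print(echo_argument(hostile) == hostile)
```

The child receives the whole string as one literal argument, so the part after the semicolon is never executed as a command.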

ADD REPLYlink written 7.3 years ago by Khader Shameer17k

Agreed. That's a classic example of how badly backticks can go wrong. But there are always ways to get around it. You also need to take extra care when using sendmail. For example, you can have an email address checking function that looks for suspicious symbols like < or /.
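
The same checking idea can be applied to the gene list itself. Rather than hunting for suspicious symbols, a whitelist accepts only what a gene name is expected to look like and rejects everything else. A minimal Python sketch follows; the allowed character set, the 30-character limit and the 500-gene cap are assumptions to adapt, not rules from any standard.

```python
import re

# Assumption: gene symbols are short and use only letters, digits, '.', '_' and '-'
GENE_SYMBOL = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,29}")


def parse_gene_list(text, max_genes=500):
    """Return validated gene symbols from pasted text; raise ValueError on bad input."""
    genes = [line.strip() for line in text.splitlines() if line.strip()]
    if len(genes) > max_genes:
        raise ValueError(f"too many genes: {len(genes)} > {max_genes}")
    for gene in genes:
        # fullmatch: the whole name must fit the pattern, not just a prefix
        if not GENE_SYMBOL.fullmatch(gene):
            raise ValueError(f"invalid gene symbol: {gene!r}")
    return genes
```

Anything that fails validation is refused before it gets near a file name, an R command or a shell.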

ADD REPLYlink written 7.3 years ago by Khader Shameer17k
7
7.3 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum101k wrote:

See Pise (http://www.ncbi.nlm.nih.gov/pubmed/11222264):

... we have developed a Web interface generator for more than 150 molecular biology command-line driven programs(...). The generator uses XML as a high-level description language of the legacy software parameters. Its aim is to provide users with the equivalent of a basic Unix environment, with program combination, customization and basic scripting through macro registration

...and/or Mobyle https://projets.pasteur.fr/wiki/mobyle

ADD COMMENTlink written 7.3 years ago by Pierre Lindenbaum101k
1

It has been replaced by Mobyle as far as I know

ADD REPLYlink written 7.3 years ago by Pierre Lindenbaum101k

How does it deal with the input? For example, is it easy to hook up with a chemical editor, or something as simple as a text field for entering a sequence?

ADD REPLYlink written 7.3 years ago by Egon Willighagen5.1k

All I know is that the numerous interfaces (http://bioweb.pasteur.fr/intro-en.html) for the Pasteur Institute have been generated using PISE. Those interfaces have now moved to Mobyle.

ADD REPLYlink written 7.3 years ago by Pierre Lindenbaum101k

for example, using the internet archive, here is an old interface designed using PISE: http://web.archive.org/web/20030304132911/bioweb.pasteur.fr/seqanal/interfaces/msbar.html

ADD REPLYlink written 7.3 years ago by Pierre Lindenbaum101k

Thank you very much: however, the PISE publication is very old, dated 2001, and this is before a lot of very interesting web technologies were developed. Do you know if this tool is still under development and if it has been modernized with newer technologies?

ADD REPLYlink written 7.3 years ago by Giovanni M Dall'Olio25k
7
7.3 years ago by
Laurent Gautier800 wrote:

"Web application" covers a wide range of possibilities depending on the requirements. Your pipeline seems rather straightforward:

For the moment, I execute them locally through a Makefile. The only thing that I have to do is to fill a list of genes in a file, and then call the main function, and it will automatically generate some graphs and tables on them.

A minimal web framework of your choosing (maybe one that handles sessions), with a simple form to upload the list of genes, would already be OK. As you mention Django and Plone, I suppose that you are comfortable with Python. I have used bottle for prototyping, and got something up and running in no time.

If most of your code is in R + Python, rpy2 is definitely an option to consider. Setting up a minimal web application can be done in an afternoon. I have slides on that theme, presented at BOSC 2010.
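
To give an idea of how little is needed, here is a sketch of such a form as a plain WSGI application. It deliberately uses only the Python standard library so that it runs without installing anything; a micro-framework such as bottle would make it shorter. The `analyse` function is a hypothetical placeholder for the real pipeline call (via rpy2 or an external R process), and the form field name `genes` is an assumption.

```python
from html import escape
from urllib.parse import parse_qs

FORM = (
    "<form method='post'>"
    "<textarea name='genes' rows='10' cols='30'></textarea>"
    "<input type='submit' value='Run'>"
    "</form>"
)


def analyse(genes):
    """Placeholder for the real pipeline call."""
    return f"{len(genes)} genes received: " + ", ".join(genes)


def render_result(body):
    """Turn a URL-encoded form body into an HTML fragment with the result."""
    text = parse_qs(body).get("genes", [""])[0]
    genes = [g.strip() for g in text.splitlines() if g.strip()]
    # escape() keeps user-supplied text from being interpreted as HTML
    return "<p>" + escape(analyse(genes)) + "</p>"


def app(environ, start_response):
    """WSGI app: GET shows the form, POST shows the placeholder result."""
    if environ["REQUEST_METHOD"] == "POST":
        size = int(environ.get("CONTENT_LENGTH") or 0)
        page = render_result(environ["wsgi.input"].read(size).decode("utf-8"))
    else:
        page = FORM
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [page.encode("utf-8")]
```

To try it locally, serve `app` with `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()` and open http://localhost:8000/ in a browser.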

ADD COMMENTlink modified 7.3 years ago • written 7.3 years ago by Laurent Gautier800

Thank you very much. I know that setting up a web application is easy, but I am concerned about the security problems that can show up, and I also want it to be easily integrated with other bioinformatics tools and with specifications, if there are any. In any case, I will look at bottle, thanks!

ADD REPLYlink written 7.3 years ago by Giovanni M Dall'Olio25k

What do you mean by "security": robustness against piracy and other unwanted activities, or restricting access to content?

My advice would be: look at your real requirements, and pick the lowest-energy solution that answers them. The simpler your setup is, the easier it is to control the "security". World domination plans can always come later.

ADD REPLYlink written 7.3 years ago by Laurent Gautier800
6
7.3 years ago by
Sydney, Australia
Neilfws47k wrote:

You might want to look at RApache - "a project supporting web application development using the R statistical language and environment and the Apache web server".

It's possible to write web applications in pure R, using RApache and the brew package - see the documentation for examples.

It's also possible to integrate RApache code with other web frameworks. I wrote a basic introduction about communication between RApache and a Rails application.

ADD COMMENTlink written 7.3 years ago by Neilfws47k
2
7.3 years ago by
Cornel30 wrote:

I've used a job queuing system (gearman.org) for running all sorts of "intensive" bioinformatics tools for this project: dnasubway.org
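
The idea behind a job queue is that the web request only submits the job and returns at once, while a separate worker runs the slow analysis and records the outcome for a later status check. The sketch below illustrates that pattern with Python's standard `queue` and `threading` modules. It is not Gearman and does not use Gearman's API; the names `submit`, `worker` and `status` are made up for this example.

```python
import queue
import threading
import uuid

jobs = queue.Queue()
results = {}  # job id -> (state, value); a job with no entry is still pending
results_lock = threading.Lock()


def submit(func, *args):
    """Queue a job and return its id immediately."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, func, args))
    return job_id


def worker():
    """Run queued jobs one at a time, recording the result or the error."""
    while True:
        job_id, func, args = jobs.get()
        try:
            outcome = ("done", func(*args))
        except Exception as exc:
            outcome = ("failed", str(exc))
        with results_lock:
            results[job_id] = outcome
        jobs.task_done()


def status(job_id):
    """Return ('pending', None) until the worker has finished the job."""
    with results_lock:
        return results.get(job_id, ("pending", None))


# A daemon thread stops with the main program; a real deployment would use
# separate worker processes, which is what a system like Gearman provides.
threading.Thread(target=worker, daemon=True).start()
```

A web page would call `submit` when the form is posted, show the job id to the user, and poll `status` until the state is no longer "pending".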

ADD COMMENTlink written 7.3 years ago by Cornel30
0
2.7 years ago by
Canada
YOT20 wrote:

I have worked on a project like that. I had a pipeline in bash that called 4 Python functions and, at the end, an R script. At the very end, R created a tree of organisms and displayed a table with some more information.

I was asked to turn it into a user-friendly web app.

So the user only needs to upload the file he wants to analyse.

1- Use jQuery to call your PHP script that will retrieve the file.

2- After the upload is done, call your first function.

3- Do the same until your pipeline is finished.

4- You don't need RApache. You can call the R installed outside your server.

5- Set the R script to save the final result in a file.

6- Read this file using PHP, retrieve the information and send it to JavaScript.

Now you can do what you want. You can use d3, canvas or SVG to create what you want.

I used AJAX and returned JSON, which is ready to be used by d3.
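
The last two steps (read the file R wrote, return JSON) are shown here as a Python sketch instead of PHP, to stay in one language with the other examples in this thread. It assumes the R script writes a tab-separated table with a header row, for instance with write.table(..., sep="\t", quote=FALSE, row.names=FALSE). The function name `table_to_json` is made up.

```python
import csv
import json


def table_to_json(path):
    """Read a tab-separated file with a header row; return a JSON array of objects."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    # Every value is kept as a string; convert numeric columns on the client
    # or here, depending on what the plot needs.
    return json.dumps(rows)
```

The resulting string can be sent as the body of the AJAX response. An array of objects keyed by column name is the same shape d3's own CSV/TSV parsers produce, so it can be bound to a selection directly.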

Note: you may need to use putenv so that R, or the functions of your other languages, are recognized by PHP, and use shell_exec to call each function, as in this question: "Qiime - pynest, call from shell_exec() in PHP not working".

Note 2, on R: there are 2 options. You could call the R script and pass it the values that come from your pipeline process.

Or, as I did, set R to read the file that was created at the end of the pipeline. Same result, but fewer issues.

ADD COMMENTlink written 2.7 years ago by YOT20