Question: How to avoid hardcoding in R? How to automatize R scripts?
bioinfouser wrote:

Hello,

This is not directly related to any particular analysis. I have been analyzing RNA-seq data on my own for the past year, learning entirely from scattered sources, and I am now confident that I can do my own analyses. I always use RStudio to write and run my scripts. One thing I feel I am lacking, though, is the ability to automate an R script, so that I can simply run Rscript my_script.R in bash and get the results.

For every data set, I have to manually change things such as variable names, data table subsetting parameters, figure output names, or the working directory path. It is frustrating and demotivating that every time I have to make a copy of my old script, manually change things inside it, and then run it. Is this the case everywhere? I am sure it's not! Where I come from, there is nobody to help me or teach me effective R programming so that I can really automate things. I would really appreciate any tips and tricks, or online resources to follow, for writing automated scripts that run seamlessly.

modified 9 days ago • written 10 days ago by bioinfouser

In addition to the answer by Yean, you might consider looking into the optparse and argparse packages. These make handling command line arguments easier. I have a template R-script that uses optparse here: rscript-template gist
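
For illustration, here is a minimal sketch of what an optparse-based script might look like. This is not the template from the gist; the option names (--input, --outdir, --min_count), the output file name, and the layout of the count table are all assumptions made up for the example.

```r
# Requires the optparse package: install.packages("optparse")
library(optparse)

# Hypothetical options; adapt names and defaults to your own analysis.
option_list <- list(
  make_option(c("-i", "--input"), type = "character", default = NULL,
              help = "Tab-separated count table (genes in rows, samples in columns)"),
  make_option(c("-o", "--outdir"), type = "character", default = "results",
              help = "Output directory [default: %default]"),
  make_option(c("-m", "--min_count"), type = "integer", default = 10L,
              help = "Keep genes whose total count is at least this [default: %default]")
)
parser <- OptionParser(option_list = option_list)

# parse_args() reads commandArgs(trailingOnly = TRUE) unless 'args' is given.
opts <- parse_args(parser)

if (is.null(opts$input)) {
  # No input given: show the auto-generated help instead of failing later.
  print_help(parser)
} else {
  dir.create(opts$outdir, showWarnings = FALSE, recursive = TRUE)
  counts <- read.delim(opts$input, row.names = 1, check.names = FALSE)
  kept <- counts[rowSums(counts) >= opts$min_count, , drop = FALSE]
  # Output names are derived from the options, so nothing is hardcoded.
  write.table(kept, file.path(opts$outdir, "filtered_counts.tsv"),
              sep = "\t", quote = FALSE, col.names = NA)
}
```

It could then be run as, for example: Rscript my_script.R --input counts.tsv --outdir results/project1 --min_count 5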

modified 10 days ago • written 10 days ago by russhh

You can also add docopt to that list: https://github.com/docopt/docopt.R

written 10 days ago by igor
Yean wrote:

I don't know if my approach is the best one, but I usually just pass arguments to my script. Whenever I get new datasets in a different path, new variables, or whatever, I simply change the file path or variable via the arguments.

Something like

Rscript myscript.R arg1 arg2 arg3

See this post on how to pass arguments to an R script: https://www.r-bloggers.com/passing-arguments-to-an-r-script-from-command-lines/
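
As a rough sketch of how this can look with base R only, the script below reads its arguments with commandArgs(). The three positional arguments (input file, output prefix, optional minimum count), the function names, and the count-table layout are hypothetical choices for the example, not something prescribed above.

```r
# myscript.R: hypothetical example using only base R.

# Turn positional arguments into a named list:
#   <input_file> <output_prefix> [min_count]
parse_cli_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 2) {
    stop("Usage: Rscript myscript.R <input_file> <output_prefix> [min_count]",
         call. = FALSE)
  }
  list(
    input_file    = args[1],
    output_prefix = args[2],
    # Arguments always arrive as strings, so convert numbers explicitly.
    min_count     = if (length(args) >= 3) as.numeric(args[3]) else 10
  )
}

# The analysis itself only sees 'opts', never a hardcoded path or name.
main <- function(opts) {
  counts <- read.delim(opts$input_file, row.names = 1, check.names = FALSE)
  kept <- counts[rowSums(counts) >= opts$min_count, , drop = FALSE]
  write.table(kept, paste0(opts$output_prefix, "_filtered.tsv"),
              sep = "\t", quote = FALSE, col.names = NA)
  pdf(paste0(opts$output_prefix, "_library_sizes.pdf"))
  barplot(colSums(kept), las = 2, main = "Library sizes after filtering")
  dev.off()
  invisible(kept)
}

args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 0) {
  message("Usage: Rscript myscript.R <input_file> <output_prefix> [min_count]")
} else {
  main(parse_cli_args(args))
}
```

With this layout, a new data set only needs a new command line, e.g. Rscript myscript.R data/project2_counts.tsv results/project2 5, and the script itself stays untouched.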

modified 10 days ago • written 10 days ago by Yean

You can also use configuration files. Segregate the input variables into their own file, named something like project1.config.R, then simply read it from your script with source("project1.config.R"). You can also write config files in various formats, e.g. JSON or XML, and read them with the appropriate parsers. The advantage of config files is that you can save them with the output, so that you can remember how a particular output was generated and avoid questions like: did I use arg1=0.1 with arg2="B", or was it arg1=0.2 with arg2="A"?
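
A minimal sketch of the source()-based variant, assuming a config file that only assigns a few variables. The variable names (input_file, outdir, min_count) and the two helper functions are hypothetical; the demo writes the config to a temporary directory so that the example runs on its own.

```r
# Read a config file into its own environment, so that its variables
# cannot silently overwrite objects in the analysis script.
load_config <- function(config_file) {
  cfg <- new.env()
  source(config_file, local = cfg)
  as.list(cfg)
}

# Keep a copy of the config next to the results for later reference.
save_config_copy <- function(config_file, outdir) {
  dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
  file.copy(config_file, file.path(outdir, basename(config_file)),
            overwrite = TRUE)
}

# Demo: create a hypothetical project1.config.R, then load and archive it.
config_file <- file.path(tempdir(), "project1.config.R")
writeLines(c(
  'input_file <- "data/project1_counts.tsv"',
  'outdir     <- file.path(tempdir(), "project1_results")',
  'min_count  <- 10'
), config_file)

cfg <- load_config(config_file)
save_config_copy(config_file, cfg$outdir)
str(cfg)  # the analysis would now use cfg$input_file, cfg$min_count, ...
```

In a real project the config file would be written by hand, one per data set, and its path could itself be passed on the command line.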

written 10 days ago by Jean-Karim Heriche

Thank you very much! That's very helpful!

written 10 days ago by bioinfouser
Friederike wrote:

The first step could be to write functions. This will help to identify the steps that can easily be automated and find the variables that may be a bit trickier to standardize. It will also reduce the amount of code that you have to wade through every time you do need to change something as the functions can easily be kept in a separate .R file, which you then source() in your main document.

If you need a primer on functions in R, this seems like a useful starting point.
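
To make the idea concrete, here is a small sketch of how the split into a helper file and a main script might look. The file name functions.R, both function names, and the toy count table are made up for the example.

```r
# --- functions.R (hypothetical helper file) ---

# Drop genes (rows) whose total count across samples is below min_count.
filter_low_counts <- function(counts, min_count = 10) {
  counts[rowSums(counts) >= min_count, , drop = FALSE]
}

# Build output file names from parameters instead of typing them by hand.
output_path <- function(outdir, dataset, suffix) {
  file.path(outdir, paste0(dataset, "_", suffix))
}

# --- main script ---
# source("functions.R")  # use this line once the helpers live in their own file

# Toy data standing in for a real count table.
counts <- data.frame(sampleA = c(0, 25, 3),
                     sampleB = c(1, 40, 2),
                     row.names = c("gene1", "gene2", "gene3"))

kept <- filter_low_counts(counts, min_count = 10)
print(kept)                                        # only gene2 remains
output_path("results", "project1", "filtered.tsv") # "results/project1_filtered.tsv"
```

Once the steps are functions, the things that differ between data sets show up as function arguments, which are exactly the values you would later pass in from the command line or a config file.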

written 10 days ago by Friederike

Thank you! I will look into it more!

written 9 days ago by bioinfouser