Render Rmarkdown in nextflow
6 months ago
ATpoint 55k

Can someone enlighten me, I cannot get my head around rendering Rmarkdown scripts via nextflow:

#! /usr/bin/env nextflow

nextflow.enable.dsl=2

process renderRMD {

input:
path(rmd)

output:
path("script.html")
path("out.txt")

script:
"""

## this works fine and gets emitted to the work dir
Rscript -e 'write.table(x=data.frame(A=1), file="out.txt")'

## this causes an error
Rscript -e 'rmarkdown::render("${rmd}", output_file="script.html")'

## and also this:
Rscript -e 'rmarkdown::render("${rmd}", output_file="$workDir/script.html")'

"""

}

workflow {
renderRMD( Channel.fromPath("${baseDir}/script.rmd") )
}


This always exits with error:

Caused by:
Missing output file(s) script.html expected by process renderRMD (1)


The out.txt file is properly emitted to the work directory, but the Rmarkdown html causes an error. The html is in fact created in the directory where script.rmd is located, but for a reason I do not understand it does not end up in the work dir, so nextflow "does not know about it" and raises the error about the missing file.

Let script.rmd just be:

---
title: "nf-render"
output:
  html_document: default
---

{r,eval=TRUE,echo=TRUE}

library(ggplot2)
ggplot(cars, aes(speed,dist)) + geom_point()




Any ideas?


Do you get the file when running bash .command.sh by hand in the work directory for this process? Any error message?


Ah yes, forgot to add that: the html is properly created, both when running the nextflow command and when running .command.sh itself (but with that error when running via nextflow). It only ends up in baseDir where the Rmd script sits, while the work cache folder where I would like the html to be is empty. The .command.sh runs without errors. I guess this is related to how the rendering engine outputs the html, which I do not understand.


The html is being properly created,

So why do you get "Missing output file(s)"?


That is the crux of the question :) The thing is that it is not created in the work cache dir, so nextflow does not "recognize" it as being created. If I add optional: true to the output: declaration it works fine, and removing the output declaration altogether works as well, but I still want this html report in the cache.

6 months ago

Add something like output_dir=getwd() to the render() call?


Thanks, this works out now as expected. Got a similar suggestion over at nf-core Slack, sorry I just missed your comment.

(...)
script:
"""
Rscript -e 'rmarkdown::render("${rmd}", output_file="script.html", output_dir = getwd())'
"""

This then emits the report to the work directory as intended.

Edit: I was told that getwd() might not be the best choice, as it is not platform-agnostic in how it resolves the file path: https://github.com/nf-core/rnaseq/pull/614
Will update the answer if I find a different solution, for now this works fine :)

4 months ago
Gregor Sturm ▴ 80

This is a known issue with symbolic links: https://github.com/rstudio/rmarkdown/issues/1508

You could either use stageInMode: 'copy', or manually copy the file to the work directory before executing the notebook.

script:
"""
cp -L ${rmd} notebook.Rmd
Rscript -e "rmarkdown::render('notebook.Rmd')"
"""


I do it that way too; it avoids the potential getwd() problem.
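The symlink behaviour behind the cp -L workaround can be seen without R or Nextflow. The following is only a minimal sketch: the directories base/ and work/ are hypothetical stand-ins for baseDir and a task work directory, and it assumes a readlink that supports -f (GNU coreutils or a recent macOS). It shows that a staged symlink resolves back to the source directory, which would be consistent with the html landing next to the original script.rmd, while a dereferenced copy resolves to the work directory.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for baseDir and a Nextflow task work directory.
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/base" "$tmp/work"
printf 'placeholder notebook\n' > "$tmp/base/script.rmd"

# Mimic symlink staging: the work dir only holds a link to the source file.
ln -s "$tmp/base/script.rmd" "$tmp/work/script.rmd"

# A tool that resolves the real path of its input ends up in base/, not work/.
linked_dir="$(dirname "$(readlink -f "$tmp/work/script.rmd")")"

# cp -L dereferences the link, so notebook.Rmd is a regular file inside work/.
cp -L "$tmp/work/script.rmd" "$tmp/work/notebook.Rmd"
copied_dir="$(dirname "$(readlink -f "$tmp/work/notebook.Rmd")")"

echo "symlink resolves to: $linked_dir"
echo "copy resolves to:    $copied_dir"
```

Running the script prints the two resolved directories, so the difference between the staged link and the copy is visible at a glance.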