Question

How can I build a pipeline in Nextflow to download SRA files?

0

21 months ago

pavelasquezv ▴ 50

I am trying to build a pipeline for RNA-seq data analysis in Nextflow, but I have not been able to get the first step working, which is to download the SRA files. Do you have any suggestions for the following error?

This is the code:

#!/usr/bin/env nextflow

srr_ch = Channel.from( lista.txt)
srr_ch.println()


process fastqdump {
    container 'quay.io/biocontainers/parallel-fastq-dump:0.6.5--py_0'

    input:
    each srr from srr_ch

    output:
    file("*.fastq") into fastq_ch

    """
    prefetch ${srr} && parallel-fastq-dump -t 8 -s ${srr}
    """
}

This is the error:

N E X T F L O W  ~  version 22.06.1-edge
Launching `null` [elated_hodgkin] DSL1 - revision: c897000175
No such variable: lista

 -- Check script 'script1.nf' at line: 3 or see '.nextflow.log' file for more details

nextflow • 1.6k views

ADD COMMENT • link 21 months ago by pavelasquezv ▴ 50

2

Nextflow people have already done this for you: https://nf-co.re/fetchngs

ADD REPLY • link 21 months ago by GenoMax 141k

0

Thanks a lot GenoMax!

ADD REPLY • link 21 months ago by pavelasquezv ▴ 50

score 1 · Answer 1 · 2022-07-13

1

21 months ago

dariober 14k

Granted I know nothing about nextflow (snakemake guy here!)... The error says:

No such variable: lista

Perhaps srr_ch = Channel.from( lista.txt) should have quotes like srr_ch = Channel.from('lista.txt')?

Also, is prefetch ${srr} actually needed? I think parallel-fastq-dump -t 8 -s ${srr} should suffice to download fastq files.
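To make the two suggestions above concrete, here is a minimal sketch in the same DSL1 syntax as the question. It rests on assumptions the question does not confirm: that lista.txt is a plain-text file with one SRA run accession per line, and that parallel-fastq-dump can fetch the run on its own without prefetch. Note that quoting alone, as in Channel.from('lista.txt'), should clear the "No such variable" error but would emit the literal string 'lista.txt' rather than the accessions inside the file, so the sketch reads the file with Channel.fromPath and splitText instead. It also chains view() rather than calling println() on the channel, because as far as I know a DSL1 channel cannot be consumed by both an operator and a process.

```groovy
#!/usr/bin/env nextflow

// Assumption: lista.txt holds one SRA run accession (e.g. an SRR ID) per line.
srr_ch = Channel
    .fromPath('lista.txt')   // quoted: a file path string, not a Groovy variable
    .splitText()             // emit each line of the file as a separate item
    .map { it.trim() }       // strip the trailing newline from each line
    .filter { it }           // drop blank lines
    .view()                  // print each accession and pass it downstream

process fastqdump {
    container 'quay.io/biocontainers/parallel-fastq-dump:0.6.5--py_0'

    input:
    val srr from srr_ch      // one task per accession

    output:
    file("*.fastq") into fastq_ch

    // Assumption: prefetch is dropped here, following the suggestion above.
    """
    parallel-fastq-dump -t 8 -s ${srr}
    """
}
```

If the download turns out to need prefetch after all, the original command line from the question can be put back in the script block without changing anything else.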

A comment on GenoMax's reply: I have mixed feelings about using wrappers just to execute a single command like parallel-fastq-dump. I think it makes things more complex for little benefit...?

ADD COMMENT • link 21 months ago by dariober 14k

1

Reference where Nextflow was recommended to the OP: "How to perform quality control on multiple folders using trimmomatic?" According to that post, thousands of SRA samples are involved, so I think the logic was that this download would become part of a larger pipeline.

ADD REPLY • link 21 months ago by GenoMax 141k

1

GenoMax - thanks for replying. To clarify my comment, I fully support the idea of using nextflow/snakemake even for simple pipelines. My doubt is whether the pipeline should execute "parallel-fastq-dump" directly, as in the OP's question, or a more sophisticated wrapper. In general, I lean towards a straightforward call, but I don't know the specifics of the OP's work.

ADD REPLY • link 21 months ago by dariober 14k

0

Thanks a lot GenoMax and Dariober. You guys are my inspiration! I will try this code. Thanks again! All the best!

ADD REPLY • link 21 months ago by pavelasquezv ▴ 50