bowtie2 Mapping Using Pre-built Index in Nextflow
2.3 years ago
anelor ▴ 20

Greetings!

I am trying to run bowtie2 with Nextflow using a pre-built hg38 index.

My Nextflow script looks like this:

params.reads = "$baseDir/data/library.fq"
params.index = "$baseDir/data/index/GRCh38_noalt_as"
params.outdir = "results"

log.info """
         bowtie2 test
         ===================================
         index        : ${params.index}
         reads        : ${params.reads}
         outdir       : ${params.outdir}
         """
         .stripIndent()

process bowtie2 {
    publishDir params.outdir, mode:'copy'

    input:
    path reads from params.reads
    path index from params.index

    output:
    path 'mapping.sam' into bowtie2_ch

    script:
    """
    bowtie2 -x $index -U $reads -S mapping.sam
    """
}

When I run this I get the following error:

Error executing process > 'bowtie2'

Caused by: Process bowtie2 terminated with an error exit status (255)

Command executed:

bowtie2 -x GRCh38_noalt_as -U library.fq -S mapping.sam

Command exit status: 255

Command output: (empty)

Command error:
  (ERR): "GRCh38_noalt_as" does not exist or is not a Bowtie 2 index
  Exiting now ...

When I run the executed command directly in the terminal, the mapping works, so the index files do exist and their names are correct.

I tried several approaches, such as creating a channel for the index files before the process and using it as an input, but nothing has worked so far.

Any help would be highly appreciated! :)

nextflow bowtie2 • 2.0k views
2.3 years ago
ATpoint 89k

Two things:

First, you are still using Nextflow DSL1, which is deprecated. Switching to DSL2 is highly recommended.

Second, the problem you see is that bowtie2 needs the basename of the index files, but Nextflow cannot stage a basename, because it is only a filename prefix and not an actual file. What Nextflow can stage is folders or files. The trick is to stage the entire folder with the index files in it, and then use some bash to find the index files.

script:
"""
idx_base=\$(find ${index}/ -name '*.bt2' | awk -F \".\" '{print \$1 | \"sort -u\"}')
bowtie2 -x \${idx_base} -U $reads -S mapping.sam
"""

This simply searches the index folder for the .bt2 files and extracts their shared basename, which is what bowtie2 needs for -x. Some $ signs and quotes in the script section are escaped because they are bash symbols rather than Nextflow ones; this is necessary for proper execution. I would also recommend piping straight into samtools view (bowtie2 (...) | samtools view -o out.bam), as SAM files are uncompressed and just take up disk space.
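
For anyone who wants to see the prefix extraction in isolation, here is a minimal sketch in plain bash, outside Nextflow, so nothing is escaped. It assumes a mock folder of empty placeholder files created with mktemp (not a real index). It also swaps the awk split for a sed suffix strip, which is my own variation rather than the command above: it should behave the same for this index but also tolerates dots in the folder path.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Mock index folder with empty placeholder files (assumption: a real index
# would hold bowtie2's binary .bt2 files under the same naming scheme).
idx_dir=$(mktemp -d)
for suffix in 1 2 3 4 rev.1 rev.2; do
    touch "${idx_dir}/GRCh38_noalt_as.${suffix}.bt2"
done

# Strip the ".N.bt2" / ".rev.N.bt2" suffix from each file and keep the
# unique prefix; this is the value that would go after bowtie2 -x.
idx_base=$(find "${idx_dir}" -name '*.bt2' \
    | sed -E 's/(\.rev)?\.[0-9]+\.bt2$//' \
    | sort -u)

echo "${idx_base}"
```

Inside a Nextflow script block the bash-side $ signs would need a backslash again, as in the snippet above.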

Thank you so much for your detailed answer! I will definitely switch to DSL2 when I am more comfortable with DSL1.

I added the bash script and this time I am getting a different error:

Command error:
  find: GRCh38_noalt_as/: No such file or directory
  (ERR): "-S" does not exist or is not a Bowtie 2 index
  Exiting now ...

I am a little bit tired of fighting with this all day, so maybe I am missing something.

Edit: After coming back to it from a break, this actually worked! I just needed to change params.index from "$baseDir/data/index/GRCh38_noalt_as" to "$baseDir/data/index/".

Thanks a lot!
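
A note on the second error, offered as a guess rather than a diagnosis: it looks consistent with find failing and leaving idx_base empty, so that bowtie2 took a neighbouring option as the index name. A small guard can make that failure more obvious. The sketch below is plain bash; resolve_index_prefix is a hypothetical helper name, not part of bowtie2 or Nextflow, and it uses a sed suffix strip instead of the awk split.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the bowtie2 index prefix found in a folder,
# or fail with a message instead of returning an empty string.
resolve_index_prefix() {
    local dir=$1
    if [ ! -d "${dir}" ]; then
        echo "index folder not found: ${dir}" >&2
        return 1
    fi
    local prefix
    prefix=$(find "${dir}" -name '*.bt2' \
        | sed -E 's/(\.rev)?\.[0-9]+\.bt2$//' \
        | sort -u)
    if [ -z "${prefix}" ]; then
        echo "no .bt2 files in: ${dir}" >&2
        return 1
    fi
    echo "${prefix}"
}

# Demo on a folder that does not exist: the helper reports failure.
if resolve_index_prefix "no_such_folder" 2>/dev/null; then
    status="found"
else
    status="missing"
fi
echo "${status}"
```

With a check like this, a wrong params.index would stop the task with a message about the folder rather than a confusing bowtie2 error.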

Nextflow without question has its advantages, but it is neither easy nor simple, and sometimes a PITA. I definitely feel you. At some point you become comfortable enough and have some template workflows that you can recycle for daily use, but it is definitely a steep learning curve. At least it was for me.
