Question: How to write the output in snakefile (snakemake) for bowtie2-build
2
gravatar for cedrickclarido
3 days ago by
cedrickclarido20 wrote:

Hello community,

I am using snakemake to make a pipeline. I want to add the bowtie2-build from bowtie2 to my current snakefile as follow:

rule bowtie2Build:
    input: "refgenome/infected_consensus.fasta"
    output: "output/reference"
    shell: "bowtie2-build {input} {output}"

So I should be expecting the following files: reference.1.bt2 reference.2.bt2 reference.3.bt2 reference.4.bt2 reference.rev.1.bt2 reference.rev.2.bt2

But it seems the problem lies in the output. How can I write the output?

ADD COMMENTlink modified 2 days ago by Eric Lim980 • written 3 days ago by cedrickclarido20
5
gravatar for IP
3 days ago by
IP430
Denmark/University of Copenagen
IP430 wrote:

You wrote this:

rule bowtie2Build:
    input: "refgenome/infected_consensus.fasta"
    output: "output/reference"
    shell: "bowtie2-build {input} {output}"

What snakemake is going to do is to check if the file output ( in this case "output/reference" ) exist after executing the rule. Which doesn't since it is the basename for bowtie

What you can do is to pass the basename for the index as a parameter to snakemake function. Something like the following:

 rule bowtie2Build:
    input: "refgenome/infected_consensus.fasta"
    params:
        basename="output/reference"
   output:
        output1="output/reference.1.bt2",
        output2="output/reference.2.bt2",
        output3="output/reference.3.bt2",
        output4="output/reference.4.bt2",
        outputrev1="output/reference.rev1.bt2",
        outputrev2="output/reference.rev2.bt2"
    shell: "bowtie2-build {input} {params.basename}"
ADD COMMENTlink modified 2 days ago by Devon Ryan85k • written 3 days ago by IP430

Hello,

Using the following code:

rule bowtie2Build:
    input: "/home/bnextgen/refgenome/infected_consensus.fasta"
    params:
        basename: "/home/s1104230/output/reference"
    output:
        output1="output/reference.1.bt2"
        output2="output/reference.2.bt2"
        output3="output/reference.3.bt2"
        output4="output/reference.4.bt2"
        outputrev1="output/reference.rev1.bt2"
        outputrev2="output/reference.rev2.bt2"
    shell: "bowtie2-build {input} {params.basename}"

I get :

SyntaxError in line 31 of /home/s1104230/scripts/Snakefile:
Unexpected keyword output1 in rule definition (Snakefile, line 31)
ADD REPLYlink modified 2 days ago • written 3 days ago by cedrickclarido20
1

Hello,

after each defined output there must be a ,.

rule bowtie2Build:
    input: "/home/bnextgen/refgenome/infected_consensus.fasta"
    params:
        basename: "/home/s1104230/output/reference"
    output:
        output1="output/reference.1.bt2",
        output2="output/reference.2.bt2",
        output3="output/reference.3.bt2",
        output4="output/reference.4.bt2",
        outputrev1="output/reference.rev1.bt2",
        outputrev2="output/reference.rev2.bt2"
    shell: "bowtie2-build {input} {params.basename}"

fin swimmer

ADD REPLYlink written 2 days ago by finswimmer6.0k

I've updated this in the original (fixed the colon after params too). Good catch.

ADD REPLYlink written 2 days ago by Devon Ryan85k

Hello So I tried again the suggested code but it seems that the error occurs at the rev. files:

MissingOutputException in line 26 of /home/s1104230/scripts/Snakefile:
Missing files after 5 seconds:
/home/s1104230/mapping2/reference.rev2.bt2
/home/s1104230/mapping2/reference.rev1.bt2
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
ADD REPLYlink written 23 hours ago by cedrickclarido20

See if those files show up eventually. We usually use --latency-wait 300 do get around this.

ADD REPLYlink written 32 minutes ago by Devon Ryan85k

thanks for the corrections :)

ADD REPLYlink written 22 hours ago by IP430
0
gravatar for dariober
2 days ago by
dariober9.5k
Glasgow - UK
dariober9.5k wrote:

IP's answer is what I would do. An alternative is to use the touch function to create a marker file that signals that a task has successfully completed:

rule bowtie2Build:
    input: "/home/bnextgen/refgenome/infected_consensus.fasta"
    params:
        basename: "/home/s1104230/output/reference"
    output:
        touch('index.done'),
    shell: "bowtie2-build {input} {params.basename}"

Then, rules that depend on the bowtie index will take in input the file index.done.

ADD COMMENTlink written 2 days ago by dariober9.5k

Hello, I tried your suggestion But I get the following error:

Error in rule bowtie2Build: jobid: 0 output: index.done

RuleException:
CalledProcessError in line 49 of /home/s1104230/scripts/Snakefile:
Command ' set -euo pipefail;  bowtie2-build  /home/s1104230/mapping2/reference ' returned non-zero exit status 1
  File "/home/s1104230/scripts/Snakefile", line 49, in __rule_bowtie2Build
  File "/usr/lib/python3.5/concurrent/futures/thread.py", line 55, in run
ADD REPLYlink written 23 hours ago by cedrickclarido20

The shell command is executed but it is throwing an error. In fact, bowtie2-build should have two positional arguments but there's only one:

bowtie2-build  /home/s1104230/mapping2/reference

It should be something like:

bowtie2-build /home/bnextgen/refgenome/infected_consensus.fasta /home/s1104230/mapping2/reference
ADD REPLYlink written 27 minutes ago by dariober9.5k
0
gravatar for Eric Lim
2 days ago by
Eric Lim980
Boston
Eric Lim980 wrote:

IP's solution is best as expected outputs are listed explicitly which would take advantage of timestamp.

Since v5.2.0, you can also accomplish this by using directory, which shorten the code by quite a bit while maintaining timestamp, but snakemake won't check if all expected outputs exist.

rule bowtie_build:
  input: 'chrM.fa'
  output: directory('reference/bowtie/')
  shell: 'bowtie2-build {input} {output}'

In this particular example, you'd need to ls -a to see the outputs.

https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#directories-as-outputs

directory has been super helpful in eliminating a bunch of if-else on rules when we don't know what outputs to expect at code time.

ADD COMMENTlink written 2 days ago by Eric Lim980
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1588 users visited in the last hour