Question: Seeking help with snakemake
Ming wrote:

Dear All,

I am trying to run BBMap with snakemake, and I am pretty new to this.

# rule all: Specifies the files that you would like to create during your snakemake workflow
import os
import snakemake.io
import glob

(SAMPLES,READS,) = glob_wildcards("/home/tanshiming/Downloads/{sample}_{read}_001.fastq.gz")
READS=["R1","R2"]

rule all:
  input: expand("/home/tanshiming/Downloads/{sample}_{read}_001.fastq.gz",sample=SAMPLES, read=READS)

rule clumpify:
  input:
    r1="/home/tanshiming/Downloads/{sample}_R1_001.fastq.gz",
    r2="/home/tanshiming/Downloads/{sample}_R2_001.fastq.gz"

  output:
      o1="/home/tanshiming/Downloads/Clumpify/{sample}_R1.fastq.gz",
      o2="/home/tanshiming/Downloads/Clumpify/{sample}_R2.fastq.gz"

  shell:
    "clumpify.sh -Xmx50g in1={input.r1} in2=${input.r2} out1=Clumpify/{output.o1} out2=Clumpify/${output.o2} reorder ziplevel=9 dedupe=t optical=t"

When I try to run snakemake, I get the following error:

(bbmap) tanshiming@S620100019205:~/Scripts$ snakemake -n
SyntaxError in line 15 of /home/tanshiming/Scripts/Snakefile:
invalid syntax

This is the code that I would like to run:

Remove duplicates

for x in *_R1_001.fastq.gz
    do clumpify.sh -Xmx250g in1=$x in2=${x%_R1_001*}_R2_001.fastq.gz out1=Clumpify/$x out2=Clumpify/${x%_R1_001*}_R2_001.fastq.gz reorder ziplevel=9 dedupe=t optical=t
done
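
For readers less familiar with shell parameter expansion, the pairing logic of this loop can be sketched in plain Python. This is only an illustration: the function name `mate_paths` and the file name `S1_R1_001.fastq.gz` are made up, and the sketch assumes every name contains `_R1_001`.

```python
def mate_paths(r1_name):
    """Derive the R2 input name and both output paths from an R1 file name."""
    # ${x%_R1_001*} drops the shortest trailing match of "_R1_001*",
    # i.e. everything from the last "_R1_001" onwards.
    stem = r1_name[: r1_name.rfind("_R1_001")]
    r2_name = stem + "_R2_001.fastq.gz"
    return r2_name, "Clumpify/" + r1_name, "Clumpify/" + r2_name


print(mate_paths("S1_R1_001.fastq.gz"))
# ('S1_R2_001.fastq.gz', 'Clumpify/S1_R1_001.fastq.gz', 'Clumpify/S1_R2_001.fastq.gz')
```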

Appreciate any advice that I can get!

Thank you.

gb wrote:

You are missing commas after the input and output files.
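
To illustrate why the commas matter: the entries under input and output follow the same syntax as the arguments of a Python function call, so named entries have to be separated by commas. A minimal sketch in plain Python, where the `files` helper is hypothetical and only stands in for the way the entries are collected:

```python
def files(**named):
    """Hypothetical stand-in for an input/output section; returns the named files."""
    return named


def parses(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<snippet>", "eval")
        return True
    except SyntaxError:
        return False


with_commas = 'files(r1="sample_R1.fastq.gz", r2="sample_R2.fastq.gz")'
without_commas = 'files(r1="sample_R1.fastq.gz" r2="sample_R2.fastq.gz")'

print(parses(with_commas))     # True
print(parses(without_commas))  # False: same kind of "invalid syntax" error
```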


Thanks @gb, but I am getting the following error now:

snakemake -n
SyntaxError in line 22 of /home/tanshiming/Scripts/Snakefile:
invalid syntax

This is at the clumpify.sh line.

written 11 days ago by Ming

You are missing quotation marks ("") around the command.

https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html
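
As a rough illustration of what happens inside the quoted command: Snakemake fills in placeholders such as `{input.r1}` in a way similar to Python's `str.format`, so no shell-style `$` is needed in front of them. The sketch below is plain Python, not Snakemake itself; the `SimpleNamespace` objects are stand-ins for the rule's input and output, and the sample name `S1` is made up.

```python
from types import SimpleNamespace

# Stand-ins for the rule's input and output (assumption: attribute access
# by name is all that is needed for this illustration).
inp = SimpleNamespace(r1="Downloads/S1_R1_001.fastq.gz",
                      r2="Downloads/S1_R2_001.fastq.gz")
out = SimpleNamespace(o1="Downloads/Clumpify/S1_R1.fastq.gz")

# Placeholders only: each one is replaced by the full path.
good = "in1={input.r1} in2={input.r2} out1={output.o1}".format(input=inp, output=out)

# A stray "$" and an extra "Clumpify/" prefix both survive the formatting.
bad = "in2=${input.r2} out1=Clumpify/{output.o1}".format(input=inp, output=out)

print(good)
print(bad)
```

In the second string the `$` ends up in front of the path and the directory appears twice, because the output path already contains it.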

written 11 days ago by gb

Dear @gb,

When I try to run snakemake, this script does not seem to run:

(bbmap) tanshiming@S620100019205:~/Scripts$ snakemake
Building DAG of jobs...
Nothing to be done.
Complete log: /home/tanshiming/Scripts/.snakemake/log/2019-11-08T154720.993886.snakemake.log

But I can see that the job has not actually run.

modified 11 days ago • written 11 days ago by Ming

The input specification of rule all is exactly the files that you already have at the beginning. Therefore snakemake doesn't do anything: you already have what you need.

written 11 days ago by WouterDeCoster

Dear WouterDeCoster,

Does that mean I have to delete the rule all to run the script?

written 11 days ago by Ming

I believe the input of rule all needs to be the output of rule clumpify. Snakemake first checks which files it is asked to produce (rule all). Next, it checks how it can get those files. So if you put the output files of rule clumpify in rule all, there is a "connection".

Snakemake looks at the files listed in rule all and sees that they are not there yet. It then checks how it can get them and finds that executing rule clumpify produces the output it needs. So it will execute that rule first, before it can finish.
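
The idea can be sketched as a toy model in plain Python. This is not how Snakemake is implemented; the function `jobs_needed`, the single `{sample}` wildcard, and the file names are assumptions made for the illustration.

```python
import re


def jobs_needed(targets, existing, output_patterns):
    """Return the targets a rule has to create, in a toy model where the
    rule's outputs are patterns containing a single {sample} wildcard."""
    regexes = []
    for pattern in output_patterns:
        prefix, suffix = pattern.split("{sample}")
        regexes.append(re.escape(prefix) + "(.+)" + re.escape(suffix))

    todo = []
    for target in targets:
        if target in existing:
            continue  # the file is already there: nothing to do for it
        if any(re.fullmatch(rx, target) for rx in regexes):
            todo.append(target)  # the rule can create this file
        else:
            raise FileNotFoundError(f"no rule produces {target}")
    return todo


existing = {"Downloads/S1_R1_001.fastq.gz", "Downloads/S1_R2_001.fastq.gz"}
clumpify_outputs = ["Downloads/Clumpify/{sample}_R1.fastq.gz",
                    "Downloads/Clumpify/{sample}_R2.fastq.gz"]

# rule all asks for the raw reads: they exist, so nothing is scheduled.
print(jobs_needed(sorted(existing), existing, clumpify_outputs))

# rule all asks for the clumpified reads: the rule has to run.
wanted = ["Downloads/Clumpify/S1_R1.fastq.gz", "Downloads/Clumpify/S1_R2.fastq.gz"]
print(jobs_needed(wanted, existing, clumpify_outputs))
```

The first call returns an empty list, which corresponds to the "Nothing to be done." message; the second returns both requested files.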

modified 11 days ago • written 11 days ago by gb

Have you tried following the tutorial?

The input of rule all should be the file you aim to obtain out of this workflow. It should be the final output file.
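
As an illustration of how such a list of final files can be built, here is a plain-Python imitation of the idea behind Snakemake's `expand()`: fill a pattern with every combination of wildcard values. The helper `expand_targets` and the sample names are made up for this sketch.

```python
from itertools import product


def expand_targets(pattern, **wildcards):
    """Fill the pattern with every combination of the given wildcard values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]


SAMPLES = ["S1", "S2"]  # made-up sample names
READS = ["R1", "R2"]

targets = expand_targets("Clumpify/{sample}_{read}.fastq.gz",
                         sample=SAMPLES, read=READS)
print(targets)
# ['Clumpify/S1_R1.fastq.gz', 'Clumpify/S1_R2.fastq.gz',
#  'Clumpify/S2_R1.fastq.gz', 'Clumpify/S2_R2.fastq.gz']
```

If the sample list comes from `glob_wildcards`, it may contain each sample once per matching file, so something like `sorted(set(SAMPLES))` can be used to drop the repeats before expanding.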

written 11 days ago by WouterDeCoster

This worked for me!

    # rule all: Specifies the files that you would like to create during your snakemake workflow
    import os
    import snakemake.io
    import glob

    (SAMPLES,READS,) = glob_wildcards("/home/tanshiming/Downloads/{sample}_{read}_001.fastq.gz")
    READS=["R1","R2"]

    rule all:
      input:
        expand("/home/tanshiming/Downloads/Clumpify/{sample}_{read}.fastq.gz",sample=SAMPLES, read=READS)

    rule clumpify:
      input:
        r1="/home/tanshiming/Downloads/{sample}_R1_001.fastq.gz",
        r2="/home/tanshiming/Downloads/{sample}_R2_001.fastq.gz"

      output:
        o1="/home/tanshiming/Downloads/Clumpify/{sample}_R1.fastq.gz",
        o2="/home/tanshiming/Downloads/Clumpify/{sample}_R2.fastq.gz"

      shell:
        "clumpify.sh -Xmx50g in1={input.r1} in2={input.r2} out1={output.o1} out2={output.o2} reorder ziplevel=9 dedupe=t optical=t"

Thank you very much for your help!

modified 11 days ago • written 11 days ago by Ming

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

written 11 days ago by WouterDeCoster