Question: Seeking help with snakemake
7 months ago, Ming wrote:

Dear All,

I am trying to run BBMap with snakemake, and I am pretty new to this.

# rule all: Specifies the files that you would like to create during your snakemake workflow
import os
import snakemake.io
import glob

(SAMPLES,READS,) = glob_wildcards("/home/tanshiming/Downloads/{sample}_{read}_001.fastq.gz")
READS=["R1","R2"]

rule all:
  input: expand("/home/tanshiming/Downloads/{sample}_{read}_001.fastq.gz",sample=SAMPLES, read=READS)

rule clumpify:
  input:
    r1="/home/tanshiming/Downloads/{sample}_R1_001.fastq.gz",
    r2="/home/tanshiming/Downloads/{sample}_R2_001.fastq.gz"

  output:
      o1="/home/tanshiming/Downloads/Clumpify/{sample}_R1.fastq.gz",
      o2="/home/tanshiming/Downloads/Clumpify/{sample}_R2.fastq.gz"

  shell:
    "clumpify.sh -Xmx50g in1={input.r1} in2=${input.r2} out1=Clumpify/{output.o1} out2=Clumpify/${output.o2} reorder ziplevel=9 dedupe=t optical=t"

When I try to run snakemake, I get the following error:

(bbmap) tanshiming@S620100019205:~/Scripts$ snakemake -n
SyntaxError in line 15 of /home/tanshiming/Scripts/Snakefile:
invalid syntax

This is the code that I would like to run:

# Remove duplicates

for x in *_R1_001.fastq.gz
    do clumpify.sh -Xmx250g in1=$x in2=${x%_R1_001*}_R2_001.fastq.gz out1=Clumpify/$x out2=Clumpify/${x%_R1_001*}_R2_001.fastq.gz reorder ziplevel=9 dedupe=t optical=t
done

Appreciate any advice that I can get!

Thank you.

7 months ago, gb wrote:

You are missing commas between the input files and between the output files.

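One way to see why the commas matter: Snakemake translates a Snakefile into Python, and named entries such as `r1=...` and `r2=...` behave much like keyword arguments in a function call. The sketch below is plain Python only; `rule_input` and `parses` are hypothetical helpers for illustration, not part of Snakemake.

```python
def rule_input(**named_files):
    """Hypothetical stand-in for a rule's input section (not Snakemake's API)."""
    return named_files


def parses(source):
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<sketch>", "eval")
        return True
    except SyntaxError:
        return False


# Same two named files, once separated by a comma and once not.
with_comma = 'rule_input(r1="A_R1_001.fastq.gz", r2="A_R2_001.fastq.gz")'
without_comma = 'rule_input(r1="A_R1_001.fastq.gz" r2="A_R2_001.fastq.gz")'

print(parses(with_comma))
print(parses(without_comma))
```

Only the version with the comma is accepted by the Python parser, which is roughly the kind of "invalid syntax" message reported above.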

Thanks @gb, but I am getting the following error now:

snakemake -n
SyntaxError in line 22 of /home/tanshiming/Scripts/Snakefile:
invalid syntax

This is at the clumpify.sh line.

Reply written 7 months ago by Ming

You are missing quotation marks ("") around the shell command.

https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html

Reply written 7 months ago by gb

Dear @gb,

When I try to run snakemake, this script does not seem to run:

(bbmap) tanshiming@S620100019205:~/Scripts$ snakemake
Building DAG of jobs...
Nothing to be done.
Complete log: /home/tanshiming/Scripts/.snakemake/log/2019-11-08T154720.993886.snakemake.log

I can see that the job has not actually run.

Reply written 7 months ago by Ming (modified 7 months ago)

The input specification of rule all is exactly the files that you already have at the beginning. Therefore snakemake doesn't do anything: you already have what you need.

Reply written 7 months ago by WouterDeCoster
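The "Nothing to be done" message can be illustrated with a deliberately simplified model: a target only triggers work if it does not exist yet. This is a sketch, not Snakemake's actual logic (which, among other things, also looks at modification times); `missing_targets` and the file names are made up for the example.

```python
import os
import tempfile


def missing_targets(targets):
    """Return the requested files that do not exist yet."""
    return [path for path in targets if not os.path.exists(path)]


with tempfile.TemporaryDirectory() as workdir:
    # Pretend these are the raw reads that are already on disk.
    raw = [os.path.join(workdir, f"A_{read}_001.fastq.gz") for read in ("R1", "R2")]
    for path in raw:
        open(path, "w").close()

    # Targets as in the first Snakefile: the raw reads themselves.
    todo_raw = missing_targets(raw)

    # Targets that would trigger work: the Clumpify outputs.
    clumped = [os.path.join(workdir, "Clumpify", f"A_{read}.fastq.gz") for read in ("R1", "R2")]
    todo_clumped = missing_targets(clumped)

print(len(todo_raw))      # nothing is missing when the targets are the raw reads
print(len(todo_clumped))  # both Clumpify outputs still have to be created
```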

Dear WouterDeCoster,

Does that mean I have to delete the rule all to run the script?

Reply written 7 months ago by Ming

I believe the input of rule all needs to be the output of rule clumpify. Snakemake first checks which files it is asked to produce (the input of rule all). Next, it checks how it can get those files. So if you put the output files of rule clumpify in rule all, there is a "connection" between the two rules.

Snakemake checks the files listed in rule all and finds that they are not there yet. It then checks how it can create them and sees that executing rule clumpify yields the files it needs, so it executes that rule first before it can finish.

Reply written 7 months ago by gb (modified 7 months ago)
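The "connection" between a requested file and a rule can be sketched as pattern matching: the requested file name is matched against the rule's output pattern, and the wildcard values found that way are filled into the input patterns. The code below is a toy model under that assumption, with shortened relative paths; `pattern_to_regex` and `inputs_needed` are hypothetical names, and Snakemake's real resolution is more involved.

```python
import re

# Patterns modelled on rule clumpify (shortened, relative paths for illustration).
OUTPUT_PATTERN = "Clumpify/{sample}_R1.fastq.gz"
INPUT_PATTERNS = ["{sample}_R1_001.fastq.gz", "{sample}_R2_001.fastq.gz"]


def pattern_to_regex(pattern):
    """Turn '{name}' placeholders into named regex groups."""
    pieces = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, piece in enumerate(pieces):
        # Even positions are literal text, odd positions are wildcard names.
        regex += re.escape(piece) if index % 2 == 0 else f"(?P<{piece}>.+)"
    return re.compile(regex)


def inputs_needed(target):
    """Return the inputs the rule would need for `target`, or None."""
    match = pattern_to_regex(OUTPUT_PATTERN).fullmatch(target)
    if match is None:
        return None  # this rule cannot produce the target
    return [p.format(**match.groupdict()) for p in INPUT_PATTERNS]


print(inputs_needed("Clumpify/A_R1.fastq.gz"))
print(inputs_needed("A_R1_001.fastq.gz"))
```

A raw read such as `A_R1_001.fastq.gz` matches no output pattern, so in this model it can only be satisfied by a file that already exists.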

Have you tried following the tutorial?

The input of rule all should be the file you aim to obtain out of this workflow. It should be the final output file.

Reply written 7 months ago by WouterDeCoster

This worked for me!

    import os
    import snakemake.io
    import glob

    # Collect the sample names from the raw FASTQ files.
    (SAMPLES,READS,) = glob_wildcards("/home/tanshiming/Downloads/{sample}_{read}_001.fastq.gz")
    # Only the two mates are wanted, so override the READS values found above.
    READS=["R1","R2"]

    # rule all: specifies the files that you would like to create during your snakemake workflow
    rule all:
        input:
            expand("/home/tanshiming/Downloads/Clumpify/{sample}_{read}.fastq.gz",sample=SAMPLES, read=READS)

    # rule clumpify: removes duplicates from one pair of read files
    rule clumpify:
        input:
            r1="/home/tanshiming/Downloads/{sample}_R1_001.fastq.gz",
            r2="/home/tanshiming/Downloads/{sample}_R2_001.fastq.gz"
        output:
            o1="/home/tanshiming/Downloads/Clumpify/{sample}_R1.fastq.gz",
            o2="/home/tanshiming/Downloads/Clumpify/{sample}_R2.fastq.gz"
        shell:
            "clumpify.sh -Xmx50g in1={input.r1} in2={input.r2} out1={output.o1} out2={output.o2} reorder ziplevel=9 dedupe=t optical=t"

Thank you very much for your help!

Reply written 7 months ago by Ming (modified 7 months ago)
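Besides the new targets in rule all, the working shell line also differs from the first attempt: the `$` in front of `{input.r2}` and `{output.o2}` and the extra `Clumpify/` prefix are gone. Snakemake fills in the `{...}` placeholders before the command reaches the shell, so the effect can be approximated with Python's `str.format`. This is only an approximation of that substitution step, and the `/data/...` paths are hypothetical.

```python
from types import SimpleNamespace

# Hypothetical file names standing in for one sample's resolved paths.
files_in = SimpleNamespace(r1="/data/A_R1_001.fastq.gz", r2="/data/A_R2_001.fastq.gz")
files_out = SimpleNamespace(o1="/data/Clumpify/A_R1.fastq.gz", o2="/data/Clumpify/A_R2.fastq.gz")

# Placeholders as written in the first attempt and in the working version.
first_version = "in1={input.r1} in2=${input.r2} out1=Clumpify/{output.o1} out2=Clumpify/${output.o2}"
working_version = "in1={input.r1} in2={input.r2} out1={output.o1} out2={output.o2}"

first_cmd = first_version.format(input=files_in, output=files_out)
working_cmd = working_version.format(input=files_in, output=files_out)

print(first_cmd)
print(working_cmd)
```

In the first command the stray `$` and the doubled `Clumpify/` prefix end up inside the paths, so the command would not write the files that the rule declares as output.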

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

Reply written 7 months ago by WouterDeCoster