Rule all has wrong order
8 weeks ago
Lisan ▴ 10

Hello all,

I'm rather new to Snakemake and Python. I'm trying to build a pipeline, but rule all in the main script seems to run things in the wrong order, and I can't change that no matter what I do. Can anybody show me what I'm doing wrong?

Here is my first script:

configfile: "/home/PycharmProjects/Pipeline/config.yaml"

rule first:
    input:

rule prefetch:
    output:
        "prefetch_files/sra/{srr}.sra"
    params:
        "{srr} --max-size 250GB -O sra_files"
    log:
        "prefetch_files/sra/{srr}.log"
    message:
    shell:
        """
        /Tools/sra_toolkit/sratoolkit.3.0.0-ubuntu64/bin/prefetch {params} > {log} 2>&1 && touch {output}
        """


And within the same file is:

rule fastqdump:
    input:
        "prefetch_files/sra/{srr}.sra"
    output:
        touch("prefetch_files/done__{srr}_dump")
    params:
        args = "-S -O fastq_files/ -t fastq_files/ ",
        id_srr = "{srr}"
    log:
        "prefetch_files/{srr}.log"
    shell:
        """
        /Tools/sra_toolkit/sratoolkit.3.0.0-ubuntu64/bin/fasterq-dump {params.args} {params.id_srr} > {log} 2>&1
        """


If I run this first script on its own, nothing goes wrong and it produces all the files I need for the following script (I think; I could be wrong here, but it does produce files).

Then in a second script I try to run Trimmomatic:

configfile: "/PycharmProjects/Pipeline/config.yaml"

rule now:
    input:

rule trimmomatic:
    input:
        unused = "prefetch_files/done__{srr}_dump",
        raw=config['FileDir']+"/{srr}.fastq",

    output:

    params:
        jar=config["trimmomatic"]["jar"],
        phred=config["trimmomatic"]["phred"],
        minlen=config["trimmomatic"]["minlen"],
        trailing=config["trimmomatic"]["trailing"],
        slidwindow=config["trimmomatic"]["slidwindow"]

    log:
        "logs/trimmomatic/{srr}_trimmed.log"

    shell:
        "(java -jar {params.jar} SE {params.phred} {input.raw} {output} ILLUMINACLIP:
        {log} 2>&1"


And my main.smk is this:

configfile: "/PycharmProjects/Pipeline/config.yaml"

include: "download_sample.smk"
include: "trimming.smk"
include: "dagfile.smk"

rule all:
    input:
        expand("prefetch_files/done__{srr}_dump", srr=config['srr'])


And in case it's important, my config.yaml:

FileDir: "/PycharmProjects/Pipeline/Pipeline/workflow/Pre-processing/fastq_files"

srr:
  - SRR5327856
  - SRR5327984
  - SRR5327985

trimmomatic:
  jar: /Documents/Lisan/Tools/Trimmomatic-0.39/trimmomatic-0.39.jar
  phred: -phred33
  minlen: 45
  trailing: 3
  slidwindow: 4:15


I tried adding the output files from the prefetch step to the trimmomatic input, but this doesn't seem to help. Every time I run main.smk, it starts with the trimmomatic rule and fails because the files don't exist.

(base) Workstation:~/PycharmProjects/Pipeline/Pipeline/workflow/Pre-processing$ snakemake --snakefile main.smk -c4
Building DAG of jobs...
MissingInputException in rule trimmomatic in line 9 of
/PycharmProjects/Pipeline/Pipeline/workflow/Pre-processing/trimming.smk:
Missing input files for rule trimmomatic:
wildcards: srr=SRR5327856
affected files:
/PycharmProjects/Pipeline/Pipeline/workflow/Pre-processing/fastq_files/SRR5327856.fastq


I tried googling the error once I got stuck, but I didn't really find anything, which leads me to think it's probably something very simple that I'm not seeing. I don't have anybody around me who can help with Python or Snakemake, so I hope somebody here can. Thanks!

Rules Snakemake Python • 257 views

The error says "Missing input files for rule trimmomatic", so we start debugging from there. Your trimmomatic rule asks for input files matching config['FileDir']+"/{srr}.fastq", but none of your rules declares these files as outputs. Even though we know these FASTQ files are generated by fasterq-dump, Snakemake doesn't. So you need to declare those files explicitly as outputs of your fastqdump rule. Otherwise, Snakemake doesn't know how to build the DAG. Hope this helps.
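
As a rough illustration of that suggestion, the fastqdump rule from the question could declare the FASTQ file as an output, something like the sketch below. It rests on a few assumptions that are not confirmed in the thread: that fasterq-dump writes exactly one file named {srr}.fastq per accession (with -S or paired-end data the names may carry a suffix such as _1, so check what actually lands in fastq_files/), and that config["FileDir"] points to the same directory as the relative fastq_files/ path used originally. The params name outdir is made up for this example.

```
rule fastqdump:
    input:
        "prefetch_files/sra/{srr}.sra"
    output:
        # Declaring the FASTQ here is what lets Snakemake connect this rule
        # to trimmomatic. Assumption: the tool writes {srr}.fastq into FileDir.
        fastq = config["FileDir"] + "/{srr}.fastq",
        done = touch("prefetch_files/done__{srr}_dump")
    params:
        # Hypothetical name; uses the same path string as the trimmomatic input.
        outdir = config["FileDir"]
    log:
        "prefetch_files/{srr}.log"
    shell:
        """
        /Tools/sra_toolkit/sratoolkit.3.0.0-ubuntu64/bin/fasterq-dump -S -O {params.outdir} -t {params.outdir} {wildcards.srr} > {log} 2>&1
        """
```

The output path is built from config["FileDir"] rather than the relative fastq_files/ because Snakemake links rules by comparing file paths, so the string here should be the same one the trimmomatic rule asks for.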
