snakemake workflow
4 months ago

Hello

I was trying to run this preprocessing pipeline:

SRA, FRR = glob_wildcards("rawReads/{sra}_{frr}.fastq.gz")

rule all:
    input:
        expand("rawQC/{sra}_{frr}_fastqc.{extension}", sra=SRA, frr=FRR, extension=["zip", "html"]),
        expand("multiqc_report.html"),

rule rawFastqc:
    input:
    output:
        zip="rawQC/{sra}_{frr}_fastqc.zip",
        html="rawQC/{sra}_{frr}_fastqc.html",
    params:
        path="rawQC/",
    shell:
        """
        """

rule multiqc:
    input:
        rawqc="rawQC",
    output:
        "multiqc_report.html"
    shell:
        """
        multiqc rawQC
        """

rule fastp:
    input:
    output:
    shell:
        """
        """


After running this pipeline I'm getting an error:

MissingOutputException in line 26 of /mnt/d/snakemake/f1.py:
Job Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
multiqc_report.html completed successfully, but some output files are missing. 3


I think this is due to missing output from rawFastqc: when I run the multiqc rule on its own I get output, but when I run everything together it throws this error.

So I'm confused about why it is not running. Should I run my rules sequentially?

I'd appreciate it if anybody could help me out.

Thanks

Change

shell:
    "multiqc rawQC"


to

shell:
    "multiqc {input.rawqc}"


This is one thing I noticed.

Thanks for the suggestion, but it's still showing the same error.

4 months ago
Shred ▴ 870

Multiple errors:

• No need for expand around the multiqc output in rule all; the plain string "multiqc_report.html" is enough.
• The multiqc rule has no input that ties it to the FastQC outputs, so Snakemake may try to run it before rawFastqc has produced anything. Use html=expand("rawQC/{sra}_{frr}_fastqc.html", sra=SRA, frr=FRR) as the input and pass the folder as a param.
• The multiqc input has a trailing comma, but I assume that is just a typo.

Remember that a rule's input and output arguments only determine when and in what order Snakemake runs it: inside the shell command you can use whatever you want.
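
To illustrate the point about inputs and outputs deciding the execution order, here is a toy model in plain Python. It is not Snakemake's actual scheduler, only a sketch of the idea that rules are linked through matching file names; the `dependencies` helper and the sample file names are made up for the example.

```python
# Toy model of how rules get linked through file names.
# NOT Snakemake's real scheduler; helper and file names are illustrative.

def dependencies(rules, target_rule):
    """Return the rules whose outputs appear among target_rule's inputs."""
    wanted = set(rules[target_rule]["input"])
    return sorted(
        name
        for name, rule in rules.items()
        if name != target_rule and wanted & set(rule["output"])
    )

# Hypothetical FastQC outputs for one paired-end sample.
fastqc_outputs = ["rawQC/s1_1_fastqc.zip", "rawQC/s1_2_fastqc.zip"]

# Layout from the question: multiqc lists only the directory as input.
original = {
    "rawFastqc": {"input": [], "output": fastqc_outputs},
    "multiqc": {"input": ["rawQC"], "output": ["multiqc_report.html"]},
}

# Suggested layout: multiqc lists the FastQC files themselves.
fixed = {
    "rawFastqc": {"input": [], "output": fastqc_outputs},
    "multiqc": {"input": fastqc_outputs, "output": ["multiqc_report.html"]},
}

deps_original = dependencies(original, "multiqc")
deps_fixed = dependencies(fixed, "multiqc")

print(deps_original)  # []
print(deps_fixed)     # ['rawFastqc']
```

In the first layout nothing says that multiqc has to wait for rawFastqc, because no rule declares "rawQC" as an output. In the second, the shared file names create the link.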

Thank you so much, now I understand.

4 months ago

This is a bit of a guess... The input to rule multiqc is just rawQC. Since rawQC is a directory, rule multiqc will be triggered even if rawQC is empty. In that case multiqc exits cleanly but produces no report, hence the error you see. I would be more explicit and give multiqc the actual files you want to combine. Something like:

rule multiqc:
    input:
        rawqc=expand("rawQC/{sra}_{frr}_fastqc.zip", sra=SRA, frr=FRR),
    output:
        "multiqc_report.html"
    shell:
        """
        multiqc {input.rawqc}
        """
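
If it helps to see what multiqc ends up receiving, here is a rough plain-Python imitation of the expand call above. It assumes expand's default behaviour of combining every value of every wildcard; `expand_like` is a stand-in written for this example (not the Snakemake function), and the sample names are invented.

```python
from itertools import product

def expand_like(pattern, **wildcards):
    """Fill the pattern with every combination of the given wildcard values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]

# Hypothetical values, standing in for what glob_wildcards found in rawReads/.
SRA = ["SRR0000001", "SRR0000002"]
FRR = ["1", "2"]

zips = expand_like("rawQC/{sra}_{frr}_fastqc.zip", sra=SRA, frr=FRR)

# Roughly what the shell command looks like once {input.rawqc} is filled in.
command = "multiqc " + " ".join(zips)

for path in zips:
    print(path)
print(command)
```

With two samples and two read files each, that gives four zip files. Snakemake has to produce all of them through rawFastqc before multiqc is allowed to start.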

Thank you so much, it worked and I understand where I was going wrong.