snakemake workflow
4 months ago

Hello

I was trying to run the preprocessing pipeline below:

SRA,FRR = glob_wildcards("rawReads/{sra}_{frr}.fastq.gz")

rule all:

    input:
        expand("rawQC/{sra}_{frr}_fastqc.{extension}", sra=SRA, frr=FRR,extension=["zip","html"]),
        expand("multiqc_report.html"),
        expand("trimmedreads{sra}_fastq.html", sra=SRA),


rule rawFastqc:

    input:
        rawread="rawReads/{sra}_{frr}.fastq.gz",
    output:
        zip="rawQC/{sra}_{frr}_fastqc.zip",
        html="rawQC/{sra}_{frr}_fastqc.html",
    threads:
        1
    params:
        path="rawQC/",
    shell:
        """
        fastqc {input.rawread} --threads {threads} -o {params.path}
        """
rule multiqc:

   input:

        rawqc="rawQC",
   output:

       "multiqc_report.html"
   shell:

        """
        multiqc rawQC
        """

rule fastp:

     input:
         read1="rawReads/{sra}_1.fastq.gz",
         read2="rawReads/{sra}_2.fastq.gz",
     output:
         read1="trimmedreads/{sra}_1P.fastq.gz",
         read2="trimmedreads/{sra}_2P.fastq.gz",
         report_html= "trimmedreads{sra}_fastq.html",
     threads: 
        4
     shell:
         """
         fastp --thread {threads} -i {input.read1} -I {input.read2} -o {output.read1} -O {output.read2} -h {output.report_html}
         """

After running this pipeline I'm getting an error:

MissingOutputException in line 26 of /mnt/d/snakemake/f1.py:
Job Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
multiqc_report.html completed successfully, but some output files are missing. 3

I think this is due to missing output from rawFastqc, because when I run the multiqc rule separately I get outputs, but when I run everything together it throws this error.

So I'm confused about why it is not running. Should I run my rules sequentially?

I'd appreciate it if anybody could help me out.

Thanks

pipeline snakemake • 549 views
One thing I noticed: change

shell:
    "multiqc rawQC"

to

shell:
    "multiqc {input.rawqc}"


Thanks for the suggestion, but it's still showing the same error.

4 months ago
Shred ▴ 870

Multiple errors:

  • There is no need for expand in rule all for the multiqc output; the plain string "multiqc_report.html" is enough.
  • The multiqc rule has no real input: rawQC is only a directory, so Snakemake cannot tell that multiqc depends on the FastQC outputs and may run it at the wrong time. Use html=expand("rawQC/{sra}_{frr}_fastqc.html", sra=SRA, frr=FRR) as input and pass the folder as a param.
  • The multiqc rule's input has a trailing comma, but that is probably just a typo.

Remember that a rule's input and output arguments should be thought of only as what determines the order of workflow execution: inside the shell command you can use whatever you want.
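The second point above could look roughly like the sketch below. It assumes SRA and FRR come from the glob_wildcards call in the question; the param name qcdir is made up for illustration, and the rule relies on multiqc writing multiqc_report.html into the working directory, which is its default behaviour.

```snakemake
# Sketch only: SRA and FRR are the lists from glob_wildcards in the question.
rule multiqc:
    input:
        # Listing the FastQC reports makes Snakemake wait for every rawFastqc job.
        html=expand("rawQC/{sra}_{frr}_fastqc.html", sra=SRA, frr=FRR),
    output:
        "multiqc_report.html",
    params:
        # Hypothetical name; the folder is passed to the command but not tracked as a dependency.
        qcdir="rawQC",
    shell:
        """
        multiqc {params.qcdir}
        """
```

With the reports declared as input, the directory can stay in the shell command, because the input list alone decides when the rule is allowed to run.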


Thank you so much, now I understand.

4 months ago

This is a bit of a guess... The input to rule multiqc is just rawQC. Since rawQC is a directory, rule multiqc will be triggered even if rawQC is empty. In that case multiqc exits cleanly but produces no file, hence the error you see. I would be more explicit and give multiqc the actual files you want to combine. Something like:

rule multiqc:
    input:
        rawqc=expand("rawQC/{sra}_{frr}_fastqc.zip", sra=SRA, frr=FRR),
    output:
        "multiqc_report.html"
    shell:
        """
        multiqc {input.rawqc}
        """
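
To see which files the expand(...) call in the rule above asks for, here is a plain-Python sketch that mimics its default behaviour (every combination of the wildcard values). It does not import Snakemake; itertools.product stands in for expand, and the sample names and the helper fastqc_zips are hypothetical.

```python
from itertools import product

# Hypothetical values; in the real workflow they come from glob_wildcards.
SRA = ["sampleA", "sampleB"]
FRR = ["1", "2"]


def fastqc_zips(sras, frrs):
    """Build one FastQC zip path for every (sra, frr) combination."""
    pattern = "rawQC/{sra}_{frr}_fastqc.zip"
    return [pattern.format(sra=s, frr=f) for s, f in product(sras, frrs)]


targets = fastqc_zips(SRA, FRR)
for path in targets:
    print(path)
```

Each printed path is a file that rawFastqc has to produce before multiqc may start, which is exactly the dependency that was missing when the input was only the rawQC directory.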

Thank you so much, it worked, and I understand where I was going wrong.
