fastp snakemake
1
0
Entering edit mode
2.5 years ago

Hi all,

I tried to write a Snakemake pipeline for fastp, but I think something in my code is incorrect. I'm new to Snakemake and confused, so I'm posting my code below. Can you suggest where I'm going wrong?

This is my code:

import os 

SRA,FRR=glob_wildcards("rawReads/{sra}_{frr}.fastq.gz")

rule all:

    input:
        expand ("rawQC/{sra}_{frr}_fastqc.{extension}",sra=SRA,frr=FRR,extension=["gz","html"])

rule rawFastqc:


    input:
        rawread="rawReads/{sra}_{frr}.fastq.gz"
    output:
        gz="rawQC/{sra}_{frr}_fastq.gz",
        html="rawQC/{sra}_{frr}_fastqc.html"
    threads:
        1
    params:
        path="rawQC/"
    shell:
        """
        fastqc {input.rawread} --threads {threads} -o {params.path}
        """
rule fastp:


     input:
         read1="rawReads/{sra}_1.fastq.gz",
         read2="rawReads/{sra}_2.fastq.gz"
     output:
         forwardpaired="trimmedreads/{sra}_1P.fastq.gz",
         reversepaired="trimmedreads/{sra}_2P.fastq.gz",
         report_html=trimmedreads{sra}_{frr}_fastq.html
     threads:
         4
     shell:
         """
         fastp -i {input[read1]} -I{input[read2]} -o{output[read1]} -O {output[read2]}
         """

In your rule fastp, how is Snakemake supposed to match {frr} to anything? You also need quotes there. In addition, the input and output variables aren't indexed with [] but accessed with a dot, as you did in rawFastqc.
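
The {frr} point can be illustrated outside Snakemake. As far as the Snakemake documentation describes it, wildcard values are inferred from the requested output file, so every output of a rule has to contain the same set of wildcards. The sketch below is plain Python, not Snakemake's own code; `wildcard_names` is a hypothetical helper that only extracts the `{name}` placeholders, and the third pattern is the report path from the question with the missing quotes added.

```python
import re


def wildcard_names(pattern):
    """Return the set of {name} placeholders in a path pattern (hypothetical helper)."""
    return set(re.findall(r"\{(\w+)\}", pattern))


# Output patterns of rule fastp as posted in the question.
fastp_outputs = [
    "trimmedreads/{sra}_1P.fastq.gz",
    "trimmedreads/{sra}_2P.fastq.gz",
    "trimmedreads{sra}_{frr}_fastq.html",
]

names_per_output = [wildcard_names(p) for p in fastp_outputs]

# True only if every output pattern uses exactly the same wildcards.
consistent = all(names == names_per_output[0] for names in names_per_output)

for pattern, names in zip(fastp_outputs, names_per_output):
    print(pattern, sorted(names))
print("all outputs share the same wildcards:", consistent)
```

Only the report path mentions {frr}, so there is nothing in the other two outputs from which its value could be derived.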


In addition, rule all only lists inputs for the FastQC rule, not for the fastp rule. In the fastp rule, the HTML output is not quoted like the other outputs. Threads are declared in the fastp rule but never used in its shell command.
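
Why the missing fastp entry in rule all matters can be sketched with a simplified emulation: Snakemake works backwards from the requested target files and only schedules rules whose output patterns match them. The code below is plain Python under that assumption, not Snakemake's actual matching logic; `pattern_to_regex`, `rules_needed`, and the sample ID `SRR000001` are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Turn a path pattern with {wildcards} into a regex (simplified emulation)."""
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        # Even positions are literal text, odd positions are wildcard names.
        regex += re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+)"
    return re.compile(regex + "$")


# Output patterns per rule, taken from the pipeline under discussion.
rule_outputs = {
    "rawFastqc": ["rawQC/{sra}_{frr}_fastqc.html"],
    "fastp": ["trimmedreads/{sra}_1P.fastq.gz", "trimmedreads/{sra}_2P.fastq.gz"],
}


def rules_needed(targets):
    """Return the names of the rules whose outputs match any requested target."""
    needed = set()
    for target in targets:
        for rule, patterns in rule_outputs.items():
            if any(pattern_to_regex(p).match(target) for p in patterns):
                needed.add(rule)
    return needed


# Targets of the kind rule all requests in the question: FastQC reports only.
targets = ["rawQC/SRR000001_1_fastqc.html", "rawQC/SRR000001_2_fastqc.html"]
print(rules_needed(targets))  # prints: {'rawFastqc'}
```

Adding a trimmed read file or the fastp report to the targets is what brings the fastp rule into the plan.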


"I feel I'm going wrong or code is incorrect"

Does the snakemake interpreter agree with your feelings that the code is incorrect? What does it say exactly when you try to run a rule?


When I dry-run the code I get the message below. I expected it to list the fastqc and fastp jobs that would run, which is why I feel something is not right.

[Screenshot: dry-run output]

  1. Please show us everything including the command you're running.
  2. Please do not paste screenshots of plain-text content; it is counterproductive. You can copy and paste the content directly here (using the code formatting option shown below).

[Image: code formatting option]

By the way, I've merged your comments into a single comment and cleaned up a little.


I will take care of it, and thanks for the suggestions.


Can you post the command you are running? Since it is a dry run, there would not be any running jobs (or job IDs), AFAIK. But if you want to check which programs will be executed and with which parameters, try adding -p to the dry-run command.
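
A minimal sketch of such a dry run, wrapped in Python so the exact command is easy to copy into a post. The flags are Snakemake's documented short options (-n for dry run, -p to print shell commands, -s for the Snakefile path); the helper names `dry_run_command` and `run_dry_run` and the default file name `Snakefile` are assumptions for illustration.

```python
import shutil
import subprocess


def dry_run_command(snakefile="Snakefile"):
    """Build a snakemake dry-run command that also prints the shell commands."""
    # -n: dry run, -p: print shell commands, -s: path to the Snakefile
    return ["snakemake", "-n", "-p", "-s", snakefile]


def run_dry_run(snakefile="Snakefile"):
    """Run the dry run if snakemake is on PATH; return its exit code, else None."""
    if shutil.which("snakemake") is None:
        return None
    return subprocess.run(dry_run_command(snakefile), check=False).returncode


print(" ".join(dry_run_command()))  # prints: snakemake -n -p -s Snakefile
```

Calling `run_dry_run()` from the directory that contains the Snakefile would execute the printed command; nothing is run by the snippet itself.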

2.5 years ago

As explained in the comments, there are several issues with the code you posted. This is only a guess of what you are trying to do and hopefully it will move you forward:

SRA,FRR = glob_wildcards("rawReads/{sra}_{frr}.fastq.gz")

rule all:
    input:
        # FastQC reports for every raw read file
        expand("rawQC/{sra}_{frr}_fastqc.{extension}", sra=SRA, frr=FRR, extension=["zip","html"]),
        # fastp report for every sample
        expand("trimmedreads/{sra}_fastq.html", sra=SRA),


rule rawFastqc:
    input:
        rawread="rawReads/{sra}_{frr}.fastq.gz",
    output:
        # FastQC writes <name>_fastqc.zip and <name>_fastqc.html, not a .gz file
        archive="rawQC/{sra}_{frr}_fastqc.zip",
        html="rawQC/{sra}_{frr}_fastqc.html",
    threads:
        1
    params:
        path="rawQC/",
    shell:
        """
        fastqc {input.rawread} --threads {threads} -o {params.path}
        """


rule fastp:
    input:
        read1="rawReads/{sra}_1.fastq.gz",
        read2="rawReads/{sra}_2.fastq.gz",
    output:
        read1="trimmedreads/{sra}_1P.fastq.gz",
        read2="trimmedreads/{sra}_2P.fastq.gz",
        # Report path must match the one requested in rule all
        report_html="trimmedreads/{sra}_fastq.html",
    threads: 4
    shell:
        """
        fastp --thread {threads} -i {input.read1} -I {input.read2} -o {output.read1} -O {output.read2} -h {output.report_html}
        """
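
One detail of the expand() call in rule all is worth seeing in isolation. glob_wildcards returns one entry per matched file, so SRA and FRR contain repeated values, and expand() combines its arguments as a Cartesian product by default. The sketch below emulates that in plain Python with hypothetical file names; the parsing with `rsplit` is a stand-in for Snakemake's own wildcard matching, not a copy of it.

```python
from itertools import product

# Hypothetical file names; in the pipeline these would come from rawReads/.
files = [
    "SRR000001_1.fastq.gz", "SRR000001_2.fastq.gz",
    "SRR000002_1.fastq.gz", "SRR000002_2.fastq.gz",
]

# Emulate glob_wildcards: one entry per matched file, so values repeat.
SRA, FRR = [], []
for name in files:
    stem = name[: -len(".fastq.gz")]
    sra, frr = stem.rsplit("_", 1)
    SRA.append(sra)
    FRR.append(frr)

pattern = "rawQC/{sra}_{frr}_fastqc.html"

# Cartesian product, which is what expand() does by default: duplicates appear.
by_product = [pattern.format(sra=s, frr=f) for s, f in product(SRA, FRR)]

# Pairwise combination, analogous to expand(pattern, zip, sra=SRA, frr=FRR).
by_zip = [pattern.format(sra=s, frr=f) for s, f in zip(SRA, FRR)]

print(len(by_product), len(set(by_product)), len(by_zip))  # prints: 16 4 4
```

The product still resolves to the right four files here because every sample has both reads; passing zip to expand() avoids the duplicates and also covers the case where samples differ in which reads exist.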

Can you please change

 fastp --thread {threads} -i {input[read1]} -I {input[read2]} -o {output[read1]} -O {output[read2]} -h {output[report_html]}

to

fastp --thread {threads} -i {input.read1} -I {input.read2} -o {output.read1} -O {output.read2} -h {output.report_html}

?

Please also remove the extra comma on the second line of the rule all input and after report_html at the end of the fastp rule output.


cpad0112, thanks for the suggestions. I agree that {input.read1} is better than {input[read1]}, but I didn't want to edit the original code too much. In fact, there are a few other things I would change. Regarding the comma in rule all, I prefer to have all items in input, output, and params terminated by a comma (I vaguely remember a Snakemake formatting guideline recommending it too). Again, my code is not consistent on this, for the same reason as above.
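
One practical argument for the trailing comma comes from Python itself, which Snakefiles build on: adjacent string literals are concatenated, so a forgotten comma between two unnamed items silently joins them into one path. The sketch below is plain Python with hypothetical file names.

```python
# With a comma after every item, each path stays a separate entry.
with_commas = [
    "rawReads/sample_1.fastq.gz",
    "rawReads/sample_2.fastq.gz",
]

# Without the comma, Python joins the two adjacent string literals.
missing_comma = [
    "rawReads/sample_1.fastq.gz"
    "rawReads/sample_2.fastq.gz"
]

print(len(with_commas), len(missing_comma))  # prints: 2 1
print(missing_comma[0])
```

Terminating every item with a comma means a line can be added or reordered later without this mistake creeping in.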


Thanks for the code comments; I agree with you on them, @dariober.


Hey, a big thank you for this. I understand where I was going wrong. This code ran, but there is an issue in the rule for fastqc, which I have posted about separately.

Thanks again.
