SyntaxError (Perhaps you forgot a comma?) in snakefile when running FastQC.
10 months ago

I am new to Snakemake and am trying to run FastQC on multiple paired-end samples. The fastq.gz files are named {sample}_L001_R{read}_001.fastq.gz, where {sample} is the sample name and {read} is either 1 or 2. I want to combine the individual FastQC reports into one report using MultiQC. However, I keep getting an error saying that I am missing a comma on line 29. I am running this on my university's high-performance computing cluster, so I have to load these modules before running the script: module load miniconda/4.12.0, module load snakemake/7.17.1 and conda activate snakemake-7.17.1. My Snakefile is below. I could use all the help I can get!

# Create a list of strings containing all of our sample names
SAMPLES = ['DC10J_S4', 'DC11K_218_359_S6', 'DC12L_S5', 'DC1A_278_299_S1', 'DC2B_S1', 'DC3C_266_311_S2', 'DC5E_254_323_S3','DC6F_S2','DC7G_242_335_S4','DC8H_S3','DC9I_230_347_S5']
READS = ['1', '2']

rule all:
    input:
        expand("/projectnb/altcells/ribosomal-profiling/data/{sample}_L001_R{read}_001.fastq.gz", sample=SAMPLES, read=READS)
        #"/projectnb/altcells/ribosomal-profiling/data/FastQC_output/multiqc_report.html"

# run fastqc
# Rule to generate FastQC reports for each sample

rule fastqc:
    input:
        fastq1 = "/projectnb/altcells/ribosomal-profiling/data/{sample}_L001_R{read}_001.fastq.gz",
        fastq2 = "/projectnb/altcells/ribosomal-profiling/data/{sample}_L001_R{read}_001.fastq.gz" # THIS IS LINE 29. MISSING A COMMA HERE?

    output:
        fastqc_report1 = "/projectnb/altcells/ribosomal-profiling/data/FastQC_output/{sample}_L001_R1_001_fastqc.html",
        fastqc_report2 = "/projectnb/altcells/ribosomal-profiling/data/FastQC_output/{sample}_L001_R2_001_fastqc.html",
        fastqc_zip1 = "/projectnb/altcells/ribosomal-profiling/data/FastQC_output/{sample}_L001_R1_001_fastqc.zip",
        fastqc_zip2 = "/projectnb/altcells/ribosomal-profiling/data/FastQC_output/{sample}_L001_R2_001_fastqc.zip"

    shell: """
        module load fastqc/0.11.7
        fastqc {input.fastq1} -o /projectnb/altcells/ribosomal-profiling/data/FastQC_output &&
        fastqc {input.fastq2} -o /projectnb/altcells/ribosomal-profiling/data/FastQC_output
        """

# Rule to aggregate all FastQC reports into a single HTML file
rule aggregate_fastqc:
    input:
        expand("/projectnb/altcells/ribosomal-profiling/data/FastQC_output/{sample}_L001_R{read}_001_fastqc.html", sample=SAMPLES, read=READS)
    output:
        "/projectnb/altcells/ribosomal-profiling/data/FastQC_output/multiqc_report.html"
    shell: """
        module load multiqc &&
        multiqc /projectnb/altcells/ribosomal-profiling/data/FastQC_output -o /projectnb/altcells/ribosomal-profiling/data/FastQC_output
        """
10 months ago

fastq.gz".

What is the dot after fastq.gz"?


Sorry, that was a typo I made while copying my code! I still get the same error.

10 months ago

There is no : after rule fastqc


That didn't work either.
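
A separate point that may be worth checking, although it would not by itself produce the SyntaxError: in rule fastqc the input paths contain {read}, while the output paths hard-code R1 and R2 and only contain {sample}. Snakemake fills in wildcards from the output files, so a wildcard that appears only in the input cannot be determined once the file does parse. The plain-Python sketch below illustrates that check. It does not call Snakemake; `wildcard_names` and `undetermined_wildcards` are hypothetical helper names, and the simple regex assumes wildcards without inline constraints.

```python
import re

DATA = "/projectnb/altcells/ribosomal-profiling/data"


def wildcard_names(pattern):
    """Return the set of {name} placeholders in a path pattern.

    Hypothetical helper, not part of Snakemake. Assumes plain {name}
    wildcards without inline regex constraints such as {name,[12]}.
    """
    return set(re.findall(r"\{(\w+)\}", pattern))


def undetermined_wildcards(inputs, outputs):
    """Return wildcards used in the inputs that no output pattern defines."""
    in_names = set().union(*(wildcard_names(p) for p in inputs))
    out_names = set().union(*(wildcard_names(p) for p in outputs))
    return in_names - out_names


# Patterns copied from rule fastqc in the question.
fastqc_inputs = [
    DATA + "/{sample}_L001_R{read}_001.fastq.gz",
]
fastqc_outputs = [
    DATA + "/FastQC_output/{sample}_L001_R1_001_fastqc.html",
    DATA + "/FastQC_output/{sample}_L001_R2_001_fastqc.html",
    DATA + "/FastQC_output/{sample}_L001_R1_001_fastqc.zip",
    DATA + "/FastQC_output/{sample}_L001_R2_001_fastqc.zip",
]

# Wildcards that the outputs give Snakemake no way to fill in.
missing = undetermined_wildcards(fastqc_inputs, fastqc_outputs)
print(missing)
```

One way around this, as a suggestion rather than a confirmed fix for this thread, is to let the rule handle a single file, with output paths that contain both {sample} and {read}, so that the expand(...) in aggregate_fastqc requests one job per sample and read.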
