Question: snakemake and Star aligner code issue
1
gravatar for lawrence.hsu
5 weeks ago by
lawrence.hsu10
lawrence.hsu10 wrote:

I'm running into a nameerror where says 'sample' is unknown in this context. I'm thinking it has to do with the --outFileNamePrefix in the shell where it won't allow {sample}. I've tested it with coding a single sample to be processed and it works fine. I'm not sure how to proceed.

configfile: "config.yaml"
SAMPLES= ['A','B','C']

rule all:
    input:
        expand("01-qc/{sample}_fastqc.html", sample=config['samples']),
        expand('01-qc/{sample}_fastqc.zip', sample=config['samples']),
        expand('02-trimmed/{sample}.fastq_trimming_report.txt', sample=config['samples']),
        expand('02-trimmed/{sample}_trimmed.fq', sample=config['samples']),
        expand("03-qc/{sample}_trimmed_fastqc.html", sample=config['samples']),
        expand('03-qc/{sample}_trimmed_fastqc.zip', sample=config['samples']),
        expand('04-aligned-2pass/{sample}_ID/Aligned.out.sam', sample=SAMPLES),
        expand('04-aligned-2pass/{sample}_ID/Log.final.out', sample=SAMPLES)

rule fastQC01:
    input:
        sample=lambda wildcards: config['samples'][wildcards.sample]
    output:
        html="01-qc/fastqc/{sample}_fastqc.html",
        zip="01-qc/fastqc/{sample}_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file. If not using multiqc, you are free to choose an arbitrary filename
    params: ""
    log:
        "logs/fastqc/{sample}.log"
    shell:
        "fastqc {input} -o 01-qc/fastqc/"

rule trimgalore02:
    input:
        sample=lambda wildcards: config['samples'][wildcards.sample]
    output:
        report="02-trimmed/{sample}.fastq_trimming_report.txt",
        fq="02-trimmed/{sample}_trimmed.fq"
    log:
        "logs/trimgalore/{sample}.log"
    shell:
        "trim_galore --length 20 -q 20 {input} -o 02-trimmed/"

rule fastQC03:
    input:
        "02-trimmed/{sample}_trimmed.fq"
    output:
        html="03-qc/{sample}_trimmed_fastqc.html",
        zip="03-qc/{sample}_trimmed_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file. If not using multiqc, you are free to choose an arbitrary filename
    params: ""
    log:
        "logs/03-qc/{sample}.log"
    shell:
        "fastqc {input} -o 03-qc/"

rule STAR04:
    input:
        expand("02-trimmed/{sample}_trimmed.fq", sample=SAMPLES)
    output:
           sam=expand("04-aligned-2pass/{sample}_ID/Aligned.out.sam", sample=SAMPLES),
           sum=expand("04-aligned-2pass/{sample}_ID/Log.final.out", sample=SAMPLES)
    log:
        expand("logs/star/{sample}.log", sample=SAMPLES)
    shell:
        "/data/p_magnuson_lab/bin/STAR-2.6.0c/bin/Linux_x86_64/STAR "
        "--genomeDir /data/p_magnuson_lab/reference/mouse/STAR_index "
        "--runThreadN 8 "
        "--readFilesCommand zcat " 
        "--twopassMode Basic "
        "--alignEndsType EndToEnd "
        "--alignIntronMax 1 "
        "--readFilesIn {input} "
        "--outFileNamePrefix 04-aligned-2pass/{sample}_ID/"
snakemake • 172 views
ADD COMMENTlink modified 5 weeks ago by letlovelead13130 • written 5 weeks ago by lawrence.hsu10
2

You don't want expand in a wildcard rule - leave that for your targets. A wildcard rule is a recipe, not a manifest.

ADD REPLYlink written 5 weeks ago by Jeremy Leipzig18k

Have you tried running by changing {sample} to {wildcards.sample} in your shell commands?

ADD REPLYlink written 5 weeks ago by patelk26100

Yes I have tried that but it comes back with the error: 'Wildcards' object has no attribute 'sample'

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by lawrence.hsu10
2

You get the error because STAR04 doesn't have any wildcard. {sample} is already resolved by expand, hence wildcards.sample is not a valid attribute.

Perhaps this post will help. Use a workflow management tool to manage your computational pipelines . Your use of expand in both input and output in STAR04 doesn't make sense.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Eric Lim1.4k

That was my issue. Thanks!

ADD REPLYlink written 4 weeks ago by lawrence.hsu10

May be it should be {wildcards.samples} since you have SAMPLES= ['A','B','C'] at the beginning of your config file and you are trying to refer to that.

ADD REPLYlink written 5 weeks ago by patelk26100
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2325 users visited in the last hour