Question: Snakemake and STAR aligner code issue
8 months ago, lawrence.hsu wrote:

I'm running into a NameError saying that 'sample' is unknown in this context. I think it has to do with --outFileNamePrefix in the shell command, which doesn't seem to accept {sample}. I've tested the rule with a single hard-coded sample and it works fine. I'm not sure how to proceed.

configfile: "config.yaml"
SAMPLES= ['A','B','C']

rule all:
    input:
        expand("01-qc/{sample}_fastqc.html", sample=config['samples']),
        expand('01-qc/{sample}_fastqc.zip', sample=config['samples']),
        expand('02-trimmed/{sample}.fastq_trimming_report.txt', sample=config['samples']),
        expand('02-trimmed/{sample}_trimmed.fq', sample=config['samples']),
        expand("03-qc/{sample}_trimmed_fastqc.html", sample=config['samples']),
        expand('03-qc/{sample}_trimmed_fastqc.zip', sample=config['samples']),
        expand('04-aligned-2pass/{sample}_ID/Aligned.out.sam', sample=SAMPLES),
        expand('04-aligned-2pass/{sample}_ID/Log.final.out', sample=SAMPLES)

rule fastQC01:
    input:
        sample=lambda wildcards: config['samples'][wildcards.sample]
    output:
        html="01-qc/fastqc/{sample}_fastqc.html",
        zip="01-qc/fastqc/{sample}_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file. If not using multiqc, you are free to choose an arbitrary filename
    params: ""
    log:
        "logs/fastqc/{sample}.log"
    shell:
        "fastqc {input} -o 01-qc/fastqc/"

rule trimgalore02:
    input:
        sample=lambda wildcards: config['samples'][wildcards.sample]
    output:
        report="02-trimmed/{sample}.fastq_trimming_report.txt",
        fq="02-trimmed/{sample}_trimmed.fq"
    log:
        "logs/trimgalore/{sample}.log"
    shell:
        "trim_galore --length 20 -q 20 {input} -o 02-trimmed/"

rule fastQC03:
    input:
        "02-trimmed/{sample}_trimmed.fq"
    output:
        html="03-qc/{sample}_trimmed_fastqc.html",
        zip="03-qc/{sample}_trimmed_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file. If not using multiqc, you are free to choose an arbitrary filename
    params: ""
    log:
        "logs/03-qc/{sample}.log"
    shell:
        "fastqc {input} -o 03-qc/"

rule STAR04:
    input:
        expand("02-trimmed/{sample}_trimmed.fq", sample=SAMPLES)
    output:
        sam=expand("04-aligned-2pass/{sample}_ID/Aligned.out.sam", sample=SAMPLES),
        sum=expand("04-aligned-2pass/{sample}_ID/Log.final.out", sample=SAMPLES)
    log:
        expand("logs/star/{sample}.log", sample=SAMPLES)
    shell:
        "/data/p_magnuson_lab/bin/STAR-2.6.0c/bin/Linux_x86_64/STAR "
        "--genomeDir /data/p_magnuson_lab/reference/mouse/STAR_index "
        "--runThreadN 8 "
        "--readFilesCommand zcat " 
        "--twopassMode Basic "
        "--alignEndsType EndToEnd "
        "--alignIntronMax 1 "
        "--readFilesIn {input} "
        "--outFileNamePrefix 04-aligned-2pass/{sample}_ID/"
snakemake
modified 8 months ago by letlovelead13130 • written 8 months ago by lawrence.hsu

You don't want expand in a wildcard rule; leave that for your targets. A wildcard rule is a recipe, not a manifest.

written 8 months ago by Jeremy Leipzig
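
To make the "recipe, not a manifest" point concrete, here is a rough Python sketch of how a wildcard rule behaves: a requested target is matched against the rule's output pattern, the value of {sample} is inferred from that match, and only then are the input and the shell command filled in. This is a simplified model for illustration, not Snakemake's actual implementation. The names pattern_to_regex and plan_job and the shortened SHELL_TEMPLATE are hypothetical, and the patterns assume STAR04 were rewritten without expand.

```python
import re

# Patterns a wildcard version of STAR04 might use (assumption: no expand()).
OUTPUT_PATTERN = "04-aligned-2pass/{sample}_ID/Aligned.out.sam"
INPUT_PATTERN = "02-trimmed/{sample}_trimmed.fq"
# Shortened, hypothetical command template; the real rule has more STAR options.
SHELL_TEMPLATE = (
    "STAR --readFilesIn {input} "
    "--outFileNamePrefix 04-aligned-2pass/{sample}_ID/"
)


def pattern_to_regex(pattern):
    """Turn a pattern with one {sample} placeholder into a compiled regex."""
    prefix, suffix = pattern.split("{sample}")
    return re.compile(re.escape(prefix) + r"(?P<sample>[^/]+)" + re.escape(suffix))


def plan_job(target):
    """Infer the sample from a requested target and fill in input and command.

    Returns None when the target does not match the output pattern.
    """
    match = pattern_to_regex(OUTPUT_PATTERN).fullmatch(target)
    if match is None:
        return None
    sample = match.group("sample")
    input_file = INPUT_PATTERN.format(sample=sample)
    return {
        "sample": sample,
        "input": input_file,
        "shell": SHELL_TEMPLATE.format(input=input_file, sample=sample),
    }


# One job per requested target, each with its own inferred sample.
for name in ["A", "B", "C"]:
    print(plan_job(OUTPUT_PATTERN.format(sample=name)))
```

In Snakefile terms this would correspond to writing the STAR04 input and output as plain patterns containing {sample}, keeping expand only in rule all, and referring to {wildcards.sample} in the shell command.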

Have you tried changing {sample} to {wildcards.sample} in your shell command?

written 8 months ago by patelk26120

Yes, I have tried that, but it comes back with the error: 'Wildcards' object has no attribute 'sample'.

modified 8 months ago • written 8 months ago by lawrence.hsu

You get the error because STAR04 doesn't have any wildcards. {sample} is already resolved by expand, so wildcards.sample is not a valid attribute.

Perhaps this post will help: "Use a workflow management tool to manage your computational pipelines". Your use of expand in both input and output in STAR04 doesn't make sense: it turns the rule into a single job that takes every sample's file as input and is expected to produce every sample's output.

modified 8 months ago • written 8 months ago by Eric Lim
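
The point about expand resolving {sample} can be illustrated with a small Python sketch. The function expand_sketch is a simplified stand-in written for this example, not Snakemake's own expand, and placeholder_names is a hypothetical helper. The sketch shows that once the pattern has been expanded over SAMPLES, the resulting paths contain no {sample} placeholder, so nothing is left for a wildcard to be inferred from.

```python
from itertools import product
from string import Formatter


def expand_sketch(pattern, **values):
    """Simplified stand-in for expand(): fill the pattern with every
    combination of the given values (not Snakemake's implementation)."""
    names = list(values)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(values[name] for name in names))
    ]


def placeholder_names(pattern):
    """Return the {name} placeholders still present in a pattern."""
    return [name for _, name, _, _ in Formatter().parse(pattern) if name]


SAMPLES = ["A", "B", "C"]
pattern = "04-aligned-2pass/{sample}_ID/Aligned.out.sam"
expanded = expand_sketch(pattern, sample=SAMPLES)

# Placeholders in the raw pattern, then in each expanded path.
print(placeholder_names(pattern))
print(expanded)
print([placeholder_names(path) for path in expanded])
```

Under this model, a rule whose output is the expanded list is one job with three concrete output files, which is consistent with both the NameError for {sample} and the missing 'sample' attribute on wildcards.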

That was my issue. Thanks!

written 8 months ago by lawrence.hsu

Maybe it should be {wildcards.samples}, since you have SAMPLES= ['A','B','C'] at the beginning of your Snakefile and you are trying to refer to that.

written 8 months ago by patelk26120