Error when using snakemake with RSEM
8 months ago
nattzy94 ▴ 20

I am trying to run RSEM on several samples using Snakemake. However, when I submit the workflow to the cluster with snakemake --use-conda --cluster "qsub -pe smp 12" -j 10 -s ./rsem_snakefile, I get this error:

Activating conda environment: /Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66
/bin/bash: /Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66/bin/rsem-calculate-expression -p 12 --paired-end --star-gzipped-read-file /Nathaniel/raw_data/rna_seq/m19-32h_1.fastq.gz /Nathaniel/raw_data/rna_seq/m19-32h_2.fastq.gz --star --star-path /home/STAR-2.7.3a/bin/Linux_x86_64_static /Nathaniel/customgtf_rnaseq/ensemblgtf_results/rsem/ensembl99 --output-genome-bam /Nathaniel/rsem_snakemake/m19-32h/m19-32h > /Nathaniel/rsem_snakemake/m19-32h/m19-32h_rsem.log: No such file or directory

My rsem_snakefile is as follows:

import json
from os.path import join, basename, dirname

configfile: '/Nathaniel/snakemake-tutorial/config.yml'

OUT_DIR = config['OUT_DIR']
FILES = json.load(open(config['SAMPLES_JSON']))
SAMPLES = sorted(FILES.keys())

rule all:
        input:
                [OUT_DIR + "/" + x for x in expand('{sample}/{sample}.transcript.bam', sample = SAMPLES)]

rule rsem:
        input:
                r1 = lambda wildcards: FILES[wildcards.sample]['R1'],
                r2 = lambda wildcards: FILES[wildcards.sample]['R2']
        output:
                join(OUT_DIR,'{sample}','{sample}.transcript.bam')
        conda:
                '/Nathaniel/rsem_snakemake/rsem.yaml'
        log:
                '/Nathaniel/rsem_snakemake/{sample}/{sample}_rsem.log'
        shell:
                """
                "/Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66/bin/rsem-calculate-expression -p 12 --paired-end --star-gzipped-read-file {input.r1} {input.r2} --star --star-path /STAR-2.7.3a/bin/Linux_x86_64_static /Nathaniel/customgtf_rnaseq/ensemblgtf_results/rsem/ensembl99 --output-genome-bam /Nathaniel/rsem_snakemake/{wildcards.sample}/{wildcards.sample} > {log}"
                """

config.yml looks like this:

OUT_DIR: '/Nathaniel/rsem_snakemake'
SAMPLES_JSON: '/Nathaniel/snakemake-tutorial/samples.json'

First few lines of samples.json:

{
    "16hr-33": {
        "R1": [
            "/Nathaniel/raw_data/rna_seq/16hr-33_1.fastq.gz"
        ],
        "R2": [
            "/Nathaniel/raw_data/rna_seq/16hr-33_2.fastq.gz"
        ]
    },
    "32hr-25": {
        "R1": [
            "/Nathaniel/raw_data/rna_seq/32hr-25_1.fastq.gz"
        ],
        "R2": [
            "/Nathaniel/raw_data/rna_seq/32hr-25_2.fastq.gz"
        ]
    },
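For reference, the targets that rule all requests and the read files each sample resolves to can be reproduced in plain Python. This is only a sketch: it assumes just the two samples shown above, closes the truncated JSON by hand, and uses a list comprehension in place of Snakemake's expand. The variable names are illustrative.

```python
import json
from os.path import join

# Assumption: only the two samples visible in the excerpt, with the JSON closed by hand.
samples_json = """
{
    "16hr-33": {
        "R1": ["/Nathaniel/raw_data/rna_seq/16hr-33_1.fastq.gz"],
        "R2": ["/Nathaniel/raw_data/rna_seq/16hr-33_2.fastq.gz"]
    },
    "32hr-25": {
        "R1": ["/Nathaniel/raw_data/rna_seq/32hr-25_1.fastq.gz"],
        "R2": ["/Nathaniel/raw_data/rna_seq/32hr-25_2.fastq.gz"]
    }
}
"""

out_dir = "/Nathaniel/rsem_snakemake"  # OUT_DIR from the config file
files = json.loads(samples_json)
samples = sorted(files.keys())

# One target per sample, matching the output pattern of rule rsem.
targets = [join(out_dir, s, s + ".transcript.bam") for s in samples]

# R1 and R2 are lists; in a shell command Snakemake renders a list
# as its items separated by spaces, which is imitated here with join.
reads = {s: (" ".join(files[s]["R1"]), " ".join(files[s]["R2"])) for s in samples}

for target in targets:
    print(target)
```

Because the output pattern and the targets are built from the same OUT_DIR, every target matches rule rsem with the sample name as the wildcard.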

Could this be a problem where the cluster is unable to activate the Conda environment? I can't figure this out.

RNA-Seq snakemake • 342 views
8 months ago
nattzy94 ▴ 20

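Figured out what was wrong: the shell command in rsem_snakefile should not be wrapped in quotes. With the quotes, bash treats the entire line, arguments and redirection included, as the name of a single executable, which is why it reports "No such file or directory". It should read: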

shell:
                """
                /Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66/bin/rsem-calculate-expression -p 12 --paired-end --star-gzipped-read-file {input.r1} {input.r2} --star --star-path /STAR-2.7.3a/bin/Linux_x86_64_static /Nathaniel/customgtf_rnaseq/ensemblgtf_results/rsem/ensembl99 --output-genome-bam /Nathaniel/rsem_snakemake/{wildcards.sample}/{wildcards.sample} > {log}
                """