Error when using snakemake with RSEM
3.6 years ago
nattzy94 ▴ 50

I am trying to run RSEM on a number of samples using Snakemake. However, when I run the workflow on the cluster with snakemake --use-conda --cluster "qsub -pe smp 12" -j 10 -s ./rsem_snakefile, I get this error:

Activating conda environment: /Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66
/bin/bash: /Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66/bin/rsem-calculate-expression -p 12 --paired-end --star-gzipped-read-file /Nathaniel/raw_data/rna_seq/m19-32h_1.fastq.gz /Nathaniel/raw_data/rna_seq/m19-32h_2.fastq.gz --star --star-path /home/STAR-2.7.3a/bin/Linux_x86_64_static /Nathaniel/customgtf_rnaseq/ensemblgtf_results/rsem/ensembl99 --output-genome-bam /Nathaniel/rsem_snakemake/m19-32h/m19-32h > /Nathaniel/rsem_snakemake/m19-32h/m19-32h_rsem.log: No such file or directory

My rsem_snakefile is as follows:

import json
from os.path import join, basename, dirname

configfile: '/Nathaniel/snakemake-tutorial/config.yml'

OUT_DIR = config['OUT_DIR']
FILES = json.load(open(config['SAMPLES_JSON']))
SAMPLES = sorted(FILES.keys())

rule all:
        input:
                [OUT_DIR + "/" + x for x in expand('{sample}/{sample}.transcript.bam', sample = SAMPLES)]

rule rsem:
        input:
                r1 = lambda wildcards: FILES[wildcards.sample]['R1'],
                r2 = lambda wildcards: FILES[wildcards.sample]['R2']
        output:
                join(OUT_DIR,'{sample}','{sample}.transcript.bam')
        conda:
                '/Nathaniel/rsem_snakemake/rsem.yaml'
        log:
                '/Nathaniel/rsem_snakemake/{sample}/{sample}_rsem.log'
        shell:
                """
                "/Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66/bin/rsem-calculate-expression -p 12 --paired-end --star-gzipped-read-file {input.r1} {input.r2} --star --star-path /STAR-2.7.3a/bin/Linux_x86_64_static /Nathaniel/customgtf_rnaseq/ensemblgtf_results/rsem/ensembl99 --output-genome-bam /Nathaniel/rsem_snakemake/{wildcards.sample}/{wildcards.sample} > {log}"
                """

config.yml looks like this:

OUT_DIR: '/Nathaniel/rsem_snakemake'
SAMPLES_JSON: '/Nathaniel/snakemake-tutorial/samples.json'

First few lines of samples.json:

{
    "16hr-33": {
        "R1": [
            "/Nathaniel/raw_data/rna_seq/16hr-33_1.fastq.gz"
        ],
        "R2": [
            "/Nathaniel/raw_data/rna_seq/16hr-33_2.fastq.gz"
        ]
    },
    "32hr-25": {
        "R1": [
            "/Nathaniel/raw_data/rna_seq/32hr-25_1.fastq.gz"
        ],
        "R2": [
            "/Nathaniel/raw_data/rna_seq/32hr-25_2.fastq.gz"
        ]
    },
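
For readers following along, here is a minimal sketch of how the Snakefile above turns this JSON into targets and inputs. It assumes an inline copy of the first sample entry in place of the real samples.json, and it replaces Snakemake's expand() with a plain list comprehension, so it only illustrates the data flow and is not the workflow itself.

```python
import json

# Inline stand-in for samples.json (assumption: only the first entry shown above).
samples_json = """
{
    "16hr-33": {
        "R1": ["/Nathaniel/raw_data/rna_seq/16hr-33_1.fastq.gz"],
        "R2": ["/Nathaniel/raw_data/rna_seq/16hr-33_2.fastq.gz"]
    }
}
"""

OUT_DIR = "/Nathaniel/rsem_snakemake"
FILES = json.loads(samples_json)
SAMPLES = sorted(FILES.keys())

# Plain-Python stand-in for the expand(...) call in rule all.
targets = [OUT_DIR + "/" + s + "/" + s + ".transcript.bam" for s in SAMPLES]

# What the lambda for r1 in rule rsem returns for one sample: a one-element list.
r1 = FILES["16hr-33"]["R1"]

print(targets)
print(r1)
```

Each sample therefore yields one target BAM path under OUT_DIR, and each of r1 and r2 is a list of FASTQ paths rather than a bare string.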

Could this be a problem where the cluster is unable to activate the Conda environment? I can't figure this out.

3.6 years ago
nattzy94 ▴ 50

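Figured out what was wrong: the shell command in the rsem_snakefile should not be wrapped in double quotes. With the quotes, bash treats the whole line, arguments and redirection included, as the name of a single executable, which is why it reported "No such file or directory". It should read: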

shell:
                """
                /Nathaniel/snakemake-tutorial/.snakemake/conda/dcb8dd66/bin/rsem-calculate-expression -p 12 --paired-end --star-gzipped-read-file {input.r1} {input.r2} --star --star-path /STAR-2.7.3a/bin/Linux_x86_64_static /Nathaniel/customgtf_rnaseq/ensemblgtf_results/rsem/ensembl99 --output-genome-bam /Nathaniel/rsem_snakemake/{wildcards.sample}/{wildcards.sample} > {log}
                """
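
The quoting problem can be reproduced outside Snakemake. The sketch below is only an illustration: it passes a command string to bash -c, which is roughly how a shell: block ends up being executed, and it uses /bin/echo as a stand-in for rsem-calculate-expression. The helper name run_in_bash is made up for this example.

```python
import os
import subprocess


def run_in_bash(command):
    """Run a command string through `bash -c` and capture its output."""
    # LC_ALL=C keeps bash's error messages in English.
    env = {**os.environ, "LC_ALL": "C"}
    return subprocess.run(
        ["bash", "-c", command], capture_output=True, text=True, env=env
    )


# Whole command wrapped in double quotes: bash looks for an executable
# literally named "/bin/echo hello" (with the space) and cannot find it.
quoted = run_in_bash('"/bin/echo hello"')

# Same command without the surrounding quotes: runs normally.
unquoted = run_in_bash("/bin/echo hello")

print(quoted.returncode, quoted.stderr.strip())
print(unquoted.returncode, unquoted.stdout.strip())
```

The quoted variant exits with a non-zero status and a "No such file or directory" message, just like the error in the question, while the unquoted variant prints hello.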