Question: Snakemake wildcards: Not all output, log and benchmark files of rule sortBAM_bbb contain the same wildcards.
Asked 8 months ago by whelenakanya:
rule sortBAM_bbb:
    input:
        inBAM = '{PATH}/{sample}.bam'
    output:
        outBAM = '{PATH}/{sample}.biobambam.sorted.bam'
    params:
        sortby = config['sortby_bbb'],
    log:
        log_samtools = config['logs']['bamUtil'] + '{PATH}/sortBAM_bbb_{sample}.' + strftime("%Y-%m-%d.%H-%M-%S", localtime()) + '.samtools.log',
        log_bamsort = config['logs']['bamUtil'] + '{PATH}/sortBAM_bbb_{sample}.' + strftime("%Y-%m-%d.%H-%M-%S", localtime()) + '.bamsort.log'
    shell:
        'samtools view -b {input.inBAM} &> {log.log_samtools} | bamsort SO={params.sortby} tmpfile={output.outBAM}[:-4].tmp > {output.outBAM} &> {log.log_bamsort}'

I am wondering why this piece of code returns this error:

SyntaxError:
Not all output, log and benchmark files of rule sortBAM_bbb contain the same wildcards. This is crucial though, in order to avoid that two or more jobs write to the same file.

I tried the following code and it works; I just don't want the log files to be buried under many directories (the initial PATH wildcard is about 3-4 directories deep).

rule sortBAM_bbb:
    input:
        inBAM = '{sample}.bam'
    output:
        outBAM = '{sample}.biobambam.sorted.bam'
    params:
        sortby = config['sortby_bbb'],
    log:
        log_samtools = config['logs']['bamUtil'] + '/sortBAM_bbb/{sample}.' + strftime("%Y-%m-%d.%H-%M-%S", localtime()) + '.samtools.log',
        log_bamsort = config['logs']['bamUtil'] + '/sortBAM_bbb/{sample}.' + strftime("%Y-%m-%d.%H-%M-%S", localtime()) + '.bamsort.log'
    shell:
        'samtools view -b {input.inBAM} &> {log.log_samtools} | bamsort SO={params.sortby} tmpfile={output.outBAM}[:-4].tmp > {output.outBAM} &> {log.log_bamsort}'
Answer, written 8 months ago by ale_abd:

I think the error comes from how you are running Snakemake (that is, how you are calling the target rule). I reproduced your first rule, and it works for me.
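The error message is about the set of wildcard names used in each output, log and benchmark pattern. A rough Python sketch of that comparison may help when checking a rule by hand. It is a simplification, not Snakemake's actual code: the helper names `wildcard_names` and `mismatched_patterns` are made up for illustration, the timestamp part of the log names is left out because it adds no wildcards, and `log_without_path` is a hypothetical pattern that drops `{PATH}` to keep the logs shallow.

```python
import re

def wildcard_names(pattern):
    """Return the set of wildcard names in a pattern such as '{PATH}/{sample}.bam'."""
    # Matches {name} and {name,constraint}; nested braces are not handled.
    return set(re.findall(r"\{\s*(\w+)\s*(?:,[^{}]*)?\}", pattern))

def mismatched_patterns(patterns):
    """Return the patterns whose wildcard names differ from the first pattern's."""
    expected = wildcard_names(patterns[0])
    return [p for p in patterns[1:] if wildcard_names(p) != expected]

output = "{PATH}/{sample}.biobambam.sorted.bam"
log_with_path = "logs/{PATH}/sortBAM_bbb_{sample}.samtools.log"
# Hypothetical pattern: a log name that omits {PATH}.
log_without_path = "logs/sortBAM_bbb/{sample}.samtools.log"

print(mismatched_patterns([output, log_with_path]))     # []
print(mismatched_patterns([output, log_without_path]))  # ['logs/sortBAM_bbb/{sample}.samtools.log']
```

Under these assumptions the rule as posted uses `PATH` and `sample` everywhere, so it passes; a log pattern that leaves out `{PATH}` while the output keeps it would be flagged.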

Snakefile:

rule sortBAM_bbb:
    input:
        inBAM = '{PATH}/{sample}.bam'
    output:
        outBAM = '{PATH}/{sample}.biobambam.sorted.bam'
    params:
        sortby = config['sortby_bbb'],
    log:
        log_samtools = config['logs']['bamUtil'] + '{PATH}/sortBAM_bbb_{sample}.' + '_local_time_func_'  + '.samtools.log',
        log_bamsort = config['logs']['bamUtil'] + '{PATH}/sortBAM_bbb_{sample}.' +'_local_time_func_' + '.bamsort.log'
    shell:
        'samtools view -b {input.inBAM} &> {log.log_samtools} | bamsort SO={params.sortby} tmpfile={output.outBAM}[:-4].tmp > {output.outBAM} &> {log.log_bamsort}'

Config file (config.yaml):

sortby_bbb: "ASC"
logs:
 bamUtil: "logs/"

Directory structure:

.
├── config.yaml
├── my_path
│   └── another_level
│       ├── SXXXX.bam
│       └── SYYYY.bam
└── Snakefile

Command:

  snakemake my_path/another_level/{SXXXX,SYYYY}.biobambam.sorted.bam -np --configfile config.yaml
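The shell expands `{SXXXX,SYYYY}` into two file names, and Snakemake then matches each requested file against the rule's output pattern to fill in the wildcards. A simplified Python sketch of that matching follows. It assumes every wildcard behaves like the regex `.+` (Snakemake's default when no constraint is given) and that no wildcard name repeats within a pattern; `pattern_to_regex` and `infer_wildcards` are hypothetical helpers, not part of Snakemake.

```python
import re

def pattern_to_regex(pattern):
    """Turn '{PATH}/{sample}.bam' into a compiled regex with one named group per wildcard."""
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between wildcards must match exactly.
        parts.append(re.escape(pattern[pos:m.start()]))
        # Each wildcard greedily matches one or more characters.
        parts.append(f"(?P<{m.group(1)}>.+)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))

def infer_wildcards(pattern, target):
    """Return the wildcard values for a target file, or None if it does not match."""
    m = pattern_to_regex(pattern).fullmatch(target)
    return m.groupdict() if m else None

pattern = "{PATH}/{sample}.biobambam.sorted.bam"
print(infer_wildcards(pattern, "my_path/another_level/SYYYY.biobambam.sorted.bam"))
# {'PATH': 'my_path/another_level', 'sample': 'SYYYY'}
```

Because the match is greedy, `PATH` absorbs every directory level up to the last slash, which is why a path that is several directories deep still resolves to a single `PATH` value.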

Dry run result:

rule sortBAM_bbb:
    input: my_path/another_level/SYYYY.bam
    output: my_path/another_level/SYYYY.biobambam.sorted.bam
    log: logs/my_path/another_level/sortBAM_bbb_SYYYY._local_time_func_.samtools.log, logs/my_path/another_level/sortBAM_bbb_SYYYY._local_time_func_.bamsort.log
    jobid: 0
    wildcards: PATH=my_path/another_level, sample=SYYYY

samtools view -b my_path/another_level/SYYYY.bam &> logs/my_path/another_level/sortBAM_bbb_SYYYY._local_time_func_.samtools.log | bamsort SO=ASC tmpfile=my_path/another_level/SYYYY.biobambam.sorted.bam[:-4].tmp > my_path/another_level/SYYYY.biobambam.sorted.bam &> logs/my_path/another_level/sortBAM_bbb_SYYYY._local_time_func_.bamsort.log

rule sortBAM_bbb:
    input: my_path/another_level/SXXXX.bam
    output: my_path/another_level/SXXXX.biobambam.sorted.bam
    log: logs/my_path/another_level/sortBAM_bbb_SXXXX._local_time_func_.samtools.log, logs/my_path/another_level/sortBAM_bbb_SXXXX._local_time_func_.bamsort.log
    jobid: 1
    wildcards: PATH=my_path/another_level, sample=SXXXX

samtools view -b my_path/another_level/SXXXX.bam &> logs/my_path/another_level/sortBAM_bbb_SXXXX._local_time_func_.samtools.log | bamsort SO=ASC tmpfile=my_path/another_level/SXXXX.biobambam.sorted.bam[:-4].tmp > my_path/another_level/SXXXX.biobambam.sorted.bam &> logs/my_path/another_level/sortBAM_bbb_SXXXX._local_time_func_.bamsort.log
Job counts:
    count   jobs
    2   sortBAM_bbb
    2

So be careful how you are calling your target rule ;)
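A side note on the shell line, unrelated to the wildcard error: the dry run prints `tmpfile=...sorted.bam[:-4].tmp`, so the `[:-4]` is passed through as literal text rather than applied as a Python slice. The sketch below reproduces that with plain `str.format`, which stands in for Snakemake's own formatting here (an assumption), and shows the slice done in Python instead.

```python
from types import SimpleNamespace

# Stand-in for Snakemake's `output` object, with the file name from the dry run.
output = SimpleNamespace(outBAM="my_path/another_level/SYYYY.biobambam.sorted.bam")

# Only the part inside the braces is substituted; '[:-4]' stays as literal text.
literal = "tmpfile={output.outBAM}[:-4].tmp".format(output=output)
print(literal)
# tmpfile=my_path/another_level/SYYYY.biobambam.sorted.bam[:-4].tmp

# Doing the slice in Python first gives the intended temporary-file prefix.
tmp_prefix = output.outBAM[:-4] + ".tmp"
print("tmpfile={tmp}".format(tmp=tmp_prefix))
# tmpfile=my_path/another_level/SYYYY.biobambam.sorted.tmp
```

In the Snakefile, one option would be a params entry such as `tmp = lambda wildcards, output: output.outBAM[:-4] + '.tmp'`, referenced as `{params.tmp}` in the shell string. Also note that in bash `&>` redirects both stdout and stderr, so `samtools view ... &> log | bamsort ...` sends nothing down the pipe and `> out &> log` writes the sorted output to the log; `2>` is probably what was intended for both log files.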
