Running python script in Snakemake without input/output
4.5 years ago
Hansen_869 ▴ 80

Hi, I have a very simple question. Is it possible to run a Python script in a Snakefile without specifying an input or output? I have a Python script that renames some of my generated output. However, the input files are already specified inside the script. Is there a way to just run the script? My thought would be:

rule pythonscript:
    shell:
        "script.py"

Is this possible?

Snakemake • 9.4k views

I would just give the rule its input and output files; why not, if you have them? Otherwise Snakemake never knows whether something went wrong. If you really want to do this, a temp file may be a solution: https://stackoverflow.com/questions/45624969/is-there-a-way-to-chain-snakemake-rules-without-touch-files


Thanks for your answer. The problem with giving input and output is that I don't know the names of the output files. The renaming is based on some other files, which vary according to my input file, so I can't tell Snakemake the names of the output.


Maybe this can also help, though I don't know the details myself: https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#dynamic-files I understand these things can be difficult in Snakemake.

EDIT: You could also build some checks into the Python script, and if there are no errors have the script create a "successful_complete" file or something similar.

EDIT2: Another option is to create a wildcard before the rule.
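
To make the "successful_complete" idea from the first EDIT concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name rename_and_mark, the sentinel name rename.done, and the example file names are hypothetical and do not come from the original script.

```python
import os
import tempfile
from pathlib import Path


def rename_and_mark(mapping, workdir, sentinel="rename.done"):
    """Rename files in `workdir` per `mapping` (old name -> new name).

    The sentinel file is written only if every rename succeeds, so a
    Snakemake rule could declare it as its output (hypothetical names).
    """
    workdir = Path(workdir)
    for old, new in mapping.items():
        # Path.rename raises FileNotFoundError if the source is missing,
        # which aborts the function before the sentinel is written.
        (workdir / old).rename(workdir / new)
    marker = workdir / sentinel
    marker.write_text("ok\n")
    return marker


if __name__ == "__main__":
    # Small demonstration in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "bin.001.fa").write_text(">seq\nACGT\n")
        marker = rename_and_mark({"bin.001.fa": "Lactobacillus.fa"}, tmp)
        print(marker.name, sorted(os.listdir(tmp)))
```

A rule could then list the sentinel as its only output, so Snakemake has a fixed file name to track even though the renamed files themselves are not known in advance.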


Can you provide some examples of these input/output files?

4.5 years ago
Eric Lim ★ 2.1k

Not knowing the exact input or output at the time you write the code is relatively common in bioinformatics, and I think Snakemake handles it rather well.

Let's assume taxonomy.txt is formatted as below.

[~/Data/scratch/tmp/biostar/checkpoint]$ cat taxonomy.txt 
bin unique  multi   tax
90-20-09-2018.001   25  15  Lactobacillus
90-20-09-2018.003   24  0   Streptococcus
90-20-09-2018.002   15  0   Lactobacillus_2

There are many ways to accomplish this; below is an example that uses a checkpoint (https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#data-dependent-conditional-execution).

[~/Data/scratch/tmp/biostar/checkpoint]$ cat example.py
import os, csv

def aggregate(wildcards):
    """aggregate paths from `read_input_txt` and return as list""" 
    with open(checkpoints.read_input_txt.get(**wildcards).output[0], 'r') as fin:
        return [loc.rstrip() for loc in fin]

rule:
    input: aggregate

checkpoint read_input_txt:
    """implement logic to turn `taxonomy.txt` into a list of desired files to create"""
    input: 'taxonomy.txt'
    output: 'files_to_be_created.txt'
    run:
        with open(input[0], 'r') as fin, open(output[0], 'w') as fout:
            reader = csv.DictReader(fin, delimiter='\t')
            writer = csv.writer(fout)
            for data in reader:
                writer.writerow([os.path.join('binned', data['bin'].split('.')[0], data['tax'])])

rule create_file:
    """implement logic to create the actual file"""
    output: touch('{prefix}/{file}')

After running snakemake -s example.py:

[~/Data/scratch/tmp/biostar/checkpoint]$ tree binned/
binned/
└── 90-20-09-2018
    ├── Lactobacillus
    ├── Lactobacillus_2
    └── Streptococcus

1 directory, 3 files
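
The path-building step inside the checkpoint can also be tried outside Snakemake. The sketch below is plain Python that mirrors the logic of read_input_txt. It assumes taxonomy.txt is tab-delimited, as the delimiter='\t' in the example implies, and the function name target_paths is hypothetical.

```python
import csv
import io
import os


def target_paths(taxonomy_tsv, root="binned"):
    """Return one path per row: root/<bin without its suffix>/<tax>."""
    reader = csv.DictReader(taxonomy_tsv, delimiter="\t")
    return [
        # "90-20-09-2018.001" -> "90-20-09-2018", as in the checkpoint above
        os.path.join(root, row["bin"].split(".")[0], row["tax"])
        for row in reader
    ]


if __name__ == "__main__":
    # Same rows as the taxonomy.txt shown earlier, assumed tab-separated.
    sample = io.StringIO(
        "bin\tunique\tmulti\ttax\n"
        "90-20-09-2018.001\t25\t15\tLactobacillus\n"
        "90-20-09-2018.003\t24\t0\tStreptococcus\n"
        "90-20-09-2018.002\t15\t0\tLactobacillus_2\n"
    )
    for path in target_paths(sample):
        print(path)
```

Testing the mapping this way makes it easier to adapt the checkpoint to a different naming scheme before wiring it back into the Snakefile.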

I hope you find this helpful and are able to expand the example into a working solution.


Hi Eric, after some tweaking, I think I have a sensible solution! Thanks for your input!
