Running python script in Snakemake without input/output
2.9 years ago
Hansen_869 ▴ 40

Hi, I have a very simple question. Is it possible to run a Python script in a Snakefile without specifying an input or output? I have a Python script that renames some of my generated output, but the input files are already specified inside the script. Is there a way to just run the script? My thought would be:

rule pythonscript:
    shell:
        "script.py"


Is this possible?

Snakemake • 6.0k views

I would say just give the rule the input and output files; why not, if you have them? Otherwise Snakemake never knows whether something went wrong. If you really want to do this, a temp file may be a solution: https://stackoverflow.com/questions/45624969/is-there-a-way-to-chain-snakemake-rules-without-touch-files


Thanks for your answer. The problem with giving input and output is that I don't know the names of the output files. The renaming is based on some other files, which vary according to my input file, so I can't tell Snakemake the names of the outputs.


Maybe this can also help, although I don't know the details myself: https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#dynamic-files I understand these things can be difficult in Snakemake.

EDIT: You could also build some checks into the Python script so that, if there are no errors, the script creates a "successful_complete" file or something similar.

EDIT2: Another option is to create a wildcard before the rule.
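
The marker-file idea from the first edit could be sketched roughly as follows. This is only an illustration: the function name `rename_all` and the demo file names are made up, and the real renaming logic would replace the simple loop.

```python
import os
import tempfile
from pathlib import Path


def rename_all(pairs, marker):
    """Rename each (old, new) pair; write the marker only if all succeed."""
    marker = Path(marker)
    # Remove a stale marker so a failed rerun cannot look successful.
    marker.unlink(missing_ok=True)  # needs Python 3.8+
    for old, new in pairs:
        os.rename(old, new)  # raises OSError on failure, so no marker is written
    marker.write_text("ok\n")


if __name__ == "__main__":
    # Demo in a throwaway directory; all file names here are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "bin.001.fa")
        src.touch()
        rename_all([(src, Path(tmp, "Lactobacillus.fa"))],
                   Path(tmp, "successful_complete"))
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

The rule could then declare the marker file as its only output, which gives Snakemake something to track even though the renamed files themselves are not known in advance.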


Can you provide some examples of these input/output files?

2.9 years ago
Eric Lim ★ 1.9k

Not knowing the exact input or output at code time is a relatively common thing in bioinformatics, and I think snakemake handles it rather well.

Let's assume taxonomy.txt is formatted as below.

[~/Data/scratch/tmp/biostar/checkpoint]$ cat taxonomy.txt
bin unique multi tax
90-20-09-2018.001 25 15 Lactobacillus
90-20-09-2018.003 24 0 Streptococcus
90-20-09-2018.002 15 0 Lactobacillus_2

There are many ways to accomplish it, and below is an example usage of checkpoint (https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#data-dependent-conditional-execution).

[~/Data/scratch/tmp/biostar/checkpoint]$ cat example.py
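import os, csv

def aggregate(wildcards):
    """aggregate paths from read_input_txt and return as list"""
    # .get() makes Snakemake wait for the checkpoint, then re-evaluate the DAG
    with open(checkpoints.read_input_txt.get(**wildcards).output[0]) as fin:
        return [loc.rstrip() for loc in fin]

# anonymous first rule acts as the default target
rule:
    input: aggregate

checkpoint read_input_txt:
    """implement logic to turn taxonomy.txt into a list of desired files to create"""
    input: 'taxonomy.txt'
    output: 'files_to_be_created.txt'
    run:
        with open(input[0], 'r') as fin, open(output[0], 'w') as fout:
            # assumes taxonomy.txt is tab-separated with a header row
            reader = csv.DictReader(fin, delimiter='\t')
            writer = csv.writer(fout)
            for data in reader:
                writer.writerow([os.path.join('binned', data['bin'].split('.')[0], data['tax'])])

rule create_file:
    """implement logic to create the actual file"""
    output: touch('{prefix}/{file}')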


After snakemake -s example.py,

[~/Data/scratch/tmp/biostar/checkpoint]$ tree binned/
binned/
└── 90-20-09-2018
    ├── Lactobacillus
    ├── Lactobacillus_2
    └── Streptococcus

1 directory, 3 files
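
To see which files the checkpoint would request without running Snakemake, the path-building expression from the example can be tried as plain Python. This is a minimal sketch: the helper name `target_paths` is made up, and the sample data assumes taxonomy.txt is tab-separated.

```python
import csv
import io
import os

# Sample rows copied from taxonomy.txt; tab separation is an assumption.
TAXONOMY = (
    "bin\tunique\tmulti\ttax\n"
    "90-20-09-2018.001\t25\t15\tLactobacillus\n"
    "90-20-09-2018.003\t24\t0\tStreptococcus\n"
    "90-20-09-2018.002\t15\t0\tLactobacillus_2\n"
)


def target_paths(handle, root="binned"):
    """Return one path per row: <root>/<bin without its .NNN suffix>/<tax>."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        os.path.join(root, row["bin"].split(".")[0], row["tax"])
        for row in reader
    ]


if __name__ == "__main__":
    for path in target_paths(io.StringIO(TAXONOMY)):
        print(path)
```

Each returned path has the form binned/90-20-09-2018/<tax>, matching the three files in the tree listing.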


I hope you find this helpful and are able to expand the example into a working solution.


Hi Eric, after some tweaking, I think I have a sensible solution! Thanks for your input!