Question

Take user options before calling Snakemake

12 months ago

H.Hasani ▴ 990

Hi all,

I'm writing a workflow that depends on the user's input. For example, if the user chooses analysis A, scripts 1, 2, and 3 will run, while choosing B will run scripts 2, 5, and 7.

My idea is to first create a Python script that uses argparse to provide the needed options. The script will then write the config file for Snakemake. I think this approach is very clean, avoids intertwining the different parts (I've seen other posts), and is scalable.

The only part I don't know how to implement is how to call Snakemake from within this script. The user should not notice that these are two different steps.

My example is:

import argparse

def inputArgs():
    parser = argparse.ArgumentParser()
    # required=True replaces the original default=True, which is not a valid str default
    parser.add_argument("--analysis", help="enter the name of the analysis", type=str, required=True)
    return parser.parse_args()

if __name__ == '__main__':
    args = inputArgs()
    # write args into the config file, one "key: value" pair per line
    with open("config.yaml", "w") as config_file:
        for key, value in vars(args).items():
            config_file.write(f"{key}: {value}\n")

    # call snakemake
    # ???

Thanks for the help

args snakemake sequencing python analysis • 720 views

Not sure what you're looking for when you mention "the user should not notice that these are two different steps." Are you saying you don't want your users to know what rules were invoked to generate the outputs?

Btw, you can do it natively within Snakemake using --config. If you run the following Snakefile with snakemake -j1 --config analysis=A, it triggers rules script1 and script2, whereas analysis=B triggers script2 and script5.

try:
    analysis = config['analysis']
except KeyError:
    print('help message')
    analysis_outputs = []
else:
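    if analysis == 'A':
        analysis_outputs = ['script1.output.txt', 'script2.output.txt']
    elif analysis == 'B':
        analysis_outputs = ['script2.output.txt', 'script5.output.txt']
    else:
        # fail early instead of hitting a NameError in rule analysis
        raise ValueError(f'unknown analysis: {analysis}')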

rule analysis:
    input: analysis_outputs

rule script1:
    output: touch('script1.output.txt')

rule script2:
    output: touch('script2.output.txt')

rule script5:
    output: touch('script5.output.txt')
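
If the list of analyses grows, the same selection could also be written as a dictionary lookup. This is only a sketch in plain Python: the names ANALYSIS_OUTPUTS and outputs_for are made up for illustration, and the output file names are the ones from the Snakefile above.

```python
# Hypothetical helper: maps each analysis name to the files it should produce.
ANALYSIS_OUTPUTS = {
    "A": ["script1.output.txt", "script2.output.txt"],
    "B": ["script2.output.txt", "script5.output.txt"],
}


def outputs_for(config):
    """Return the target files for the analysis named in a config mapping."""
    analysis = config.get("analysis")
    if analysis is None:
        # Same behaviour as the KeyError branch above: show help, build nothing.
        print("help message")
        return []
    if analysis not in ANALYSIS_OUTPUTS:
        raise ValueError(f"unknown analysis: {analysis}")
    return ANALYSIS_OUTPUTS[analysis]
```

Inside the Snakefile, the try/except block would then shrink to analysis_outputs = outputs_for(config).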

12 months ago by Eric Lim ★ 2.1k

Thanks for the answer. No, that is not my intention! The user is presented with some options that are generic and analysis-related; in the background, these options determine how the scripts behave. The options are not directly related to Snakemake, which is why the options Snakemake provides can't help me here. The idea is that the user calls one script and provides the options that are meaningful to them, and that is it, rather than changing the config file every time. This is very helpful, for example, when your input changes and you want to provide a different path. Maybe I'm missing something, but I typically write these kinds of parameters in the config file; the --config option in Snakemake requires that you write the options as a dictionary, which contradicts my intention.

12 months ago by H.Hasani ▴ 990

score 1 · Accepted Answer · 2023-04-11

You can do this with Snakemake's snakemake function in your script. The list of arguments it takes is intimidatingly massive at first glance, but you can see how they line up with the usual command-line arguments, like Snakefile and configfiles. One tool I know of that takes this approach is IgDiscover. Buried deep in its CLI code is the part that actually calls Snakemake.
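
A minimal sketch of what such a wrapper might look like, assuming Snakemake 7 or earlier, where the snakemake function can be imported from the snakemake package (as far as I know, later major versions reorganized the Python API, so check the documentation for your version). The helper names, the Snakefile path, and the config.yaml path are placeholders, not anything taken from IgDiscover.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run the workflow")
    parser.add_argument("--analysis", choices=["A", "B"], required=True,
                        help="name of the analysis to run")
    return parser.parse_args(argv)


def write_config(args, path="config.yaml"):
    """Write each option as a 'key: value' line (fine for simple scalar values)."""
    with open(path, "w") as handle:
        for key, value in vars(args).items():
            handle.write(f"{key}: {value}\n")
    return path


def run_workflow(config_path, snakefile="Snakefile", cores=1):
    # Imported lazily so the rest of the script works without Snakemake installed.
    # Assumption: Snakemake <= 7, which exposes snakemake.snakemake().
    from snakemake import snakemake
    # Mirrors `snakemake --snakefile Snakefile --cores 1 --configfile config.yaml`;
    # the function returns True when the workflow finished successfully.
    return snakemake(snakefile, cores=cores, configfiles=[config_path])


def main(argv=None):
    args = parse_args(argv)
    config_path = write_config(args)
    return 0 if run_workflow(config_path) else 1
```

The user would only ever call the wrapper (for example with --analysis A); writing the config file and running the workflow both happen inside main().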

You also don't necessarily have to write out a config file and pass it back in to Snakemake. You could pass config options directly to that function.
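
Skipping the config file could look roughly like this. Again a sketch assuming Snakemake 7 or earlier, whose snakemake function accepts a config dictionary; config_from_args and run_workflow are hypothetical names chosen for this example.

```python
import argparse


def config_from_args(args: argparse.Namespace) -> dict:
    """Turn parsed options into a plain dict, dropping options that were not set."""
    return {key: value for key, value in vars(args).items() if value is not None}


def run_workflow(config: dict, snakefile: str = "Snakefile", cores: int = 1) -> bool:
    # Assumption: Snakemake <= 7, which exposes snakemake.snakemake().
    from snakemake import snakemake
    # The dict shows up as the global `config` inside the Snakefile,
    # just as values given via --config or a config file would.
    return snakemake(snakefile, cores=cores, config=config)
```

With this, the Snakefile reads config['analysis'] exactly as before, and nothing is written to disk between parsing the options and running the workflow.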