How do I organize snakemake when not all jobs successfully output files from previous rule?
10 weeks ago

Basically, I have three Snakemake rules (other than rule all) and cannot figure this problem out, even after reading the available resources on checkpoints.

Rule 1 takes the first and only file that I start with. It produces x outputs (the number varies depending on the input file). Each of those x outputs needs to be processed separately in rule 2, meaning that rule 2 will run x jobs. However, only some subset, y, of these jobs will produce outputs (the software only writes out files for inputs that pass a certain threshold). So, while I want each of those outputs to run as a separate job in rule 3, I don't know how many files will come out of rule 2. Rule 3 will run y jobs, one for each successful output from rule 2.

I have two questions. First, how do I write the input for rule 3 without knowing how many files will come out of rule 2? Second, how can I "tell" rule 2 that it is done when the number of output files does not match the number of input files? If I add a fourth rule, I imagine Snakemake would try to re-run rule 2 for the jobs that didn't produce an output file, which would never produce one. Maybe I am missing something in how the checkpoints should be set up?

Something like:

    # this rule runs as a single job
    rule a:
        input: "file.vcf"
        output: ...  # some unknown number of files (x): x_1, x_2, ..., x_n
    # run a separate job for each output of rule a
    rule b:
        input: x_1   # not sure how many inputs there will be
        output: y_1  # not sure how many output files there will be
        # some of the x inputs will produce their corresponding y, but others will produce no output
    # run a separate job in rule c for each output of rule b
    rule c:
        input: y_1   # not sure how many input files there will be
        output: z_1

Input functions might be what you are looking for. There is a lot out there if you google "dynamic snakemake input" etc. In my experience, Snakemake was actually pretty limited until I learned how to write input functions.
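
To illustrate the input-function idea for this particular case, here is a minimal sketch in plain Python, not a tested workflow. It assumes a layout that is not in the question: rule a writes `x/{id}.txt`, rule b writes `y/{id}.txt`, and rule b is changed to always create its output (for example by ending its shell command with `touch {output}`), so that an empty file means the input did not pass the threshold. The helper names `sample_ids`, `passing_ids` and `final_targets` are made up for this example. The commented part at the end shows how the helpers might be plugged into a Snakefile if rules a and b were declared as checkpoints.

```python
from pathlib import Path


def sample_ids(x_dir, suffix=".txt"):
    """Return the sorted IDs of the files that rule a actually wrote."""
    return sorted(p.name[: -len(suffix)] for p in Path(x_dir).glob(f"*{suffix}"))


def passing_ids(x_dir, y_dir, suffix=".txt"):
    """Keep only the IDs whose rule b output exists and is non-empty.

    Assumption: rule b always creates its output file, so an empty
    file means "this input did not pass the threshold".
    """
    keep = []
    for sid in sample_ids(x_dir, suffix):
        y_file = Path(y_dir) / f"{sid}{suffix}"
        if y_file.is_file() and y_file.stat().st_size > 0:
            keep.append(sid)
    return keep


# Hypothetical wiring inside a Snakefile (assumed layout, not executed here):
#
#   def final_targets(wildcards):
#       x_dir = checkpoints.a.get().output[0]   # wait until rule a has run
#       for sid in sample_ids(x_dir):
#           checkpoints.b.get(i=sid)            # wait for every rule b job
#       return expand("z/{i}.txt", i=passing_ids(x_dir, "y"))
#
#   rule all:
#       input: final_targets
```

With the empty-file convention, every rule b job finishes with an output, so Snakemake has no reason to re-run it, and rule c is only requested for the IDs that `passing_ids` returns.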

