Question

Deleted:Snakemake input/output wildcards from Pandas dataframe

2.2 years ago

Shred ★ 1.4k

Hi,

Sorry if this question has already been answered elsewhere, but I'm struggling to find a solution. I have a CSV file describing an experiment, laid out like this:

Sample,condition,patient
132,tumor,A
133,control,A
134,tumor,B
....

Each patient has two files, a tumor and a control one: a fairly common scenario. In Snakemake I have a rule that needs both files ({sample}.bam) as input and needs to use the patient to define the output filename, like:

rule Mutation:
    input:
        tumor = 
        control = 
    output:
        vcf = "{patient}.vcf"

I've managed to import the CSV into Pandas to handle the experiment information as a dataframe. For the tumor and control inputs, I tried writing a Python function that returns a list in which each element is the concatenation path + sample + extension, and I passed that list via rule all. This doesn't solve the problem, because I also need the patient information as an output wildcard, as in the example. Another option would be to rename the files before passing them to the rule, with a scheme like {sample}_{condition}_{patient}.bam, so that the same wildcards can be used across input and output.
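
To make the lookup concrete, here is a minimal sketch of how the dataframe could map a patient to its two BAM files. It assumes the column names from the CSV above and exactly one tumor and one control sample per patient. The helper names (bam_for, tumor_bam, control_bam) are hypothetical, and the row for sample 135 is invented only so that patient B has a control in the example.

```python
import io
from types import SimpleNamespace

import pandas as pd

# Stand-in for the real sample sheet; same columns as the CSV above.
# The last row (135) is hypothetical, added so patient B has a control.
CSV_TEXT = """Sample,condition,patient
132,tumor,A
133,control,A
134,tumor,B
135,control,B
"""

# Read everything as strings so sample IDs can be used directly in filenames.
samples = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)


def bam_for(patient, condition):
    """Return the BAM filename for one patient/condition pair."""
    hit = samples[
        (samples["patient"] == patient) & (samples["condition"] == condition)
    ]
    if len(hit) != 1:
        raise ValueError(
            f"expected one {condition} sample for patient {patient}, "
            f"found {len(hit)}"
        )
    return f"{hit['Sample'].iloc[0]}.bam"


# Snakemake input functions receive the rule's wildcards as their argument,
# so these two can be referenced from the rule's input section.
def tumor_bam(wildcards):
    return bam_for(wildcards.patient, "tumor")


def control_bam(wildcards):
    return bam_for(wildcards.patient, "control")


# One VCF per patient; this list is what rule all would request.
patients = sorted(samples["patient"].unique())
targets = [f"{p}.vcf" for p in patients]

# Quick check outside Snakemake: SimpleNamespace mimics the wildcards object.
demo = SimpleNamespace(patient="A")
print(tumor_bam(demo), control_bam(demo), targets)
```

With something like this, the rule could use tumor = tumor_bam and control = control_bam as its inputs, keep "{patient}.vcf" as the output, and rule all could ask for the targets list, so no files would need renaming.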

Is there a better solution?

Workflow Snakemake Python • 975 views