Deleted: Snakemake input/output wildcards from Pandas dataframe
2.2 years ago
Shred ★ 1.4k

Hi,

Sorry if this question has already been answered elsewhere, but I'm struggling to find a solution. I have a CSV file describing an experiment, laid out like this:

Sample,condition,patient
132,tumor,A
133,control,A
134,tumor,B
....

Each patient has two files, a tumor and a control one: a fairly common scenario. In Snakemake I have a rule that needs both files ({sample}.bam) as input and needs to use the patient to define the output filename, like:

rule Mutation:
    input:
        tumor = 
        control = 
    output:
        vcf = "{patient}.vcf"

I've managed to import the CSV into Pandas so that the experiment information is available as a dataframe. For the tumor and control inputs, I tried writing a Python function that returns a list in which each element is the concatenation of path + sample + extension, and passed it via rule all. This doesn't solve the problem, because I also need the patient information as an output wildcard, as in the example. Another option might be to rename the files before passing them to the rule, with a scheme like {sample}_{condition}_{patient}.bam, so that the same wildcards can be used across input and output.
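To make the problem concrete, here is a rough sketch of one possible direction, not a confirmed solution: look up the sample for a given patient and condition in the dataframe, and call that lookup from a Snakemake input function so that {patient} is the only wildcard the rule needs. The helper name `bam_for` and the inline CSV text are hypothetical stand-ins; the real file name and BAM directory are not shown here, so they are left out.

```python
import io

import pandas as pd

# Stand-in for the real CSV file: only the three rows shown in the post.
CSV_TEXT = """Sample,condition,patient
132,tumor,A
133,control,A
134,tumor,B
"""

# Read every column as text so sample IDs are not converted to integers.
samples = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)


def bam_for(patient, condition):
    """Return '<sample>.bam' for one patient/condition pair (hypothetical helper)."""
    match = samples[
        (samples["patient"] == patient) & (samples["condition"] == condition)
    ]
    if len(match) != 1:
        raise ValueError(
            f"expected exactly one {condition} sample for patient {patient}, "
            f"found {len(match)}"
        )
    return f"{match['Sample'].iloc[0]}.bam"


# In a Snakefile, input functions receive the wildcards object, so the rule
# could (as an assumption about how this would be wired up) look like:
#
# rule Mutation:
#     input:
#         tumor=lambda wildcards: bam_for(wildcards.patient, "tumor"),
#         control=lambda wildcards: bam_for(wildcards.patient, "control"),
#     output:
#         vcf="{patient}.vcf"

print(bam_for("A", "tumor"), bam_for("A", "control"))
```

With this layout the output wildcard {patient} drives the lookup, and the sample IDs never need to appear in the output filename. The explicit error for a missing or duplicated pair is a design choice in this sketch, so an incomplete sheet fails early instead of producing a wrong input path.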

Is there a better solution?

Workflow Snakemake Python • 975 views