Question: Snakemake: associating two wildcards for a specific mapping
8 weeks ago by
regue.hadrien30 wrote:

Hi there,

After reading the Snakemake documentation and running several tests, I still can't solve my problem. I'm working on influenza. I have an association table between samples and their matching subtypes, like this:

SAMPLE  SUBTYPE
TH3pos20191217_S96  H3N2_Perth16
S1967_S46   pH1N1_California07
S1946_S32   pH1N1_California07
D1914_S14   H3N2_Perth16
Tneg20191217_S95    UNMAPPED

I'm trying to write a Snakemake rule that maps each sample against its corresponding reference. I thought of using a dictionary to associate my two wildcards, but I can't get anything that works. I think the problem lies in how I define my wildcards. Do you have any suggestions?

Here is my current script:

import os
import pandas as pd

configfile:"config.yaml"

#Get information from config file
result_repository=config['Result_Repository']

#Select sample and assign subtype
sum_premapping = result_repository + "REPORT/SUBTYPING/Subtyping_result.csv"
table=pd.read_csv(sum_premapping,sep=";") 
table = table.loc[table['SUBTYPE'] != "UNMAPPED"]

sample_list=list(table['SAMPLE'])
subtype_list=list(table['SUBTYPE'])

list_samplesub={}
for i in range(0,len(sample_list)):
    list_samplesub[sample_list[i]] = subtype_list[i]

(SAMPLE)=sample_list

subtype=table.loc[table['SAMPLE'] == sample].value[0]

rule all:
    input:
        test= expand(result_repository + "MY_BAMs/{subtype}/{sample}.bam",subtype=list_samplesub[wildcards.sample],sample=SAMPLE)

rule test:
    input:
        viral_R1_gz = result_repository + 'DEHOSTING/{sample}_viral_R1.fastq.gz',
        viral_R2_gz = result_repository + 'DEHOSTING/{sample}_viral_R2.fastq.gz',
        FLU_subtype = "references/influenza/{subtype}.fasta"

    output:
        subtype_bam = result_repository + "MY_BAMs/{subtype}/{sample}.bam"
    shell:
        "minimap2 -ax  sr {input.FLU_subtype} {input.viral_R1_gz} {input.viral_R2_gz} | samtools view -bS > {output.subtype_bam}"

Thank you all and stay safe,

Hadrien

7 weeks ago by
regue.hadrien30 wrote:

After a few days of trying and searching, I found a solution:

# Output BAMs after subtype-specific reference mapping:
# one target path per (sample, subtype) row of the table
subtype_bam = []
for sample, subtype in zip(sample_list, subtype_list):
    subtype_bam.append(result_repository + "FULL_SUBTYPED_BAM/" + subtype + "/" + sample + ".bam")
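
The loop above works because it pairs each sample with its own subtype, row by row, instead of combining every sample with every subtype. A minimal sketch in plain Python of the difference, using the sample and subtype names from the table in the question (the `pattern` string and the variable names `paired` and `crossed` are illustrative only, and the result directory prefix is left out):

```python
from itertools import product

# Sample/subtype pairs from the association table (UNMAPPED row already dropped)
sample_list = ["TH3pos20191217_S96", "S1967_S46", "S1946_S32", "D1914_S14"]
subtype_list = ["H3N2_Perth16", "pH1N1_California07", "pH1N1_California07", "H3N2_Perth16"]

# Illustrative pattern; the real paths are prefixed with result_repository
pattern = "FULL_SUBTYPED_BAM/{subtype}/{sample}.bam"

# Pairwise: one target per row of the table
paired = [pattern.format(subtype=st, sample=s)
          for s, st in zip(sample_list, subtype_list)]

# Cartesian product: every sample against every distinct subtype,
# which also requests BAMs for sample/reference pairs that are not in the table
crossed = [pattern.format(subtype=st, sample=s)
           for st, s in product(sorted(set(subtype_list)), sample_list)]

print(len(paired))   # 4
print(len(crossed))  # 8
```

Inside a Snakefile, the pairwise form can also be written with Snakemake's `expand` by passing `zip` as the combinator, for example `expand(pattern, zip, sample=sample_list, subtype=subtype_list)`; without `zip`, `expand` combines all values of each wildcard.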

then:

rule all:
    input:
        subtype_bam

rule complete_subtype_mapping:
    input:
        viral_R1_gz = result_repository + 'DEHOSTING/{sample}_viral_R1.fastq.gz',
        viral_R2_gz = result_repository + 'DEHOSTING/{sample}_viral_R2.fastq.gz',
        FLU_subtype = "references/influenza/{subtype}.fasta"
    conda:
        "envs/minimap2.yaml"
    output:
        subtype_bam = result_repository + "FULL_SUBTYPED_BAM/{subtype}/{sample}.bam"
    shell:
        "minimap2 -ax  sr {input.FLU_subtype} {input.viral_R1_gz} {input.viral_R2_gz} | samtools sort  > {output.subtype_bam}"

I can't find the Biostars link that guided me to this code; I'll edit later.

EDIT: the solution was on StackOverflow.
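
A possible alternative, closer to the dictionary idea from the question, is to look the reference up from the sample name with an input function. This is only a sketch under assumptions: `reference_for` is a hypothetical helper name, the dictionary is filled by hand here rather than from the CSV, and `SimpleNamespace` merely stands in for the wildcards object that Snakemake passes to input functions.

```python
from types import SimpleNamespace

# Sample -> subtype dictionary, like list_samplesub in the question
list_samplesub = {
    "TH3pos20191217_S96": "H3N2_Perth16",
    "S1967_S46": "pH1N1_California07",
    "S1946_S32": "pH1N1_California07",
    "D1914_S14": "H3N2_Perth16",
}

def reference_for(wildcards):
    """Return the reference FASTA path for the sample in wildcards.sample."""
    subtype = list_samplesub[wildcards.sample]
    return f"references/influenza/{subtype}.fasta"

# Stand-in for the wildcards object Snakemake would supply
demo = SimpleNamespace(sample="S1967_S46")
print(reference_for(demo))  # references/influenza/pH1N1_California07.fasta
```

In a rule, the function would be passed without calling it, as in `FLU_subtype = reference_for`. In that variant only `{sample}` has to appear in the output path, since the subtype is derived from the sample.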
