Hi there,
After reading the Snakemake documentation and running several tests, I'm still unable to solve my problem. I'm working on influenza. I have a table that associates each sample with its matching subtype, like this:
SAMPLE SUBTYPE
TH3pos20191217_S96 H3N2_Perth16
S1967_S46 pH1N1_California07
S1946_S32 pH1N1_California07
D1914_S14 H3N2_Perth16
Tneg20191217_S95 UNMAPPED
I'm trying to build a Snakemake rule that maps each sample against its corresponding reference. I thought of using a dictionary to associate my two wildcards, but I can't get anything functional. I think the problem lies in how I define my wildcards. Do you have any suggestions?
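To make the dictionary idea concrete, this is roughly the lookup I have in mind, sketched in plain Python outside Snakemake. The inline `TABLE_TEXT` only stands in for my real CSV (which is `;`-separated), and the names `load_sample_to_subtype` and `sample_to_subtype` are illustrative, not part of my script.

```python
import csv
import io

# Inline stand-in for Subtyping_result.csv (assumption: ';'-separated,
# with SAMPLE and SUBTYPE columns, as in the table above).
TABLE_TEXT = """SAMPLE;SUBTYPE
TH3pos20191217_S96;H3N2_Perth16
S1967_S46;pH1N1_California07
S1946_S32;pH1N1_California07
D1914_S14;H3N2_Perth16
Tneg20191217_S95;UNMAPPED
"""


def load_sample_to_subtype(handle):
    """Map each sample to its subtype, skipping samples that did not map."""
    reader = csv.DictReader(handle, delimiter=";")
    return {
        row["SAMPLE"]: row["SUBTYPE"]
        for row in reader
        if row["SUBTYPE"] != "UNMAPPED"
    }


sample_to_subtype = load_sample_to_subtype(io.StringIO(TABLE_TEXT))
print(sample_to_subtype["S1967_S46"])
```

The point is that every sample has exactly one subtype, so the subtype should be looked up from the sample rather than chosen independently.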
Here is my current script:
import os
import pandas as pd

configfile: "config.yaml"

# Get information from config file
result_repository = config['Result_Repository']

# Select sample and assign subtype
sum_premapping = result_repository + "REPORT/SUBTYPING/Subtyping_result.csv"
table = pd.read_csv(sum_premapping, sep=";")
table = table.loc[table['SUBTYPE'] != "UNMAPPED"]
sample_list = list(table['SAMPLE'])
subtype_list = list(table['SUBTYPE'])
list_samplesub = {}
for i in range(0, len(sample_list)):
    list_samplesub[sample_list[i]] = subtype_list[i]
(SAMPLE) = sample_list
subtype = table.loc[table['SAMPLE'] == sample].value[0]

rule all:
    input:
        test = expand(result_repository + "MY_BAMs/{subtype}/{sample}.bam", subtype=list_samplesub[wildcards.sample], sample=SAMPLE)

rule test:
    input:
        viral_R1_gz = result_repository + 'DEHOSTING/{sample}_viral_R1.fastq.gz',
        viral_R2_gz = result_repository + 'DEHOSTING/{sample}_viral_R2.fastq.gz',
        FLU_subtype = "references/influenza/{subtype}.fasta"
    output:
        subtype_bam = result_repository + "MY_BAMs/{subtype}/{sample}.bam"
    shell:
        "minimap2 -ax sr {input.FLU_subtype} {input.viral_R1_gz} {input.viral_R2_gz} | samtools view -bS > {output.subtype_bam}"
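For clarity, these are the targets I would like `rule all` to request: one BAM per sample, placed under that sample's own subtype, rather than every sample crossed with every subtype. Below is a plain-Python sketch of that pairing. `RESULT_DIR` is a placeholder for `config['Result_Repository']`, the dictionary is hard-coded from the example table, and `bam_targets` is a hypothetical helper name.

```python
# Hypothetical stand-ins for values that come from my config and CSV.
RESULT_DIR = "results/"  # placeholder for config['Result_Repository']
list_samplesub = {
    "TH3pos20191217_S96": "H3N2_Perth16",
    "S1967_S46": "pH1N1_California07",
    "S1946_S32": "pH1N1_California07",
    "D1914_S14": "H3N2_Perth16",
}


def bam_targets(result_dir, sample_to_subtype):
    """Return one BAM path per sample, placed under that sample's subtype."""
    return [
        f"{result_dir}MY_BAMs/{subtype}/{sample}.bam"
        for sample, subtype in sample_to_subtype.items()
    ]


targets = bam_targets(RESULT_DIR, list_samplesub)
for path in targets:
    print(path)
```

As far as I understand from the documentation, `expand` combines its lists as a full product by default and accepts `zip` as a second argument to pair them element-wise instead. Whether that is the right tool here is an assumption on my part.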
Thank you all and stay safe,
Hadrien