How to input list into GenomicsDBImport with snakemake?
5 months ago
ema ▴ 10

Hello!

I'm currently writing a pipeline with snakemake for exome data. During joint variant calling I need to use GATK's GenomicsDBImport, but I'm unsure how to pass all the samples to it at once. Here's a simplified version of the rule I'm using:

rule GenomicsDBImport:
    input:
        gvcf = expand("variant_call/{sample}_raw_variants.g.vcf", sample=SAMPLE),
        ref = REF
    output:
        dir = "GDBI_database"
    shell:
        """
        ({GATK} GenomicsDBImport  -R {input.ref} -V {input.gvcf} --genomicsdb-workspace-path {output.dir}) 2> {log}
        """

From my understanding, the expand function gives me a list of all the sample names as strings. My question is: can the '-V' argument take a list as input? There's also the option to use a snakemake wrapper, but I'm unfamiliar with that method.

Thanks in advance!

snakemake GenomicsDBImport GATK VCF • 409 views

You'd need to take the map-file route, I think. The wrapper does a better job using this line:

gvcfs = list(map("--variant {}".format, snakemake.input.gvcfs))
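To make clear what that line does, here is a minimal plain-Python sketch that runs outside Snakemake. The sample names are made up for illustration, and `variant_flags` is a hypothetical helper name, not something the wrapper defines:

```python
def variant_flags(gvcf_paths):
    """Prefix every gVCF path with its own --variant flag."""
    return list(map("--variant {}".format, gvcf_paths))


# Hypothetical sample names, only for illustration.
samples = ["sampleA", "sampleB"]
paths = ["variant_call/{}_raw_variants.g.vcf".format(s) for s in samples]

flags = variant_flags(paths)
# Joining with spaces gives the fragment that would go on the command line.
command_fragment = " ".join(flags)
print(command_fragment)
```

Each file ends up with its own flag, which is different from writing a single -V followed by several paths.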

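You could adapt the wrapper's line to your rule and see whether this works. Snakemake treats every entry under input as a file path, so the formatted strings belong in params rather than input:

input:
        gvcf = expand("variant_call/{sample}_raw_variants.g.vcf", sample=SAMPLE),
        ref = REF
params:
        gvcfs = lambda wildcards, input: list(map("--variant {}".format, input.gvcf))

Then use {params.gvcfs} in the shell command in place of -V {input.gvcf}. The above code is just theoretical, completely untested.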
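For the map-file route mentioned above, GenomicsDBImport can read a tab-delimited file of sample name and gVCF path through its --sample-name-map option. Below is a rough, untested-in-a-pipeline sketch of writing such a file in Python. The helper name `write_sample_map` is hypothetical, and deriving the sample name by stripping the `_raw_variants.g.vcf` suffix is an assumption based on the file naming in the question:

```python
import os


def write_sample_map(gvcf_paths, map_path, suffix="_raw_variants.g.vcf"):
    """Write one 'sample<TAB>path' line per gVCF and return the sample names."""
    names = []
    with open(map_path, "w") as handle:
        for path in gvcf_paths:
            base = os.path.basename(path)
            # Assumption: the sample name is the file name minus the suffix.
            name = base[:-len(suffix)] if base.endswith(suffix) else base
            handle.write("{}\t{}\n".format(name, path))
            names.append(name)
    return names
```

The resulting file would then be passed as --sample-name-map instead of listing every gVCF with its own -V or --variant flag.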
