Entering edit mode
                    4.5 years ago
        kamanovae
        
    
        ▴
    
    100
    I am using snakelike to describe the GATK pipeline. Right now my code looks like this:
SAMPLES = glob_wildcards("input_bam/{sample}.bam")
rule all:
    input: expand("out/{sample}.AddOrReplaceReadGroups.bam", sample = SAMPLES.sample), expand("out/{sample}.MarkDuplicates.txt", sample = SAMPLES.sample), expand("out/{sample}.MarkDuplicates.bam", sample = SAMPLES.sample), expand("out/{sample}.BaseRecalibrator.txt", sample = SAMPLES.sample)
rule gatk_AddOrReplaceReadGroups:
    input: "input_bam/{sample}.bam"
    output: "out/{sample}.AddOrReplaceReadGroups.bam"
    shell: "./gatk-4.1.9.0/gatk --java-options ""-Xmx30g""  AddOrReplaceReadGroups -I {input}  -O {output} -ID group1 -SM NORMAL -PL illumina -LB lib1 -PU unit1"
rule gatk_MarkDuplicates:
    input: rules.gatk_AddOrReplaceReadGroups.output
    output: output1="out/{sample}.MarkDuplicates.bam", output2="out/{sample}.MarkDuplicates.txt"
    shell: "./gatk-4.1.9.0/gatk --java-options  ""-Xmx4g""  MarkDuplicates  -I {input}  -O {output.output1} -M {output.output2} --CREATE_INDEX true"
rule gatk_BaseRecalibrator:
    input:  input1="intervals/0000-scattered.bed", input2=rules.gatk_MarkDuplicates.output.output1, input3='ref/ref.fa', input4='dbsnp/dbsnp_150.hg38.vcf.gz'
    output: output1="out/{sample}.BaseRecalibrator.txt"
    shell: "./gatk --java-options ""-Xmx7680m""  BaseRecalibrator -L {input1} -I {input2} -O {output.output1} -R {input3} --known-sites {input4}"
instead of one file in the gatk_BaseRecalibrator.output1(intervals/0000-scattered.bed) at the entrance, I want to use all the files that I mess up in the intervals folder. How can I do this correctly?