How to obtain the combined callable regions BED file via the BCBio variant-calling pipeline?
3.1 years ago

I ran BCBio's variant2 pipeline on 10 sorted input BAMs (using a slightly modified gatk-variant.yaml config file) and am trying to obtain a combined callable regions file across all of the samples. So far, several BED files have been generated per sample, but at what point does the globally shared callable regions BED file get generated?

I see that there's a function called combine_sample_regions(*samples). Is this where the globally shared callable regions file gets produced? If so, at what point in the pipeline is it called, and what YAML config details do I need to add for that to happen?

yaml config file:

  - analysis: variant2
    genome_build: hg38
    # to do multi-sample variant calling, assign samples the same metadata / batch
    # metadata:
    #   batch: your-arbitrary-batch-name
    algorithm:
      aligner: false
      bam_clean: fixrg
      mark_duplicates: true
      recalibrate: false
      variantcaller: false
      nomap_split_size: 250  # the default number of unmapped base pairs
      nomap_split_targets: 200  # the default number of target intervals
      # for targeted projects, set the region
      # variant_regions: /path/to/your.bed
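
To be concrete about what I mean by a combined callable regions file: a minimal Python sketch of the result I am after, assuming "combined" means the intersection of the per-sample callable BED files. This is not BCBio's combine_sample_regions implementation; the helper names (parse_bed, intersect, shared_callable) and the toy intervals are made up for illustration.

```python
from functools import reduce
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]            # half-open [start, end), as in BED
Regions = Dict[str, List[Interval]]   # chromosome -> sorted, merged intervals


def parse_bed(lines: Iterable[str]) -> Regions:
    """Read BED lines into sorted, non-overlapping intervals per chromosome."""
    raw: Regions = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments, and header lines
        chrom, start, end = line.split("\t")[:3]
        raw.setdefault(chrom, []).append((int(start), int(end)))

    merged: Regions = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            prev_start, prev_end = out[-1]
            if start <= prev_end:  # overlapping or touching: extend previous
                out[-1] = (prev_start, max(prev_end, end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def intersect(a: Regions, b: Regions) -> Regions:
    """Keep only the bases that are callable in both inputs."""
    result: Regions = {}
    for chrom in a.keys() & b.keys():
        xs, ys = a[chrom], b[chrom]
        i = j = 0
        out: List[Interval] = []
        while i < len(xs) and j < len(ys):
            start = max(xs[i][0], ys[j][0])
            end = min(xs[i][1], ys[j][1])
            if start < end:
                out.append((start, end))
            # advance whichever interval finishes first
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
        if out:
            result[chrom] = out
    return result


def shared_callable(samples: List[Regions]) -> Regions:
    """Fold the pairwise intersection over all samples."""
    return reduce(intersect, samples)


# Toy per-sample callable regions (hypothetical values, tab-separated BED)
sample_a = parse_bed(["chr1\t0\t100", "chr1\t200\t300"])
sample_b = parse_bed(["chr1\t50\t250"])
shared = shared_callable([sample_a, sample_b])
```

With real data, each sample's callable BED file would be opened and passed to parse_bed, and the 10 resulting dictionaries handed to shared_callable.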
