How to get individual strand counts from Mutect2 for multi-allelic sites
2.2 years ago
nickeener ▴ 50

Hi all, I'm trying to pull strand count information from a VCF file made using GATK4's Mutect2. I used the following command to create this VCF:

gatk Mutect2 -I SRR8525881.bam -mbq 20 -R ../../genome/hxb2.fa --mitochondria-mode True -O SRR8525881.vcf

The VCF output for multi-allelic sites looks like this. The fields of interest are the read depths (AD) and the strand counts (SB); the strand counts are in this order: ref forward, ref reverse, alt forward, alt reverse.

K03455.1 2042 . AG GA,GG . . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;MMQ=60,60,60;MPOS=56,45;OCM=0;POPAF=2.40,2.40;TLOD=35.34,20.29 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2:16,10,9:0.289,0.258:35:12,8,2:4,1,5:10,6,12,7

The alternate strand counts appear to be combined across both alternate alleles: the alt counts in SB sum to 12 + 7 = 19, which equals the sum of the two alt read depths in AD, 10 + 9 = 19. Is there any way I can get strand counts for each alternate allele individually?
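To make the arithmetic above easy to check on other sites, here is a minimal Python sketch that parses the FORMAT and sample columns from the record shown and compares the SB totals against AD. The helper name `parse_sample` is made up for illustration, and the sketch assumes a single-sample VCF like this one; it only confirms the pooling and does not recover per-allele strand counts.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with one sample's values.

    Comma-separated integer lists become lists of ints; anything else
    (e.g. GT "0/1/2" or the AF fractions) is kept as a list of strings.
    """
    parsed = {}
    for key, value in zip(format_field.split(":"), sample_field.split(":")):
        parts = value.split(",")
        try:
            parsed[key] = [int(p) for p in parts]
        except ValueError:
            parsed[key] = parts
    return parsed


# FORMAT and sample columns copied from the multi-allelic record above.
fmt = "GT:AD:AF:DP:F1R2:F2R1:SB"
sample = "0/1/2:16,10,9:0.289,0.258:35:12,8,2:4,1,5:10,6,12,7"

fields = parse_sample(fmt, sample)

# SB order as described in the post: ref fwd, ref rev, alt fwd, alt rev.
ref_fwd, ref_rev, alt_fwd, alt_rev = fields["SB"]
alt_depths = fields["AD"][1:]  # one depth per alternate allele

print("ref: SB total", ref_fwd + ref_rev, "vs AD", fields["AD"][0])
print("alt: SB total", alt_fwd + alt_rev, "vs summed AD", sum(alt_depths))
```

If the two alt totals match, as they do here, SB is pooling every alternate allele into one forward/reverse pair.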

I've tried splitting these multi-allelic sites with GATK's VariantsToTable (using the --splitMultiAllelic parameter) and with vcflib's vcfmulti program, but I get the following output for the same site:

K03455.1 2042 . AG GA 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;MMQ=60,60,60;MPOS=56;OCM=0;POPAF=2.40;TLOD=35.34 GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.289:35:12,8,2:4,1,5:10,6,12,7

K03455.1 2042 . AG GG 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;MMQ=60,60,60;MPOS=45;OCM=0;POPAF=2.40;TLOD=20.29 GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.258:35:12,8,2:4,1,5:10,6,12,7

As you can see, AF is split per allele, but the SB field still holds the same combined total (10,6,12,7) in both records.
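The same comparison can be run programmatically over the split records. The sketch below is only an illustration: it assumes whitespace-separated columns with FORMAT and a single sample as the last two fields, and the helper name `sample_fields` is hypothetical. It shows which per-sample fields differ between the two split lines and which are carried over unchanged.

```python
# The two split records, copied verbatim from the output above.
split_records = [
    "K03455.1 2042 . AG GA 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;"
    "MMQ=60,60,60;MPOS=56;OCM=0;POPAF=2.40;TLOD=35.34 "
    "GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.289:35:12,8,2:4,1,5:10,6,12,7",
    "K03455.1 2042 . AG GG 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;"
    "MMQ=60,60,60;MPOS=45;OCM=0;POPAF=2.40;TLOD=20.29 "
    "GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.258:35:12,8,2:4,1,5:10,6,12,7",
]


def sample_fields(line):
    """Return (ALT allele, {FORMAT key: raw value}) for a single-sample record."""
    cols = line.split()
    keys = cols[-2].split(":")    # FORMAT column
    values = cols[-1].split(":")  # the one sample column
    return cols[4], dict(zip(keys, values))


by_alt = dict(sample_fields(record) for record in split_records)

# Report, per FORMAT key, whether the value changed between the split records.
for key in by_alt["GA"]:
    ga_value, gg_value = by_alt["GA"][key], by_alt["GG"][key]
    status = "same" if ga_value == gg_value else "differs"
    print(f"{key}: GA={ga_value} GG={gg_value} ({status})")
```

On these two records only AF differs; AD, F1R2, F2R1 and SB are copied unchanged into both lines.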

GATK4 Mutect2 VCF • 864 views
