I am running samtools mpileup and get an output for the requested positions. My problem is that if there are multiple alternative bases, it's not possible to extract the alternative base count for each.
The command line that I use:
samtools mpileup -u -Q 0 -t DP4 -t DV -t DP -f GRCh37.fa --positions positions.bed SAMPLE_1-ready.bam SAMPLE_2-ready.bam SAMPLE_3-ready.bam | bcftools view > SAMPLE-variant-sites.vcf
An example output:
12 25398284 . C G,T,<X> 0 . DP=612;I16=351,184,53,24,19737,1.05346e+06,2839,140417,32100,1.926e+06,4620,277200,9893,221105,1557,36041;QS=2.58612,0.40494,0.0089373,0;VDB=0.0247352;SGB=4.66664;RPB=0.97648;MQB=1;MQSB=1;BQB=0.965306;MQ0F=0 PL:DP:DV:DP4 0,255,255,255,255,255,255,255,255,255:208:3:133,72,2,1 255,0,255,255,255,255,255,255,255,255:186:74:74,38,51,23 0,255,255,255,255,255,255,255,255,255:218:0:144,74,0,0
As you may see, there are G as well as T alternative reads . And if we look at the data from SAMPLE_1, we can see that there is only one DP4 field:
133,72,2,1. So we know that three bases are either G or T in that position but we do not know how bases are G and how many are T
How would I achieve this??