samtools mpileup - how to extract the alternative allele counts for *each* alternative base
2
4
Entering edit mode
6.4 years ago
Michi ▴ 980

Hello there

I am running samtools mpileup and get an output for the requested positions. My problem is that if there are multiple alternative bases, it's not possible to extract the alternative base count for each.

The command line that I use:

samtools mpileup -u -Q 0 -t DP4 -t DV -t DP -f GRCh37.fa  --positions positions.bed SAMPLE_1-ready.bam SAMPLE_2-ready.bam  SAMPLE_3-ready.bam | bcftools view  > SAMPLE-variant-sites.vcf


An example output:

12  25398284    .   C   G,T,<X> 0   .   DP=612;I16=351,184,53,24,19737,1.05346e+06,2839,140417,32100,1.926e+06,4620,277200,9893,221105,1557,36041;QS=2.58612,0.40494,0.0089373,0;VDB=0.0247352;SGB=4.66664;RPB=0.97648;MQB=1;MQSB=1;BQB=0.965306;MQ0F=0 PL:DP:DV:DP4    0,255,255,255,255,255,255,255,255,255:208:3:133,72,2,1  255,0,255,255,255,255,255,255,255,255:186:74:74,38,51,23    0,255,255,255,255,255,255,255,255,255:218:0:144,74,0,0


As you may see, there are G as well as T alternative reads . And if we look at the data from SAMPLE_1, we can see that there is only one DP4 field: 133,72,2,1. So we know that three bases are either G or T in that position but we do not know how bases are G and how many are T

How would I achieve this??

samtools mpileup alt mutation • 9.1k views
5
Entering edit mode
6.4 years ago

You'll need to update to the latest samtools/bcftools (1.3) and then there are new mpileup tag (-t) options -- AD, ADF, ADR which will give per-allele depth (AD) per-allele depth on the forward (ADF) and reverse (ADR) strands.

0
Entering edit mode

Great! You saved my day. I was still using v1.2 and could not figure out how to get this info.

0
Entering edit mode

By the way - did you ever notice problems for calling the correct AD for deletions? I experience many cases where the AD is equal to DP meanwhile in truth the deletion is present in only a fraction of the reads. It puzzles me as insertions seem to be called correctly and other deletions as well.

0
Entering edit mode
2.5 years ago
Ks-seq ▴ 10

If I can ask: How get Allele fraction= reads supporting a variant (AD) / total reads (DP) ? and possibly in percentage ?