Question: samtools mpileup - how to extract the alternative allele counts for *each* alternative base
gravatar for Michi
4.7 years ago by
Michi970 wrote:

Hello there

I am running samtools mpileup and get an output for the requested positions. My problem is that if there are multiple alternative bases, it's not possible to extract the alternative base count for each.

The command line that I use:

samtools mpileup -u -Q 0 -t DP4 -t DV -t DP -f GRCh37.fa  --positions positions.bed SAMPLE_1-ready.bam SAMPLE_2-ready.bam  SAMPLE_3-ready.bam | bcftools view  > SAMPLE-variant-sites.vcf

An example output:

12  25398284    .   C   G,T,<X> 0   .   DP=612;I16=351,184,53,24,19737,1.05346e+06,2839,140417,32100,1.926e+06,4620,277200,9893,221105,1557,36041;QS=2.58612,0.40494,0.0089373,0;VDB=0.0247352;SGB=4.66664;RPB=0.97648;MQB=1;MQSB=1;BQB=0.965306;MQ0F=0 PL:DP:DV:DP4    0,255,255,255,255,255,255,255,255,255:208:3:133,72,2,1  255,0,255,255,255,255,255,255,255,255:186:74:74,38,51,23    0,255,255,255,255,255,255,255,255,255:218:0:144,74,0,0

As you may see, there are G as well as T alternative reads . And if we look at the data from SAMPLE_1, we can see that there is only one DP4 field: 133,72,2,1. So we know that three bases are either G or T in that position but we do not know how bases are G and how many are T

How would I achieve this??

mpileup mutation alt samtools • 6.6k views
ADD COMMENTlink modified 8 months ago by Ks-seq10 • written 4.7 years ago by Michi970
gravatar for Shane McCarthy
4.7 years ago by
Cambridge, Cambridgeshire
Shane McCarthy340 wrote:

You'll need to update to the latest samtools/bcftools (1.3) and then there are new mpileup tag (-t) options -- AD, ADF, ADR which will give per-allele depth (AD) per-allele depth on the forward (ADF) and reverse (ADR) strands.

ADD COMMENTlink written 4.7 years ago by Shane McCarthy340

Great! You saved my day. I was still using v1.2 and could not figure out how to get this info.

ADD REPLYlink written 4.6 years ago by Michi970

By the way - did you ever notice problems for calling the correct AD for deletions? I experience many cases where the AD is equal to DP meanwhile in truth the deletion is present in only a fraction of the reads. It puzzles me as insertions seem to be called correctly and other deletions as well.

ADD REPLYlink written 4.6 years ago by Michi970
gravatar for Ks-seq
8 months ago by
Ks-seq10 wrote:

If I can ask: How get Allele fraction= reads supporting a variant (AD) / total reads (DP) ? and possibly in percentage ?

ADD COMMENTlink written 8 months ago by Ks-seq10
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1526 users visited in the last hour