Question: Is samtools BAQ redundant after GATK's IndelRealigner?
James Reeve wrote:

I'm planning to generate an mpileup file for SNP calling. When I look for examples online, I notice that many turn off samtools' base alignment quality (BAQ) setting by specifying -B:


samtools mpileup -B -f ref.fa data.bam > out.mpileup

After doing a bit of research, it seems BAQ is designed to adjust base quality scores to account for misalignment around indels (original paper). However, I have already used GATK 3's IndelRealigner to account for indels.

Are samtools BAQ and GATK's IndelRealigner analogous commands for dealing with indels? Is it safe to turn off BAQ after IndelRealignment?

written 3 months ago by James Reeve

The variant calling workflow recommended on the SAMtools webpage combines GATK's IndelRealigner with bcftools mpileup (with BAQ enabled). I don't know whether this pipeline is up to date, though. I'm curious whether mixing GATK and SAMtools is still the best option if you want to perform the actual variant calling with SAMtools.

As for turning off BAQ (-B option), I have read that it is recommended if you want to perform somatic variant calling using VarScan2 (see for example the Genomic Data Commons user's guide). However, I am not aware of people turning BAQ off for other types of analyses.

modified 3 months ago • written 3 months ago by Álvaro Andrades

Do you have any idea why turning off BAQ (mpileup -B) is recommended for VarScan2? I looked at your source and couldn't find an explanation.

written 3 months ago by James Reeve

I believe this thread is one of the first in which they concluded that it was best to use mpileup -B for VarScan2. Apparently, not using -B makes VarScan miss true variants, but be aware that if you use mpileup -B you may get more false positives. I guess this explanation might be applicable to other variant calling pipelines that rely on samtools, such as yours, but I'm not 100% sure. Also note that the thread does not mention whether the user performed indel realignment with GATK before using samtools.
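One way to see how much this trade-off matters for your own data is to generate the pileup both ways and compare them. Below is a minimal sketch, not a validated pipeline. The file names ref.fa and data.bam are taken from the question, the helper name mpileup_cmd and the output names baq_on.mpileup / baq_off.mpileup are made up for illustration, and it assumes a samtools version whose mpileup still applies BAQ by default and that the paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical inputs; override via the environment if yours differ.
REF=${REF:-ref.fa}
BAM=${BAM:-data.bam}

# Print the mpileup command line for a given BAQ mode.
#   on  : default behaviour, BAQ is applied
#   off : -B disables BAQ
mpileup_cmd() {
    case "$1" in
        on)  echo "samtools mpileup -f $REF $BAM" ;;
        off) echo "samtools mpileup -B -f $REF $BAM" ;;
        *)   echo "unknown BAQ mode: $1" >&2; return 1 ;;
    esac
}

# Only run when samtools and both input files are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$BAM" ]; then
    for mode in on off; do
        # Unquoted on purpose: the printed command is split into words and run.
        $(mpileup_cmd "$mode") > "baq_$mode.mpileup"
    done

    # Count positions reported in both pileups whose depth (column 4) differs.
    # BAQ lowers base qualities near suspected misalignments, so some bases
    # can fall below mpileup's minimum base quality and drop out of the pileup.
    awk 'NR == FNR { d[$1 ":" $2] = $4; next }
         ($1 ":" $2) in d && d[$1 ":" $2] != $4 { n++ }
         END { print n + 0, "positions differ in depth" }' \
        baq_on.mpileup baq_off.mpileup
fi
```

If the count is close to zero after indel realignment, BAQ is probably changing little for this data set. If it is large, it may be worth running the variant caller on both pileups and comparing the calls. Note that the awk step holds one pileup in memory, so for a large genome you would want to restrict it to a region first.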

This manuscript from VarScan2's creator also recommends using mpileup -B as "best practice", but it doesn't give an explanation for this recommendation.

written 3 months ago by Álvaro Andrades

You would likely need the developers of both SAMtools and GATK to be here to answer this adequately. Generally, though, I would not mix and match these pipelines where it can be avoided. Why are you using a mixture of GATK and SAMtools? I have not heard of people disabling BAQ, either...

written 3 months ago by Kevin Blighe

I'm mixing GATK's indel realignment and SAMtools because Schlötterer et al. 2014 (Box 1) advises using IndelRealigner in a pool-seq pipeline. They also disable BAQ with the mpileup -B option in the tutorial for their variant caller / analysis toolkit, PoPoolation2.

written 3 months ago by James Reeve