Question: Is indel realigning necessary for INDEL discovery?
deepti1rao30 wrote:

I understand that GATK's IndelRealigner tool helps in calling SNPs correctly. But does one need to use it when calling only indels?

indelrealigner gatk indel • 2.3k views
modified 2.8 years ago by Jorge Amigo12k • written 2.8 years ago by deepti1rao30
Santiago de Compostela, Spain
Jorge Amigo12k wrote:

GATK's HaplotypeCaller is capable of detecting both SNVs and InDels, using a method that performs local de novo assembly (a kind of local realignment) to call variants, although it doesn't output a realigned BAM by default. So, in summary, there's no need to use IndelRealigner if you are going to call variants with HaplotypeCaller.

That being said, note that GATK4 has removed IndelRealigner, as it is no longer needed if you are going to use GATK's own pipeline. As Devon says, using IndelRealigner does still make sense if you want to use any other variant caller (including GATK's UnifiedGenotyper) for whatever reason (GATK4's HaplotypeCaller definitely produces higher-confidence calls than samtools+bcftools).
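
The decision above can be written down as a small helper. This is only a sketch: the file names (`ref.fa`, `sample.bam`, `targets.intervals`, `GenomeAnalysisTK.jar` and the output names) are placeholders, the function name `build_commands` is made up for illustration, and the commands are returned as argument lists rather than executed, so you can inspect them before passing them to `subprocess.run`.

```python
def build_commands(caller, ref="ref.fa", bam="sample.bam"):
    """Return the command lines (as argument lists) to run for a given caller.

    All file names are hypothetical placeholders.
    """
    if caller == "haplotypecaller":
        # GATK4 HaplotypeCaller reassembles active regions itself,
        # so no separate realignment step is added.
        return [["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
                 "-O", "calls.vcf.gz"]]

    if caller not in ("unifiedgenotyper", "bcftools"):
        raise ValueError(f"unknown caller: {caller}")

    # Any other caller: realign around indels first, using GATK3
    # (IndelRealigner is not part of GATK4).
    gatk3 = ["java", "-jar", "GenomeAnalysisTK.jar"]
    realigned = "sample.realigned.bam"
    commands = [
        gatk3 + ["-T", "RealignerTargetCreator", "-R", ref, "-I", bam,
                 "-o", "targets.intervals"],
        gatk3 + ["-T", "IndelRealigner", "-R", ref, "-I", bam,
                 "-targetIntervals", "targets.intervals", "-o", realigned],
    ]
    if caller == "unifiedgenotyper":
        commands.append(gatk3 + ["-T", "UnifiedGenotyper", "-R", ref,
                                 "-I", realigned, "-glm", "BOTH",
                                 "-o", "calls.vcf"])
    else:
        # bcftools needs a shell pipeline: mpileup piped into call.
        commands.append(["sh", "-c",
                         f"bcftools mpileup -f {ref} {realigned} | "
                         "bcftools call -mv -Oz -o calls.vcf.gz"])
    return commands
```

For example, `build_commands("haplotypecaller")` yields a single command, while `build_commands("bcftools")` yields the two realignment steps followed by the calling pipeline.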

modified 2.8 years ago • written 2.8 years ago by Jorge Amigo12k

Hello,

GATK's HaplotypeCaller is both capable of detecting SNVs and InDels using a method that performs local realignments to call variants (although it doesn't output any realigned bam)

I often read that people say the HaplotypeCaller is doing local realignment. But that's not true. It's doing local de-novo assembly.

From the manual:

The HaplotypeCaller is capable of calling SNPs and indels simultaneously via local de-novo assembly of haplotypes in an active region. In other words, whenever the program encounters a region showing signs of variation, it discards the existing mapping information and completely reassembles the reads in that region.

fin swimmer

written 2.8 years ago by finswimmer14k

I question whether it's really de novo. I presume they're putting the reference sequence into their de Bruijn graph too (at least that's what I've done when implementing this sort of thing).

written 2.8 years ago by Devon Ryan98k

I must agree with you both. I've updated my answer to be more precise about what GATK states and how I personally have always thought of it. Thank you for the clarification.

written 2.8 years ago by Jorge Amigo12k

It's not 'true' de novo assembly, as the reads being assembled have already been mapped to a particular region of the genome. I've done a similar thing for SV calling, and I did it without incorporating the reference sequence into the de Bruijn graph. Whether you do or not, your assembly is already biased towards the reference allele due to the mapping step.
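
To make the "with or without the reference" distinction concrete, here is a toy de Bruijn graph in Python. It is a minimal sketch under invented inputs (a 13 bp reference and three reads carrying a 2 bp deletion, k = 4), and it is not how GATK implements its assembler; the function names are made up for illustration.

```python
from collections import defaultdict


def build_graph(sequences, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_haplotypes(graph, source, sink, max_len):
    """Spell out every path from source to sink, up to max_len bases."""
    haplotypes = []
    stack = [(source, source)]
    while stack:
        node, seq = stack.pop()
        if node == sink:
            haplotypes.append(seq)
            continue
        if len(seq) >= max_len:
            continue  # guard against cycles in the graph
        for nxt in graph.get(node, ()):
            stack.append((nxt, seq + nxt[-1]))
    return sorted(haplotypes)


K = 4
reference = "GATTACAGGCTCA"
# Reads from a sample that is homozygous for a 2 bp deletion ("AG").
reads = ["GATTACGC", "TTACGCTC", "ACGCTCA"]
source, sink = reference[:K - 1], reference[-(K - 1):]

reads_only = enumerate_haplotypes(build_graph(reads, K), source, sink, 30)
with_ref = enumerate_haplotypes(build_graph(reads + [reference], K),
                                source, sink, 30)
print(reads_only)  # ['GATTACGCTCA']
print(with_ref)    # ['GATTACAGGCTCA', 'GATTACGCTCA']
```

With the reference threaded through the graph, the deletion shows up as a bubble next to the reference path; with reads only, the graph contains just the alternate haplotype. In both cases the reads were selected by their mapping to this region, which is the bias mentioned above.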

written 2.8 years ago by d-cameron2.3k
Freiburg, Germany
Devon Ryan98k wrote:

You don't even need to use it when finding SNPs, so there's no reason to use it for finding InDels. It's mostly still around for those using UnifiedGenotyper rather than HaplotypeCaller.

written 2.8 years ago by Devon Ryan98k

I am using samtools mpileup and bcftools to call variants. In this context, I want to know whether I should use IndelRealigner. Alternatively, do you suggest switching to GATK for variant calling, using HaplotypeCaller?

written 2.8 years ago by deepti1rao30