Question: Is indel realigning necessary for INDEL discovery?
deepti1rao20 wrote:

I understand that GATK's IndelRealigner tool helps in finding the right SNPs. But does one need to use it when looking only for indels?

indelrealigner gatk indel • 1.0k views
ADD COMMENTlink modified 17 months ago by Jorge Amigo11k • written 17 months ago by deepti1rao20
Jorge Amigo11k wrote:

GATK's HaplotypeCaller is capable of detecting both SNVs and InDels using a method that performs local de novo assembly (kind of a local realignment) to call variants, although it doesn't output a realigned BAM by default. So, in summary, there's no need to use IndelRealigner if you are going to call variants with HaplotypeCaller.

That being said, note that GATK4 has removed IndelRealigner altogether, as it is no longer needed if you are going to use GATK's own pipeline. As Devon says, using IndelRealigner does still make sense if you want to use any other variant caller (including GATK's UnifiedGenotyper) for whatever reason. In my experience, GATK4's HaplotypeCaller produces higher-confidence calls than samtools+bcftools.

ADD COMMENTlink modified 17 months ago • written 17 months ago by Jorge Amigo11k

Hello,

GATK's HaplotypeCaller is both capable of detecting SNVs and InDels using a method that performs local realignments to call variants (although it doesn't output any realigned bam)

I often read that HaplotypeCaller does local realignment, but that's not true. It does local de novo assembly.

From the manual:

The HaplotypeCaller is capable of calling SNPs and indels simultaneously via local de-novo assembly of haplotypes in an active region. In other words, whenever the program encounters a region showing signs of variation, it discards the existing mapping information and completely reassembles the reads in that region.

fin swimmer

ADD REPLYlink written 17 months ago by finswimmer12k

I question whether it's really de novo. I presume they're putting the reference sequence into their de Bruijn graph too (at least that's what I've done when implementing this sort of thing).

ADD REPLYlink written 17 months ago by Devon Ryan91k
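To make the point about putting the reference sequence into the de Bruijn graph concrete, here is a minimal toy sketch in Python. It is not GATK's implementation: the function names, the k-mer size, and the example sequences are all made up for illustration. It builds a graph from the reads in one region, optionally threads the reference through it, and lists the candidate haplotypes between the first and last reference k-mer.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in any sequence."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_haplotypes(graph, source, sink, max_len=200):
    """List every path from source to sink, spelled out as a sequence.

    max_len bounds the walk so that cycles in the graph cannot loop forever.
    """
    haplotypes = []
    stack = [(source, source)]
    while stack:
        node, seq = stack.pop()
        if node == sink:
            haplotypes.append(seq)
            continue
        if len(seq) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            stack.append((nxt, seq + nxt[-1]))
    return haplotypes


def assemble_region(reads, reference, k, include_reference=True):
    """Assemble one region; optionally add the reference to the graph (hypothetical helper)."""
    sequences = list(reads)
    if include_reference:
        sequences.append(reference)
    graph = build_de_bruijn(sequences, k)
    return enumerate_haplotypes(graph, reference[:k - 1], reference[-(k - 1):])


# Toy data (invented): the reads carry a 1-bp deletion relative to the reference.
reference = "ATGCGTACCTGA"
alt = "ATGCGTCCTGA"
reads = ["ATGCGTCC", "GCGTCCTGA"]

haps_with_ref = assemble_region(reads, reference, k=4, include_reference=True)
haps_reads_only = assemble_region(reads, reference, k=4, include_reference=False)
print(sorted(haps_with_ref))
print(haps_reads_only)
```

With the reference included, both the reference and the deletion haplotype come out as paths. With reads only, just the deletion haplotype is found, and only because these toy reads happen to cover the anchoring k-mers at both ends of the region.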

I must agree with you both. I've updated my answer to be more precise about what GATK states and how I have personally always thought of it. Thank you for the clarification.

ADD REPLYlink written 17 months ago by Jorge Amigo11k

It's not 'true' de novo assembly, as the reads being assembled have already been mapped to a particular region of the genome. I've done a similar thing for SV calling, and I did it without incorporating the reference sequence into the de Bruijn graph. Whether you do or not, your assembly is already biased towards the reference allele due to the mapping step.

ADD REPLYlink written 17 months ago by d-cameron2.1k
Devon Ryan91k wrote:

You don't even need to use it when finding SNPs, so there's no reason to use it for finding InDels. It's mostly still around for those using UnifiedGenotyper rather than HaplotypeCaller.

ADD COMMENTlink written 17 months ago by Devon Ryan91k

I am using samtools mpileup and bcftools to call variants. In this context, I want to know whether I should use IndelRealigner. Alternatively, do you suggest switching to GATK for variant calling, using HaplotypeCaller?

ADD REPLYlink written 17 months ago by deepti1rao20
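For reference, the two routes discussed in this thread could look roughly like the sketch below. It only builds and prints the command lines, it does not run anything. Assumptions: all file names are placeholders, the jar name `GenomeAnalysisTK.jar` stands in for wherever a GATK 3.x jar lives (IndelRealigner is not in GATK4), and `bcftools mpileup` is used in place of `samtools mpileup`, since recent releases moved pileup-based calling into bcftools. Check the options against the versions you actually have installed.

```python
import shlex


def realign_then_bcftools(ref, bam, prefix, gatk3_jar="GenomeAnalysisTK.jar"):
    """Route 1 (sketch): GATK3 indel realignment, then bcftools mpileup/call."""
    targets = f"{prefix}.intervals"
    realigned = f"{prefix}.realigned.bam"
    pileup = f"{prefix}.bcf"
    return [
        # Find intervals that look like they need realignment.
        ["java", "-jar", gatk3_jar, "-T", "RealignerTargetCreator",
         "-R", ref, "-I", bam, "-o", targets],
        # Realign reads over those intervals and write a new BAM.
        ["java", "-jar", gatk3_jar, "-T", "IndelRealigner",
         "-R", ref, "-I", bam, "-targetIntervals", targets, "-o", realigned],
        # Pileup on the realigned BAM, then multiallelic calling of variant sites.
        ["bcftools", "mpileup", "-f", ref, "-Ou", "-o", pileup, realigned],
        ["bcftools", "call", "-mv", "-Oz", "-o", f"{prefix}.vcf.gz", pileup],
    ]


def haplotypecaller(ref, bam, prefix):
    """Route 2 (sketch): GATK4 HaplotypeCaller on the original BAM, no realignment step."""
    return [["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", f"{prefix}.vcf.gz"]]


# Placeholder inputs, purely illustrative.
route1 = realign_then_bcftools("ref.fa", "sample.bam", "sample")
route2 = haplotypecaller("ref.fa", "sample.bam", "sample.hc")

for cmd in route1 + route2:
    print(shlex.join(cmd))
```

To actually run a route, each list can be passed to `subprocess.run(cmd, check=True)` in order.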