Question: GATK Indel Realignment Taking Forever! Help
5.7 years ago, Ds89 wrote:

Hi all, I am working on exome capture data for barley (1.3 Gbp). I am interested in variant calling to find SNPs in my sample. SAMtools SNP calling finishes in ~1 hr, whereas GATK (with its several steps to prepare the BAM for the variant caller) takes forever. I understand that my reference is large and that, since this is an exome capture, the targeted region is only 60 Mbp of the 1.3 Gbp. The indel realigner is the slow step: it takes forever to locate the sites where indel realignment is required. Does anyone have any suggestions to speed it up? Or should I try another variant caller?

Thanks, D

5.7 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Does anyone have any suggestions to speed it up?

  • split your BAM by chromosome
  • use GATK's -L option (limit to region) to restrict the regions to realign/recalibrate
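
As a rough illustration of the two suggestions above, here is a minimal Python sketch that builds one RealignerTargetCreator command per chromosome, each restricted with -L. It assumes a GATK 3.x style command line (`GenomeAnalysisTK.jar`, `-T`, `-R`, `-I`, `-L`, `-o`); the file names and the contig names `chr1H` and `chr2H` are hypothetical placeholders, not taken from the question. The commands are only printed, not executed.

```python
import shlex


def target_creator_cmd(reference, bam, interval, out_intervals,
                       gatk_jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3.x RealignerTargetCreator command limited to one interval.

    `interval` is whatever -L accepts: a contig name, a "contig:start-end"
    string, or the path to an interval file (assumed to exist).
    """
    return [
        "java", "-jar", gatk_jar,
        "-T", "RealignerTargetCreator",
        "-R", reference,
        "-I", bam,
        "-L", interval,          # only look for realignment targets here
        "-o", out_intervals,
    ]


def per_chromosome_cmds(reference, bam, chromosomes):
    """One command per chromosome, so the jobs can run in parallel."""
    return [
        target_creator_cmd(reference, bam, chrom, f"{chrom}.realign.intervals")
        for chrom in chromosomes
    ]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for cmd in per_chromosome_cmds("barley.fa", "sample.bam", ["chr1H", "chr2H"]):
        print(shlex.join(cmd))
```

For exome capture data, passing the capture target interval file to -L instead of whole chromosomes would probably restrict the search further, since only about 60 Mbp of the reference is targeted.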

You don't actually need to split per chromosome. Just index the BAM :-)

written 5.7 years ago by Gabriel R.
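
A small sketch of the indexing approach, assuming `samtools` is on the PATH and the BAM is coordinate-sorted. With an index in place, a region can be pulled straight out of the single BAM instead of keeping one file per chromosome. The file and region names are hypothetical, and the commands are only built, not run.

```python
from pathlib import Path


def index_cmd(bam):
    """Command that writes an index next to a coordinate-sorted BAM."""
    return ["samtools", "index", bam]


def has_index(bam):
    """True if an index exists under either common naming convention
    (sample.bam.bai or sample.bai)."""
    p = Path(bam)
    return Path(str(p) + ".bai").exists() or p.with_suffix(".bai").exists()


def region_cmd(bam, region, out_bam):
    """Command that extracts one region from an indexed BAM as a new BAM."""
    return ["samtools", "view", "-b", "-o", out_bam, bam, region]


if __name__ == "__main__":
    bam = "sample.bam"  # hypothetical input
    if not has_index(bam):
        print(" ".join(index_cmd(bam)))
    print(" ".join(region_cmd(bam, "chr1H", "sample.chr1H.bam")))
```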

I do split per chromosome just after BWA: it's then faster for sorting and removing the duplicates.

written 5.7 years ago by Pierre Lindenbaum

Thanks, both. I also found on the GATK forum that downsampling the reads based on coverage can sometimes help to speed things up. I am currently trying both approaches and will report back on whether they change the runtime.

written 5.7 years ago by Ds89
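
One way the downsampling idea might look, sketched in Python. This assumes the GATK 3.x engine arguments `-dt` (downsampling type) and `-dcov` (downsample to coverage) and applies them to the RealignerTargetCreator step; whether a given GATK version and tool accept them should be checked against its documentation. The coverage cap of 200 and the file names are arbitrary placeholders.

```python
def with_downsampling(cmd, max_coverage):
    """Return a copy of a GATK 3.x command with coverage-based downsampling.

    Assumption: the tool accepts the engine-level -dt / -dcov arguments.
    """
    if max_coverage <= 0:
        raise ValueError("max_coverage must be a positive integer")
    return list(cmd) + ["-dt", "BY_SAMPLE", "-dcov", str(max_coverage)]


if __name__ == "__main__":
    # Hypothetical base command, for illustration only.
    base = [
        "java", "-jar", "GenomeAnalysisTK.jar",
        "-T", "RealignerTargetCreator",
        "-R", "barley.fa",
        "-I", "sample.bam",
        "-o", "sample.realign.intervals",
    ]
    print(" ".join(with_downsampling(base, 200)))
```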