Question: GATK: RealignerTargetCreator and IndelRealigner related questions
5.2 years ago, iraun (3.7k) wrote:

Hi there,

I'm starting to use GATK and I have a few general questions (sorry if they are basic, but I can't find the correct way to proceed).
I have 10 BAM files from different samples, aligned against the human_g1k_v37.fasta genome. Now I want to run the "RealignerTargetCreator" and "IndelRealigner" steps to get the best possible variant calling results. Questions:

1) Which file should I pass to the "-known" argument? Mills_and_1000G_gold_standard.indels.b37.vcf or 1000G_phase1.indels.b37.vcf (both are available in the bundle)? Maybe both? Or should I create my own file?

2) Do I have to run RealignerTargetCreator separately for each BAM?

3) For IndelRealigner, do I have to pass the same -known file(s) as in the RealignerTargetCreator step?


Thanks in advance for the help.

gatk • 4.5k views
5.2 years ago, Devon Ryan (94k; Freiburg, Germany) wrote:
  1. GATK recommends using both (I think you just specify -known twice).
  2. I'm pretty sure that you can specify all of your BAM files in one go (this should actually work better, since I think there's some thresholding that goes on).
  3. Yup, use the same ones.
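
As a rough sketch of the three points above, assuming GATK 3.x command-line syntax: the script below only prints the two commands (a dry run) so you can check them before launching anything. The BAM names, the jar path and the output names (`cohort.intervals`, the `.realigned.bam` suffix) are placeholders, not something from this thread. `-nWayOut` is used on the assumption that you want one realigned BAM per input rather than a single merged file.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints GATK 3.x commands instead of running them.
# Jar path, BAM names and output names are placeholders (assumptions).
set -euo pipefail

GATK_JAR="GenomeAnalysisTK.jar"      # assumed location of the GATK 3.x jar
REF="human_g1k_v37.fasta"
KNOWN=("Mills_and_1000G_gold_standard.indels.b37.vcf" "1000G_phase1.indels.b37.vcf")
BAMS=("sample1.bam" "sample2.bam")   # hypothetical input files

# Turn a list of values into repeated "<flag> <value>" pairs on one line.
repeat_flag() {
  local flag="$1"; shift
  local out=()
  local v
  for v in "$@"; do out+=("$flag" "$v"); done
  printf '%s\n' "${out[*]}"
}

# Step 1: one intervals file for all BAMs, with -known given twice.
target_creator_cmd() {
  printf '%s %s %s %s\n' \
    "java -jar $GATK_JAR -T RealignerTargetCreator -R $REF" \
    "$(repeat_flag -I "${BAMS[@]}")" \
    "$(repeat_flag -known "${KNOWN[@]}")" \
    "-o cohort.intervals"
}

# Step 2: same -known files, plus the intervals file from step 1.
indel_realigner_cmd() {
  printf '%s %s %s %s\n' \
    "java -jar $GATK_JAR -T IndelRealigner -R $REF" \
    "$(repeat_flag -I "${BAMS[@]}")" \
    "$(repeat_flag -known "${KNOWN[@]}")" \
    "-targetIntervals cohort.intervals -nWayOut .realigned.bam"
}

target_creator_cmd
indel_realigner_cmd
```

Once the printed commands look right, paste them into your shell or pipe the script's output to `bash`.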

In 2., do you mean that you can give more than one BAM as input in the same run? That is, in the end you would have only one "intervals" file instead of one file per BAM?

Reply written 5.2 years ago by iraun (3.7k)

Correct, you'd end up with a single intervals file then.

Reply written 5.2 years ago by Devon Ryan (94k)

From "Which data processing steps should I do per-lane vs. per-sample?":

"People often ask also if it's worth the trouble to try realigning across all samples in a cohort. The answer is almost always no, unless you have very shallow coverage. The problem is that while it would be lovely to ensure consistent alignments around indels across all samples, the computational cost gets too ridiculous too fast. That being said, for contrastive calling projects -- such as cancer tumor/normals -- we do recommend realigning both the tumor and the normal together in general to avoid slight alignment differences between the two tissue types."

Reply written 5.2 years ago by rbagnall (1.5k)
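
If you follow the per-sample advice quoted above instead of realigning the whole cohort in one run, the loop could look roughly like this. Again this is only a dry-run sketch assuming GATK 3.x syntax: it prints one RealignerTargetCreator and one IndelRealigner command per BAM. The file names, the jar path and the helper `per_sample_cmds` are made up for illustration.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one pair of GATK 3.x commands per BAM.
# Jar path and file names are placeholders (assumptions).
set -euo pipefail

GATK_JAR="GenomeAnalysisTK.jar"      # assumed location of the GATK 3.x jar
REF="human_g1k_v37.fasta"
KNOWN_ARGS="-known Mills_and_1000G_gold_standard.indels.b37.vcf -known 1000G_phase1.indels.b37.vcf"

# Hypothetical helper: print both realignment commands for a single BAM.
per_sample_cmds() {
  local bam="$1"
  local base="${bam%.bam}"           # strip the .bam extension for output names
  printf '%s\n' "java -jar $GATK_JAR -T RealignerTargetCreator -R $REF -I $bam $KNOWN_ARGS -o ${base}.intervals"
  printf '%s\n' "java -jar $GATK_JAR -T IndelRealigner -R $REF -I $bam $KNOWN_ARGS -targetIntervals ${base}.intervals -o ${base}.realigned.bam"
}

for bam in sample1.bam sample2.bam; do   # hypothetical file names
  per_sample_cmds "$bam"
done
```

For a tumor/normal pair, the quote suggests the opposite: give both BAMs to the same run with two `-I` arguments so they share one intervals file.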