9.2 years ago
iraun
Hi there,
I'm starting to use GATK and I have two general questions (sorry if they are basic, but I cannot find the correct way to proceed).
I've aligned 10 BAM files of different samples against the human_g1k_v37.fasta genome. Now I want to run the "RealignerTargetCreator" and "IndelRealigner" steps to get the best possible variant-calling results. Questions:
- Which file should I pass to the -known argument: Mills_and_1000G_gold_standard.indels.b37.vcf or 1000G_phase1.indels.b37.vcf (both are available in the bundle)? Maybe both? Or should I create my own file?
- Do I have to run RealignerTargetCreator separately for each BAM?
- For IndelRealigner, do I have to pass the same -known file as in the RealignerTargetCreator step?
Thanks in advance for the help.
In 2., do you mean that a single run can take more than one BAM as input? I mean, at the end you would have only one "intervals" file instead of one file per BAM.
Correct, you'd end up with a single intervals file then.
https://www.broadinstitute.org/gatk/guide/best-practices#faqs_dnaseq-ovw3060 (Which data processing steps should I do per-lane vs. per-sample?)
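For anyone wanting to see the shape of the two commands discussed above, here is a rough Python sketch that only builds and prints the GATK 3.x command lines; it does not run them. Several things here are assumptions and not confirmed in this thread: the jar name GenomeAnalysisTK.jar, the sample1.bam to sample10.bam file names, the all_samples.intervals output name, the "_realigned.bam" suffix, the choice to pass both bundle VCFs as -known, and the use of -nWayOut to get one realigned BAM per input.

```python
import shlex

GATK_JAR = "GenomeAnalysisTK.jar"  # assumed jar name for a GATK 3.x install
REFERENCE = "human_g1k_v37.fasta"

# Hypothetical file names for the 10 sample BAMs mentioned in the question.
bams = [f"sample{i}.bam" for i in range(1, 11)]

# Assumption: both bundle indel files are supplied as known sites.
known_vcfs = [
    "Mills_and_1000G_gold_standard.indels.b37.vcf",
    "1000G_phase1.indels.b37.vcf",
]


def target_creator_cmd(reference, bam_files, known, intervals_out):
    """Build one RealignerTargetCreator call covering all BAMs at once."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "RealignerTargetCreator", "-R", reference]
    for bam in bam_files:
        cmd += ["-I", bam]          # one -I per input BAM
    for vcf in known:
        cmd += ["-known", vcf]      # -known may be repeated
    cmd += ["-o", intervals_out]    # a single intervals file for the whole run
    return cmd


def indel_realigner_cmd(reference, bam_files, known, intervals, suffix):
    """Build the matching IndelRealigner call, reusing the same -known files."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "IndelRealigner", "-R", reference]
    for bam in bam_files:
        cmd += ["-I", bam]
    for vcf in known:
        cmd += ["-known", vcf]
    # -nWayOut writes one output BAM per input, named with the given suffix.
    cmd += ["-targetIntervals", intervals, "-nWayOut", suffix]
    return cmd


target_cmd = target_creator_cmd(REFERENCE, bams, known_vcfs, "all_samples.intervals")
realign_cmd = indel_realigner_cmd(
    REFERENCE, bams, known_vcfs, "all_samples.intervals", "_realigned.bam"
)

print(shlex.join(target_cmd))
print(shlex.join(realign_cmd))
```

The point of the sketch is just that all BAMs go into one RealignerTargetCreator run, which yields the single intervals file mentioned above, and that the same intervals file and -known files are then handed to IndelRealigner. Check the flags against the documentation for your GATK version before running anything.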