Hi all,
I am new to SNP calling and to generating .vcf files. I have 37 soybean genotypes that I have already filtered by quality, with duplicates removed, etc.
First I ran RealignerTargetCreator on all the samples, using this command:
java -Xms4g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator \
-R /indice/Gmax_275_v2.0.fa -I Sample1_qfilter_sorted_rmdup.bam \
-I Sample2_qfilter_sorted_rmdup.bam -I Sample3_qfilter_sorted_rmdup.bam \
-I Sample4_qfilter_sorted_rmdup.bam ......... -o realignment_targets.list
Then I generated the big BAM file:
java -jar GenomeAnalysisTK.jar -T IndelRealigner \
-R /indice/Gmax_275_v2.0.fa -I Sample1_qfilter_sorted_rmdup.bam \
-I Sample2_qfilter_sorted_rmdup.bam -I Sample3_qfilter_sorted_rmdup.bam \
-I Sample4_qfilter_sorted_rmdup.bam ...... -targetIntervals realignment_targets.list -o realigned_reads.bam
Together, the two steps above took almost 10 days. Now I am trying to run HaplotypeCaller to generate the raw_variants.vcf file, using this command:
java -Xmx10g -jar GenomeAnalysisTK.jar -T HaplotypeCaller \
-R /indice/Gmax_275_v2.0.fa -I realigned_reads.bam -o raw_variants.vcf
but it estimates that it will take 73 weeks, so I need to figure out how to make it faster.
I saw that I could split the job and run many HaplotypeCaller processes in parallel, but I have no idea what the command for that would look like.
Could you help me with that, please?
Thanks in advance!
HaplotypeCaller supports multiple types of parallelization; take a look at https://gatkforums.broadinstitute.org/gatk/discussion/1975/how-can-i-use-parallelism-to-make-gatk-tools-run-faster and https://software.broadinstitute.org/gatk/documentation/article.php?id=1988
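For the "divide it and run many processes" approach asked about above, here is a minimal sketch rather than a tested pipeline. It prints one HaplotypeCaller command per contig, using the GATK `-L` interval argument to restrict each run to a single contig. The helper name `print_hc_commands`, the `raw_variants.<contig>.vcf` output naming, and the assumption that the reference index sits at `/indice/Gmax_275_v2.0.fa.fai` are mine, not from the original post.

```shell
#!/usr/bin/env bash
# Sketch only: prints one GATK 3 HaplotypeCaller command per contig; it does not run them.
# print_hc_commands and the raw_variants.<contig>.vcf naming are hypothetical.
print_hc_commands() {
    local fai="$1"          # FASTA index, e.g. /indice/Gmax_275_v2.0.fa.fai (assumed path)
    local ref="${fai%.fai}" # the reference FASTA is assumed to sit next to its index
    local contig
    # Column 1 of a .fai file is the contig name; -L restricts a run to that contig.
    cut -f1 "$fai" | while IFS= read -r contig; do
        printf 'java -Xmx10g -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R %s -I realigned_reads.bam -L %s -o raw_variants.%s.vcf\n' \
            "$ref" "$contig" "$contig"
    done
}
```

Something like `print_hc_commands /indice/Gmax_275_v2.0.fa.fai > hc_jobs.txt` would write the job list, which you can then submit to a cluster scheduler or run a few lines at a time. Keep in mind that each job asks for its own 10 GB heap, that a reference with many small scaffolds will produce many small jobs, and that the per-contig VCFs have to be combined afterwards.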
So now I am using -nct 8. Let's see how it goes. Thank you.
Hi, have you managed to use the -nct option with HaplotypeCaller? I am getting an error: "n is not a recognized option"
GATK4 is being released this month and is, if I remember correctly, much faster.
GATK 4 was officially released yesterday evening already! :-)
I was not wrong, but also not very accurate ;-)