About GATK4 Mutect2 runtime (Whole Exome seq)
6 months ago
kwanghoon ▴ 20

Hi. I'm trying to call variants from WES data with Mutect2.

But the run time is 300 to 400 minutes per sample.

Is that normal or too long?

I thought it was too slow, so I added the "--native-pair-hmm-threads 32" option, but it doesn't seem any faster.

Thanks

GATK Whole Exome Sequencing Mutect2 • 501 views

6 hours per sample doesn't sound like much time at all, but time is rarely a measure of output quality. If the output looks OK, there is no need to worry about a mere 6h of computational time.
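
For anyone who does want to shorten the wall-clock time: as far as I know, `--native-pair-hmm-threads` only parallelizes the PairHMM step, so it may not change the total much. A common workaround is to run Mutect2 per chromosome with `-L` in parallel and merge the pieces afterwards (GATK provides MergeVcfs and MergeMutectStats for that). The sketch below only prints the per-chromosome commands and executes nothing; the `unfiltered.<chrom>.vcf.gz` output names and the `reference.fa` fallback are assumptions for illustration.

```shell
# Sketch: print one Mutect2 command per chromosome (nothing is executed).
REF="${HG38:-reference.fa}"   # assumed reference path, as in the original command
BAM="input.bam"               # input name taken from the original command

scatter_cmds=""
for chrom in $(seq 1 22) X Y; do
  # One command per contig; the output name is a hypothetical naming scheme.
  cmd="gatk Mutect2 -R $REF -I $BAM -L chr$chrom -O unfiltered.chr$chrom.vcf.gz"
  scatter_cmds="${scatter_cmds}${cmd}
"
done

printf '%s' "$scatter_cmds"

# Count the generated commands (22 autosomes plus X and Y).
n_cmds="$(printf '%s' "$scatter_cmds" | grep -c '^gatk Mutect2')"
echo "commands generated: $n_cmds"
```

The printed lines could be handed to a job scheduler or a tool such as GNU parallel; how many to run at once depends on the memory available per job.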


That is a relief.

When I open the .vcf file, there are no values in the QUAL and FILTER columns, as shown below.

Is this not normal?

Sorry, I'm still learning bioinformatics and this is very difficult.

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S-140024000-4
chr1    13273   .   G   C   .   .   AS_SB_TABLE=139,84|63,40;DP=334;ECNT=1;MBQ=20,20;MFRL=155,155;MMQ=27,27;MPOS=28;POPAF=7.30;TLOD=216.58  GT:AD:AF:DP:F1R2:F2R1:SB    0/1:223,103:0.315:326:122,63:100,40:139,84,63,40
chr1    13417   .   C   CGAGA   .   .   AS_SB_TABLE=3,20|1,10;DP=44;ECNT=1;MBQ=24,24;MFRL=189,181;MMQ=23,24;MPOS=30;POPAF=7.30;RPA=2,4;RU=GA;STR;TLOD=39.61 GT:AD:AF:DP:F1R2:F2R1:SB    0/1:23,11:0.350:34:16,7:7,3:3,20,1,10
chr1    14677   .   G   A   .   .   AS_SB_TABLE=24,15|3,0;DP=47;ECNT=1;MBQ=20,29;MFRL=150,149;MMQ=25,25;MPOS=26;POPAF=7.30;TLOD=3.86    GT:AD:AF:DP:F1R2:F2R1:SB    0/1:39,3:0.119:42:17,1:22,2:24,15,3,0


That's odd. Can you add your entire Mutect2 command in a comment below?

gatk Mutect2 -R $HG38 --native-pair-hmm-threads 32 -I input.bam -O unfiltered_output.vcf


This is the command I used.


I don't understand what you're doing here - there is no matched normal, no panel of normals and no germline resource. Why use Mutect2 then, instead of, say, HaplotypeCaller?

Anyway, it looks like Mutect2 does not emit QUAL values. I found a thread on this here: https://sites.google.com/a/broadinstitute.org/legacy-gatk-forum-discussions/2020-01-07-2019-07-10/24416-mutect2-quality-column-in-the-vcf-file-is-empty
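
On the empty FILTER column: raw Mutect2 output is unfiltered, and the FILTER field is normally filled in by a separate FilterMutectCalls run. A minimal sketch, assuming `$HG38` points to the reference FASTA as in the command above; `filtered_output.vcf` is a hypothetical output name, and the `run` helper is only an illustration that prints the command unless `DRY_RUN=0` is set.

```shell
# Sketch: populate the FILTER column of raw Mutect2 calls.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.
REF="${HG38:-reference.fa}"          # reference FASTA (assumed path)
RAW_VCF="unfiltered_output.vcf"      # raw Mutect2 output from the command above
FILTERED_VCF="filtered_output.vcf"   # hypothetical output name

# Illustrative helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# FilterMutectCalls reads the <vcf>.stats file that Mutect2 writes next to its output.
filter_cmd_output="$(run gatk FilterMutectCalls -R "$REF" -V "$RAW_VCF" -O "$FILTERED_VCF")"
printf '%s\n' "$filter_cmd_output"
```

After this step the FILTER column should hold either PASS or the names of the filters a call failed. QUAL would still be ".", since Mutect2 reports its evidence as TLOD in the INFO field instead.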

(ii) Tumor-only mode
This mode runs on a single type of sample, e.g. the tumor or the normal. To create a PoN, call on each normal sample in this mode, then use CreateSomaticPanelOfNormals to generate the PoN.

gatk Mutect2 \
-R reference.fa \
-I sample.bam \
-O single_sample.vcf.gz

To call mutations on a tumor sample, call in this mode using a PoN and germline resource. After FilterMutectCalls filtering, consider additional filtering by functional significance with Funcotator.

gatk Mutect2 \
-R reference.fa \
-I sample.bam \
--germline-resource af-only-gnomad.vcf.gz \
--panel-of-normals pon.vcf.gz \
-O single_sample.vcf.gz


I saw this in the documentation. I thought the first command was the right one for me, but I did not read it properly. So should I use HaplotypeCaller in my case? Thank you.


If you're working on a tumor sample, use Mutect2. If you're working on germline DNA, use HaplotypeCaller. The latter assumes a diploid genome, the former has a bunch of cancer-specific algorithm changes.

I guess you're technically correct in using Mutect2 with just the tumor sample, but I don't know how efficient that is as a standalone analysis - you might need to follow up with more steps to filter the calls.
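
As one example of such a follow-up step, once the calls have been through FilterMutectCalls you could keep only the records marked PASS. A minimal sketch with awk; the three records are made up for illustration (they are not real calls), and the filter labels are only examples of what the FILTER column might contain.

```shell
# Sketch: keep header lines and records whose FILTER column (field 7) is PASS.
tmp_vcf="$(mktemp)"

# Build a tiny tab-separated VCF with made-up records for the demonstration.
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tG\tC\t.\tPASS\tDP=50\n'
  printf 'chr1\t200\t.\tC\tT\t.\tgermline\tDP=40\n'
  printf 'chr1\t300\t.\tG\tA\t.\tweak_evidence;strand_bias\tDP=30\n'
} > "$tmp_vcf"

# Header lines start with '#'; data lines are kept only if FILTER is exactly PASS.
pass_records="$(awk -F'\t' '/^#/ || $7 == "PASS"' "$tmp_vcf")"
pass_count="$(printf '%s\n' "$pass_records" | grep -vc '^#')"

printf '%s\n' "$pass_records"
echo "PASS records: $pass_count"

rm -f "$tmp_vcf"
```

For a real file, the same awk expression can be pointed at the filtered VCF (decompressing first if it is gzipped). Whether PASS-only is strict enough for a tumor-only analysis is a separate judgment call.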


Thank you. It helps me a lot.


I am running WGS on my local Linux server, and each human sample takes 28 hours with Mutect2.


Thank you!! That is a relief.