I have whole-exome data for paired blood and tumor samples from 23 patients. I processed them with BWA (alignment), Picard (duplicate removal, sorting), and GATK (local realignment, base quality score recalibration, etc.), then ran MuTect in STD mode to identify somatic mutations.
I also have Sanger sequence results for 3 genes.
I compared the whole-exome and Sanger results and found that some somatic mutations identified by Sanger do not appear in the whole-exome results. I think I might need to modify some parameters when I run MuTect. Which parameters can I adjust?
I have attached some IGV figures of the regions where MuTect missed somatic mutations. Some are very obvious, and I cannot understand why MuTect does not call them as mutations. The input to IGV is the BAM files from GATK base quality score recalibration.
Thank you very much for any advice, suggestions, or hints.
What were your command-line parameters? The default mode, especially with a matched tumour-normal pair, is High Confidence, not STD. To run in STD mode you typically need either to turn off a bunch of filters or to run with the --artifact_detection_mode flag. In STD mode you should get pretty much everything picked up, plus tons of false positives.
This is the line for MuTect in my bash script:
Thank you so much for your quick reply.
Hmmm, I'm not sure. Try calling with just the tumour sample in artifact_detection_mode and see whether the mutation is picked up at all that way. I am assuming you are looking in the raw VCF, and not just at a downstream analysis where variants with a filter flag set might not be reported?
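To check whether a Sanger-confirmed site is present in the raw VCF but carries a filter flag, something like the following minimal sketch could help. It assumes a standard tab-separated VCF with FILTER in the seventh column; the function name `find_site`, the coordinates, and the records are made up for illustration and are not taken from the poster's data.

```python
def find_site(vcf_lines, chrom, pos):
    """Return every record at chrom:pos, whatever its FILTER value."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        if fields[0] == chrom and int(fields[1]) == pos:
            hits.append({"ref": fields[3], "alt": fields[4], "filter": fields[6]})
    return hits


# Illustrative records only (hypothetical positions and filter values).
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "17\t7577120\t.\tC\tT\t.\tREJECT\t.\n",
    "17\t7578406\t.\tC\tT\t.\tPASS\t.\n",
]

print(find_site(example, "17", 7577120))
```

With a real file, the same function can be given an open file handle, e.g. `with open("tumor.vcf") as fh: find_site(fh, "17", 7577120)` (file name hypothetical). An empty result means the site was never emitted at all; a non-PASS filter value means it was called and then filtered.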
Yes, I am looking at the VCF file, and only at those mutations tagged as PASSED. Which file should I look at?
How do I do this "calling just with the tumour sample in artifact_detection_mode"?
Thank you so much for your help.
You just pass a single input file, -I:tumor [tumor.bam], and leave out the parameter for the normal sample.
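As a rough sketch of what such a tumour-only invocation might look like, the snippet below only assembles and prints an argument list; it does not run anything. The jar path, reference, BAM name, and output prefix are hypothetical placeholders, and the long option names are assumed to match MuTect 1.x, so check them against the help output of your own version.

```python
def tumor_only_cmd(jar, reference, tumor_bam, out_prefix):
    """Build a tumour-only MuTect argument list (no normal sample given)."""
    return [
        "java", "-jar", jar,
        "--analysis_type", "MuTect",
        "--reference_sequence", reference,
        "--input_file:tumor", tumor_bam,   # long form of -I:tumor
        "--artifact_detection_mode",       # flag mentioned earlier in the thread
        "--out", out_prefix + ".call_stats.txt",
        "--vcf", out_prefix + ".vcf",      # assumed available in your MuTect version
    ]


# All file names below are placeholders.
cmd = tumor_only_cmd("muTect.jar", "ref.fasta", "tumor.bam", "tumor_only")
print(" ".join(cmd))
```

Printing the command first makes it easy to compare against the existing bash script line before actually running it.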