Question: MuTect Seems to Miss a Lot of Mutations
haiying.kong (Germany) wrote, 3.3 years ago:

I have whole-exome data for paired blood and tumor samples from 23 patients. I processed them with BWA (alignment), Picard (duplicate removal, sorting), and GATK (local realignment, base quality score recalibration, etc.), then ran MuTect in STD mode to identify somatic mutations.
I also have Sanger sequencing results for 3 genes.
When I compared the whole-exome results with the Sanger results, I found that some somatic mutations identified by Sanger do not appear in the whole-exome calls. I think I might need to modify some parameters when I run MuTect. Which parameters can I adjust?
I have attached some IGV figures of the regions where MuTect missed somatic mutations. Some are very obvious, and I cannot understand why MuTect does not call them as mutations. The input to IGV is the BAM files from GATK base quality score recalibration.
Thank you very much for any advice, suggestions, or hints.

        

      

modified 3.3 years ago • written 3.3 years ago by haiying.kong

What were your command-line parameters? The default mode, especially with matched tumour-normal samples, is High Confidence, not STD. To run in STD mode you typically need to either turn off a number of filters or run with the --artifact_detection_mode flag. In STD mode you should get pretty much everything picked up, plus tons of false positives.

written 3.3 years ago by Dan Gaston

This is the line for MuTect in my bash script:

/usr/bin/java -Xmx4g -Djava.io.tmpdir=tmp \
  -jar ${MuTect}mutect-1.1.7.jar \
  --artifact_detection_mode \
  --analysis_type MuTect \
  --reference_sequence ${GATK_hg} \
  --cosmic ${COSMIC} \
  --dbsnp ${dbSNP} \
  --intervals ${Intervals} \
  --input_file:normal ${BQSR_dir}${Normal}.recal.bam \
  --input_file:tumor ${BQSR_dir}${Tumor}.recal.bam \
  --out ${MuTect_dir}${Tumor}_${Normal}_call_stats.out \
  --coverage_file ${MuTect_dir}${Tumor}_${Normal}_coverage.wig.txt \
  --vcf ${MuTect_dir}${Tumor}_${Normal}.vcf

  Thank you so much for your quick reply.

written 3.3 years ago by haiying.kong

Hmmm, I'm not sure. Try calling just with the tumour sample in artifact_detection_mode and see whether the variant is picked up at all that way. I am assuming you are looking at the raw VCF, and not just at a downstream analysis where variants with a filter flag set might not be reported?

written 3.3 years ago by Dan Gaston
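
To check whether a Sanger-confirmed site is present in the raw VCF at all, including records whose FILTER value is not PASS, a lookup along the following lines may help. This is only a sketch: the function name `vcf_records_near` and the default window of one base on either side are assumptions made for illustration, and the code relies only on the standard VCF column order (CHROM and POS in columns 1 and 2, REF and ALT in columns 4 and 5, FILTER in column 7).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MuTect): print every VCF record within
# a small window around a position, whatever its FILTER value is.
# Usage: vcf_records_near <vcf> <chrom> <pos> [window]   (window defaults to 1)
vcf_records_near() {
  awk -F'\t' -v chrom="$2" -v pos="$3" -v win="${4:-1}" '
    BEGIN { OFS = "\t" }
    /^#/ { next }    # skip meta-information and header lines
    $1 == chrom && $2 >= pos - win && $2 <= pos + win {
      # CHROM, POS, REF, ALT, FILTER
      print $1, $2, $4, $5, $7
    }
  ' "$1"
}
```

Called as `vcf_records_near calls.vcf chr1 12345` (file name and coordinates are placeholders), it prints one tab-separated line per nearby record, so a rejected call shows up together with its FILTER value instead of being silently dropped.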

Yes, I am looking at the VCF file, and only at the mutations tagged as PASS. Which file should I look at?

How do I do this "calling just with the tumour sample in artifact_detection_mode"?

Thank you so much for your help.

written 3.3 years ago by haiying.kong

You just pass a single input file, -I:tumor [tumor.bam], and leave out the parameter for the normal sample.

written 3.3 years ago by Dan Gaston
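
Applied to the command posted earlier in the thread, a tumour-only run might look like the sketch below. It reuses the poster's shell variables and flags and simply leaves out `--input_file:normal`. The function name and the `_tumor_only` output file names are assumptions introduced here so that the paired results are not overwritten. The function only prints the arguments, one per line, so the command can be inspected before it is run.

```shell
#!/usr/bin/env bash
# Sketch: assemble a tumour-only MuTect command from the poster's variables.
# Assumes MuTect, GATK_hg, COSMIC, dbSNP, Intervals, BQSR_dir, MuTect_dir and
# Tumor are set as in the original script. Output names are hypothetical.
mutect_tumor_only_cmd() {
  # Same flags as the paired run, but with no --input_file:normal argument.
  printf '%s\n' \
    /usr/bin/java -Xmx4g -Djava.io.tmpdir=tmp \
    -jar "${MuTect}mutect-1.1.7.jar" \
    --artifact_detection_mode \
    --analysis_type MuTect \
    --reference_sequence "${GATK_hg}" \
    --cosmic "${COSMIC}" \
    --dbsnp "${dbSNP}" \
    --intervals "${Intervals}" \
    --input_file:tumor "${BQSR_dir}${Tumor}.recal.bam" \
    --out "${MuTect_dir}${Tumor}_tumor_only_call_stats.out" \
    --coverage_file "${MuTect_dir}${Tumor}_tumor_only_coverage.wig.txt" \
    --vcf "${MuTect_dir}${Tumor}_tumor_only.vcf"
}
```

Once the printed arguments look right, the same list can be run directly in place of the `printf`.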
haiying.kong (Germany) wrote, 3.3 years ago:

I made a mistake, and this could explain part of the results.

I think MuTect did pick up the mutations in the first 3 figures, but the genomic position differs from the Sanger position by 1 base, which is why I thought they had been missed.

But for some others, some times there are mutations in more than 2 fragments or 2 coverage, and MuTect did not pick them up as mutations, although Sanger identified them as mutations.

I understand that tumor tissue is highly heterogeneous, and that a mutation may be present in only a very small fraction of the tumor cells.

But in any case, is there a parameter we could use as a threshold when deciding whether a variant is somatic or not?

written 3.3 years ago by haiying.kong

Yes, remember that different genomic file formats use different coordinate conventions: 0-based versus 1-based counting.

I'm not sure, though, what you mean by "some times there are mutations in more than 2 fragments or 2 coverage"?

Also, as a suggestion: to update your question, you would typically edit your original post and add the new information at the bottom. You could also add it as a comment underneath, but you shouldn't submit new information as an answer. Leave answers to be solely answers to the question.

written 3.3 years ago by Dan Gaston
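
The off-by-one effect described above can be illustrated with a small conversion between two conventions. As one example, VCF positions are 1-based, while BED intervals use a 0-based start and an exclusive end. The helper name `vcf_to_bed` is made up for this sketch, and it covers only the span of the REF allele.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert VCF records (1-based POS) into BED intervals
# (0-based start, exclusive end) that cover the REF allele.
# Reads VCF from the given files, or from standard input if none are given.
vcf_to_bed() {
  awk -F'\t' '
    BEGIN { OFS = "\t" }
    /^#/ { next }                       # skip VCF header lines
    {
      bed_start = $2 - 1                # 1-based POS to 0-based start
      bed_end = bed_start + length($4)  # exclusive end spans the REF allele
      print $1, bed_start, bed_end
    }
  ' "$@"
}
```

A single-base substitution at VCF position 100 on chr1 comes out as the BED interval `chr1 99 100`, which is the kind of one-base shift that can make a matching call look like a miss when two result sets are compared by position.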