I'm trying to set up an analysis pipeline for panel sequencing that calls somatic variants from paired tumor and normal FASTQ files. The pipeline now runs successfully, but I have a problem with the final results.
The variants in the final VCF are not reproducible between runs. When I run the pipeline several times with the same FASTQ files and the same databases as input, the somatic variants in the final VCF are not always identical: for example, most runs give 100 variants, but occasionally a run gives 101, 105, or some other number.
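To make "not always identical" concrete, the final VCFs of two runs can be compared along these lines. This is only a minimal sketch, not part of the pipeline itself: the function names `read_variants` and `compare_runs` are made up for illustration, and two records are treated as the same variant when CHROM, POS, REF, and ALT match (FILTER and INFO fields are ignored).

```python
import gzip


def read_variants(vcf_path):
    """Return the set of (CHROM, POS, REF, ALT) keys found in a VCF (plain or .gz)."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    variants = set()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines, the column header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            variants.add((chrom, int(pos), ref, alt))
    return variants


def compare_runs(vcf_a, vcf_b):
    """Split the variants of two runs into shared and run-specific sets."""
    a = read_variants(vcf_a)
    b = read_variants(vcf_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}
```

Calling `compare_runs("run1.vcf.gz", "run2.vcf.gz")` (hypothetical file names) and printing the `only_a` and `only_b` sets shows exactly which calls appear in one run but not in the other.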
Brief description of my pipeline:
1) BWA mem for alignment;
2) BAM processing: sambamba for sorting, Picard for duplicate marking, GATK for BQSR;
3) Mutect2 (GATK 18.104.22.168) for somatic variant calling.
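The steps above correspond roughly to the command lines built by the sketch below. It is an illustration under assumptions, not the exact invocation: all file names, the read group string, the thread count, and the `picard` wrapper name are placeholders, and indexing and downstream filtering steps are left out. The script only prints the commands; it does not run them.

```python
import shlex

REF = "ref.fa"                      # placeholder reference FASTA
KNOWN_SITES = "known_sites.vcf.gz"  # placeholder known-sites resource for BQSR


def preprocess_commands(sample, threads=4):
    """Commands that turn <sample>_R1/_R2.fastq.gz into <sample>.recal.bam."""
    # Literal backslash-t sequences: bwa converts them to tabs in the @RG line.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        ["bwa", "mem", "-t", str(threads), "-R", read_group,
         "-o", f"{sample}.sam",
         REF, f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"],
        ["sambamba", "view", "-S", "-f", "bam",
         "-o", f"{sample}.bam", f"{sample}.sam"],
        ["sambamba", "sort", "-t", str(threads),
         "-o", f"{sample}.sorted.bam", f"{sample}.bam"],
        ["picard", "MarkDuplicates", f"I={sample}.sorted.bam",
         f"O={sample}.markdup.bam", f"M={sample}.markdup.metrics.txt"],
        ["gatk", "BaseRecalibrator", "-R", REF, "-I", f"{sample}.markdup.bam",
         "--known-sites", KNOWN_SITES, "-O", f"{sample}.recal.table"],
        ["gatk", "ApplyBQSR", "-R", REF, "-I", f"{sample}.markdup.bam",
         "--bqsr-recal-file", f"{sample}.recal.table",
         "-O", f"{sample}.recal.bam"],
    ]


def mutect2_command(tumor, normal):
    """Tumor/normal Mutect2 call; -normal takes the normal sample name (SM tag)."""
    return ["gatk", "Mutect2", "-R", REF,
            "-I", f"{tumor}.recal.bam", "-I", f"{normal}.recal.bam",
            "-normal", normal, "-O", f"{tumor}.somatic.vcf.gz"]


def all_commands(tumor="tumor", normal="normal"):
    """Full command list: preprocessing for both samples, then variant calling."""
    commands = preprocess_commands(normal) + preprocess_commands(tumor)
    commands.append(mutect2_command(tumor, normal))
    return commands


if __name__ == "__main__":
    for command in all_commands():
        print(shlex.join(command))
```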
Additional information:
1) All software in the pipeline runs inside a Docker image, so I'm fairly sure the runtime environment is consistent between runs;
2) When I start from the final BAMs and rerun only the variant calling with Mutect2, the result is reproducible;
3) BWA reports reads with multiple possible alignments at a randomly chosen position. I do not filter out reads with low mapping quality myself, because it seems that Mutect2 filters such reads before processing the BAM (see https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2);
4) The allele fractions (AF) of the variants range from 2% to 10%.
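Since variant calling from the final BAMs is reproducible, the difference presumably arises in an earlier step. One way to narrow it down would be to compare the intermediate BAMs of two runs step by step. The sketch below is an assumption-laden illustration: the function names and the step-to-path dictionaries are hypothetical, and it relies on BAM files being BGZF-compressed, which Python's `gzip` module can read as a multi-member gzip stream. It hashes the decompressed content, so differences in compression settings do not matter, but header differences (for example `@PG` command lines containing different paths) would also change the digest.

```python
import gzip
import hashlib


def bam_content_digest(bam_path, chunk_size=1 << 20):
    """SHA-256 of the decompressed BAM stream (header plus alignment records)."""
    digest = hashlib.sha256()
    with gzip.open(bam_path, "rb") as handle:
        # Read in chunks so large BAMs do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_divergent_step(run_a, run_b, steps):
    """Return the first step whose BAM content differs between two runs, or None.

    run_a and run_b map a step name to the BAM path written by that step.
    """
    for step in steps:
        if bam_content_digest(run_a[step]) != bam_content_digest(run_b[step]):
            return step
    return None
```

For example, with `steps = ["sorted", "markdup", "recal"]` and one dictionary of BAM paths per run, the returned step name indicates where the two runs first stop being identical.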
Thank you for the help!