Question

Same FASTQs and pipeline give different variant results

4.1 years ago

weibin2728 • 0

Hello!

I'm trying to set up an analysis pipeline for panel sequencing to find somatic variants from paired normal and tumor FASTQs. The pipeline now runs successfully, but I have some trouble with the final results.

Trouble:
The final variants in the VCF are not reproducible between runs with the same FASTQs and the same pipeline. When I run the pipeline several times with the same FASTQs and databases as input, the somatic variants in the final VCF are not always identical: the variant count is 100 in most runs, but occasionally it is 101, 105, and so on.

Brief description of my pipeline:
1) BWA-MEM for alignment;
2) BAM processing: sambamba for sorting, Picard for duplicate marking, GATK for BQSR;
3) Mutect2 (GATK 4.1.0.0) for somatic variant calling.

PS:
1) All software in the pipeline is in a Docker image, so I'm pretty sure the running environment is consistent between runs;
2) When I start from the final BAMs and only run variant calling with Mutect2, the result is reproducible;
3) Since BWA places multi-mapped reads randomly, I do not filter reads with low mapping quality, because it seems that Mutect2 filters these reads itself before processing the BAM: https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2;
4) The AF of the variants that differ between runs ranges from 2% to 10%.
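Since the calls are reproducible when starting from the final BAMs, one way to narrow this down is to compare the alignments of two runs read by read. The following is only a minimal sketch, assuming both BAMs have been dumped to SAM text (for example with `samtools view -h`); the function names are hypothetical and only primary alignments are compared.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, int]                   # (read name, mate bits 0x40/0x80)
Placement = Tuple[str, int, int, str]   # (RNAME, POS, MAPQ, CIGAR)


def primary_placements(sam_lines: Iterable[str]) -> Dict[Key, Placement]:
    """Map each primary alignment to its placement, skipping header lines."""
    placements: Dict[Key, Placement] = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:  # skip secondary (0x100) and supplementary (0x800)
            continue
        key = (fields[0], flag & 0xC0)  # keep first/second-in-pair bits only
        placements[key] = (fields[2], int(fields[3]), int(fields[4]), fields[5])
    return placements


def differing_reads(run_a: Iterable[str], run_b: Iterable[str]) -> List[Key]:
    """Return reads whose primary placement differs (or is missing) between runs."""
    a = primary_placements(run_a)
    b = primary_placements(run_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

If `differing_reads` returns an empty list for the raw BWA output of two runs, the divergence would have to come from a later step (sorting, duplicate marking, or BQSR) rather than from alignment.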

Thank you for the help!

sequencing alignment next-gen SNP • 1.2k views


I remember reading this study, which reports inconsistencies in variant caller output in the context of multithreading:

https://www.ncbi.nlm.nih.gov/pubmed/28233799

Also be sure that these variants are of high quality and not in any blacklisted regions or low-complexity regions.

4.1 years ago by ATpoint 82k

Hi, thanks very much for your reply. I use the same parameters between runs, so the thread count is always the same. Maybe I can find useful information in this paper. Blacklisted or low-complexity regions may be an important factor, and I will check the locations of the variants that differ.
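To find where the differing variants are located, the VCFs from two runs can be reduced to sets of (CHROM, POS, REF, ALT) keys and compared. This is only a sketch under stated assumptions: uncompressed VCF text, hypothetical function names, and no normalization of indel representation.

```python
from typing import Iterable, List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def variant_keys(vcf_lines: Iterable[str], pass_only: bool = False) -> Set[Variant]:
    """Collect one key per ALT allele, optionally keeping only FILTER == PASS."""
    keys: Set[Variant] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts, filt = fields[0], int(fields[1]), fields[3], fields[4], fields[6]
        if pass_only and filt != "PASS":
            continue
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def compare_runs(run_a: Iterable[str], run_b: Iterable[str]) -> Tuple[List[Variant], List[Variant]]:
    """Return (variants only in run A, variants only in run B), each sorted."""
    keys_a = variant_keys(run_a)
    keys_b = variant_keys(run_b)
    return sorted(keys_a - keys_b), sorted(keys_b - keys_a)
```

The run-specific positions can then be checked against a blacklist or low-complexity BED file, for example with `bedtools intersect`.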

4.1 years ago by weibin2728 • 0

Use bwa mem with the -K option, and set it to exactly the same value every time:

 -K INT        process INT input bases in each batch regardless of nThreads (for reproducibility) []

This is well-known behaviour; see these threads:

Bwa Mem Have Different Alignment Result When Using Different Threads

when bwa mem run with different thread, output.sam is different.

BWA-MEM get different results with different threads

Different results in bwa mem paired sequential and threaded versions
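The -K suggestion above could be wired into a pipeline roughly as follows. This is a sketch, not a definitive command line: the helper name, the file names, and the default values are assumptions (100000000 is just an example batch size), and only the `-t` and `-K` flags of `bwa mem` are used.

```python
from typing import List


def bwa_mem_command(reference: str, fastq1: str, fastq2: str,
                    threads: int = 8, batch_bases: int = 100_000_000) -> List[str]:
    """Build a bwa mem argument list with a fixed -K batch size.

    Keeping batch_bases identical across runs means reads are grouped into
    the same batches regardless of the thread count.
    """
    return [
        "bwa", "mem",
        "-t", str(threads),
        "-K", str(batch_bases),
        reference, fastq1, fastq2,
    ]


# Hypothetical file names, for illustration only.
cmd = bwa_mem_command("ref.fa", "tumor_R1.fastq.gz", "tumor_R2.fastq.gz")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, stdout=sam_file, check=True)` with an open output file, so the same `-K` value is recorded in code rather than typed by hand for each run.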

4.1 years ago by lakhujanivijay 5.8k

Hi, thank you for your reply. Since I use the same parameters between runs, the thread count is always the same. Even so, the threads you listed are valuable to me. Thanks.

4.1 years ago by weibin2728 • 0