Question: Same fastqs and pipeline get different variant results
weibin2728 wrote, 6 months ago:

Hello,

I'm trying to establish an analysis pipeline for panel sequencing that finds somatic variants from paired normal and tumor fastqs. The pipeline now runs successfully, but I have a problem with the final result.

Problem:
The variants in the final vcf are not reproducible between runs of the same pipeline on the same fastqs. When I run the pipeline several times with the same fastqs and databases as input, the somatic variants in the final vcf are not always identical: the variant count is 100 in most runs, but is occasionally 101, 105, and so on.

Brief description of my pipeline:
1) BWA mem for alignment;
2) Bam processing: sambamba for sorting, picard for duplicate marking, GATK for BQSR;
3) Mutect2 (GATK 4.1.0.0) for somatic variant calling.

PS:
1) All software in the pipeline runs from a docker image, so I'm pretty sure that the running environment is consistent between runs;
2) When I start from the final bams and only run variant calling with Mutect2, the result is reproducible;
3) BWA places reads with multiple equally good alignments at random. I do not filter reads with low mapping quality myself, because it seems that Mutect2 filters these reads before processing the bam: https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2;
4) The AF of the variants that differ between runs ranges from 2% to 10%.
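
To see exactly which calls appear or disappear between two runs, the final vcfs can be compared as sets of variant keys. The following is a minimal sketch, not part of the original pipeline: it assumes uncompressed, tab-separated vcf files, and the function names `load_variant_keys` and `diff_runs` are hypothetical.

```python
def load_variant_keys(vcf_path):
    """Return the set of (chrom, pos, ref, alt) keys in a plain-text VCF."""
    keys = set()
    with open(vcf_path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header and blank lines
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            # A multi-allelic record contributes one key per ALT allele.
            for allele in alt.split(","):
                keys.add((chrom, int(pos), ref, allele))
    return keys


def diff_runs(vcf_a, vcf_b):
    """Return (only_in_a, only_in_b) as sorted lists of variant keys."""
    a = load_variant_keys(vcf_a)
    b = load_variant_keys(vcf_b)
    return sorted(a - b), sorted(b - a)
```

Calling `diff_runs("run1.vcf", "run2.vcf")` (file names are placeholders) lists the variants unique to each run, which can then be inspected for AF, filter status, and location.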

Thank you for the help!

written 6 months ago by weibin2728

I remember reading a study that reports inconsistencies in variant caller output in the context of multithreading:

https://www.ncbi.nlm.nih.gov/pubmed/28233799

Also be sure that these variants are of high quality and not in any blacklisted regions or low-complexity regions.
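
One way to check the locations is to test each differing variant against a BED file of blacklisted or low-complexity regions. Below is a minimal sketch under stated assumptions: intervals are already loaded as `(chrom, start, end)` tuples in BED convention (0-based, half-open), variant positions are 1-based as in a vcf, and the function names are hypothetical. A linear scan is used because a panel yields only a few hundred variants.

```python
def in_any_interval(chrom, pos, bed_intervals):
    """True if the 1-based position `pos` falls inside any BED interval."""
    pos0 = pos - 1  # convert VCF 1-based position to BED 0-based coordinate
    return any(
        c == chrom and start <= pos0 < end
        for c, start, end in bed_intervals
    )


def flag_variants(variants, bed_intervals):
    """Return (chrom, pos, inside_region) for each (chrom, pos) variant."""
    return [
        (chrom, pos, in_any_interval(chrom, pos, bed_intervals))
        for chrom, pos in variants
    ]
```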

written 6 months ago by ATpoint

Hi, thanks very much for your reply. I use the same parameters in every run, so the thread number is always consistent. Maybe I can find useful information in this paper. Blacklisted or low-complexity regions may be an important factor, and I will check the locations of the variants that differ.

ADD REPLYlink written 6 months ago by weibin27280

Run bwa mem with the -K option and use exactly the same value every time:

 -K INT        process INT input bases in each batch regardless of nThreads (for reproducibility) []
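
As an illustration, the alignment command could be built with both the thread count and the batch size pinned, so that every run uses identical values. This is a sketch only: the file names are placeholders, the helper name `bwa_mem_command` is hypothetical, and the value 100000000 is an assumed example; what matters is that the same value is used in every run.

```python
import shlex


def bwa_mem_command(reference, fastq1, fastq2, threads=8, batch_bases=100_000_000):
    """Build a bwa mem argument list with fixed -t and -K values."""
    return [
        "bwa", "mem",
        "-t", str(threads),      # number of threads
        "-K", str(batch_bases),  # input bases per batch, independent of -t
        reference, fastq1, fastq2,
    ]


# Print the command for inspection; pass the list to subprocess.run to execute it.
cmd = bwa_mem_command("ref.fa", "tumor_R1.fastq.gz", "tumor_R2.fastq.gz")
print(shlex.join(cmd))
```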

This is well-known behaviour; see these related threads:

Bwa Mem Have Different Alignment Result When Using Different Threads

when bwa mem run with different thread, output.sam is different.

BWA-MEM get different results with different threads

Different results in bwa mem paired sequential and threaded versions
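
To confirm whether the alignment step is where two runs diverge, the bwa output of both runs can be compared read by read. The sketch below assumes uncompressed SAM files (for example, produced with `samtools view -h`) and compares FLAG, RNAME, POS, MAPQ and CIGAR per read name, so it is insensitive to record order; the function names are hypothetical.

```python
from collections import defaultdict


def alignments_by_read(sam_path):
    """Map each read name to its sorted list of alignment tuples."""
    by_read = defaultdict(list)
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header lines
            fields = line.rstrip("\n").split("\t")
            # Columns 2-6 of a SAM record: FLAG, RNAME, POS, MAPQ, CIGAR.
            by_read[fields[0]].append(tuple(fields[1:6]))
    return {name: sorted(recs) for name, recs in by_read.items()}


def reads_with_different_alignments(sam_a, sam_b):
    """Return the sorted names of reads whose alignments differ between files."""
    a = alignments_by_read(sam_a)
    b = alignments_by_read(sam_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result for two runs suggests the difference arises after alignment; a non-empty result points at the bwa step.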

written 6 months ago by lakhujanivijay

Hi, thank you for your reply. Since I use the same parameters in every run, the thread number is always consistent. Even so, the threads you list are valuable to me. Thanks.

written 6 months ago by weibin2728