Hello everyone, I am currently trying to learn about ngs data analysis. Specifically, I have data from targeted (panel) ngs from human liquid biopsy samples, in which we have to determine the presence of somatic variants that are at a very low allele fraction (ctDNA in peripheral blood). The company that sold us this service conveniently forgot to mention that they were not going to hand us the final results, so it's up to us to decide, based on the BAM files they provided us with, which mutations could be considered valid and which might be artifacts or errors.... This is all very new to me, so I have lots of doubts. They are offering support, but there are some things that I can't seem to fully understand.
The thing is, when I look at specific positions in the BAM file, take KRAS codon 12 as an example, I can see that there is a variant in that position, but MAPQ values for the amplicons that support the position are lower than MAPQ values for amplicons that support the reference (18 vs 33). The company's support division explained to me that those differences in MAPQ could indicate that the variant is not actually real and that it should be confirmed by another method (confirming variants is not an option for us). Biologically speaking, since we are dealing with a cancer patient sample, and giving that KRAS codon 12 is a very common and important mutation, we thought: what are the chances that this specific mutation was produced by an error?
So, I decided to make some processing of my own, since they provided the raw data files, always focusing on KRAS codon 12 to compare results:
- First of all, I tried aligning with BWA. Results were great, KRAS mutated reads had MAPQ of 60, same as reference reads
I then aligned the data using Bowtie2 (this is what the company uses) in end-to-end mode. Again, results were great (98,9% overall rate alignment), and no difference in MAPQ between mutated/reference reads (MAPQ 42)
Finally, I got support people to tell me the exact command that they had used to get to the final BAM file, and realized that they were using local alignment mode. Run the alignment in this mode (rate alignment was raised a little bit, I think around 99,2%, cannot remember), and finally got similar BAM to the company's BAM: MAPQ 18 vs 33.
So, I wanted an expert opinion on this, since, I repeat, this is all very new to me. Is it advantageous to perform local alignment in targeted ngs?? I cannot understand how local aligment can hurt MAPQ so much for these amplicons given that there seems to be no soft-clipped bases in the alignment.
I'm sorry if it got too long, want to be as detailed as possible. Hope you can help me
Thanks in advance!