LoFreq outputting no SNP?
4.7 years ago
manekineko ▴ 150

Hi, I'm trying to use LoFreq on viral data (BAM). I know there is a low-frequency SNP at a particular position: for example, about 30,000 reads carry the original nucleotide and about 1,500 reads carry the SNP, so the viral variant is present at a frequency of roughly 3-5%. Still, LoFreq reports that no SNP was found. Can someone with more experience with viral data suggest options for running the tool?

LoFreq • 1.9k views

Are you sure those reads are of reasonable quality? How did you determine that there are 1,500 of 30,000 variant reads?


I'm exploring this in IGV. I have also checked the Phred quality of the sample, and the plot shows quality above 35.


The base quality or the mapping quality? Do the actual reads with the variant look okay?


I loaded the BAM into Geneious and may have found the reason why the variants disappear in some tools. For example, one of the SNPs is supported mostly by reverse reads, so if I enable the strand-bias options the SNP is probably thrown out. But in my case I do not know what the option should be, as the sample is from a single-stranded RNA virus.

Example:
Type: Polymorphism
Length: 1
Bases: U
Change: T -> A
Reference Nucleotide(s): T
Reference Frequency: 96.2%
Coverage: 25,331
Variant Nucleotide(s): A
Variant Frequency: 3.5%
Variant Raw Frequency: 881
Strand-Bias: 99.9%
Variant P-Value (approximate): 0.0
Strand-Bias >50% P-value: 1.1 × 10^-262
Strand-Bias >80% P-value: 1.9 × 10^-83
Polymorphism Type: SNP (transversion)
Average Quality: 37
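
As a quick sanity check on the numbers above, the reported variant frequency can be recomputed from the raw count and the coverage. This is only a sketch: it assumes that "Variant Raw Frequency: 881" is the number of reads carrying the variant and that "Coverage" is the total read depth at the position. The function name `variant_frequency` is made up for illustration.

```python
def variant_frequency(variant_reads: int, coverage: int) -> float:
    """Fraction of reads at a position that carry the variant."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return variant_reads / coverage


# Values taken from the report above (interpretation is an assumption).
freq = variant_frequency(881, 25_331)
print(f"{freq:.1%}")
```

The result rounds to the 3.5% shown in the report, so the count and the percentage are at least consistent with each other.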


Looks like you have a strand bias issue:

the strand-bias test checks whether the proportion of bases on forward and reverse strand is different from the proportion of alternate bases on forward and reverse strand (using Fisher's exact test). If most reference bases are on one strand and most alternate bases are on the same strand then there's no bias

It shouldn't matter that it's from stranded RNA-seq. The variant frequency should not be different depending on strand. If most reads are one strand, most variant reads should be on the same strand.

You should check some higher frequency variants and see what the reported strand bias is for those.
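
To make the quoted description concrete, here is a minimal Python sketch of a two-sided Fisher's exact test on a 2x2 table of reference/alternate bases by forward/reverse strand. It illustrates the idea only and is not LoFreq's actual implementation. The per-strand counts are invented for the example (only the total of 881 variant reads comes from the thread), and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import exp, lgamma


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(ref_fwd: int, ref_rev: int,
                           alt_fwd: int, alt_rev: int) -> float:
    """Two-sided Fisher's exact test p-value for a 2x2 strand table."""
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    total = row_ref + row_alt

    def log_prob(x: int) -> float:
        # Hypergeometric probability that the ref/forward cell equals x,
        # with all row and column totals held fixed.
        return (_log_comb(row_ref, x)
                + _log_comb(row_alt, col_fwd - x)
                - _log_comb(total, col_fwd))

    lowest = max(0, col_fwd - row_alt)
    highest = min(row_ref, col_fwd)
    log_observed = log_prob(ref_fwd)

    # Sum the probabilities of all tables that are no more likely
    # than the observed one (small tolerance for floating-point ties).
    p_value = 0.0
    for x in range(lowest, highest + 1):
        lp = log_prob(x)
        if lp <= log_observed + 1e-7:
            p_value += exp(lp)
    return min(1.0, p_value)


# Hypothetical counts: reference reads on both strands,
# variant reads almost entirely on the reverse strand.
biased = fisher_exact_two_sided(12_000, 12_450, 5, 876)

# Hypothetical counts: variant reads split like the reference reads.
balanced = fisher_exact_two_sided(12_000, 12_450, 432, 449)

print(f"biased p-value:   {biased:.3g}")
print(f"balanced p-value: {balanced:.3g}")
```

With counts like the first set, the p-value is vanishingly small and a strand-bias filter would likely drop the call; with the second set the variant reads follow the same strand split as the reference reads, so the test sees no bias.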


The strange thing is that even with the option "Disable use of base-alignment quality (BAQ)" LoFreq does not report any SNP


You could try without the automatic lofreq filtering:

--no-default-filter
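
For example, a call with the default filter switched off could be assembled like this. It is a sketch only: the file names `ref.fa`, `sample.bam` and `raw_calls.vcf` are placeholders, and the helper `build_lofreq_call` is made up for illustration. It only builds and prints the command; it does not run LoFreq.

```python
import shlex


def build_lofreq_call(ref_fasta: str, bam: str, out_vcf: str,
                      default_filter: bool = False,
                      use_baq: bool = True) -> list[str]:
    """Build a `lofreq call` command line as a list of arguments."""
    cmd = ["lofreq", "call", "-f", ref_fasta, "-o", out_vcf]
    if not default_filter:
        # Skip LoFreq's automatic post-call filtering (includes strand bias).
        cmd.append("--no-default-filter")
    if not use_baq:
        # Disable use of base-alignment quality (BAQ).
        cmd.append("-B")
    cmd.append(bam)
    return cmd


# Placeholder file names; replace with your own reference and BAM.
cmd = build_lofreq_call("ref.fa", "sample.bam", "raw_calls.vcf", use_baq=False)
print(shlex.join(cmd))
```

The unfiltered VCF will contain more false positives, so it is worth inspecting the strand-bias and quality fields of each call before trusting it.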