Question: LoFreq outputting no SNP?
18 months ago by
manekineko130
Bulgaria
manekineko130 wrote:

Hi, I'm trying to use LoFreq on viral data (BAM). I know there is a low-frequency SNP at a particular position: for example, 30,000 reads carry the original nucleotide and 1,500 reads carry the SNP, which makes ~3% frequency for the viral variant. Yet LoFreq reports that no SNP was found. Can someone with more experience with viral data suggest options for running the tool?

modified 17 months ago by Biostar ♦♦ 20 • written 18 months ago by manekineko130

Are you sure those reads are of reasonable quality? How did you determine that there are 1,500 of 30,000 variant reads?

written 18 months ago by igor7.7k

I'm looking at this in IGV. I have also checked the Phred quality of the sample, and the plot shows quality above 35.

written 18 months ago by manekineko130

The base quality or the mapping quality? Do the actual reads with the variant look okay?

written 18 months ago by igor7.7k

I have run the BAM in Geneious and may have found the reason why the SNPs disappear in some tools. For example, the reads supporting one of the SNPs are mostly reverse reads, so if I enable the strand-bias options the SNP is probably thrown out. But in my case I do not know what the option should be, as the sample is from a single-stranded RNA virus.

Example:
Type: Polymorphism
Length: 1
Bases: U
Change: T -> A
Reference Nucleotide(s): T
Reference Frequency: 96.2%
Coverage: 25,331
Variant Nucleotide(s): A
Variant Frequency: 3.5%
Variant Raw Frequency: 881
Strand-Bias: 99.9%
Variant P-Value (approximate): 0.0
Strand-Bias >50% P-value: 1.1 × 10^-262
Strand-Bias >80% P-value: 1.9 × 10^-83
Polymorphism Type: SNP (transversion)
Average Quality: 37
modified 18 months ago • written 18 months ago by manekineko130

Looks like you have a strand bias issue:

the strand-bias test checks whether the proportion of bases on forward and reverse strand is different from the proportion of alternate bases on forward and reverse strand (using Fisher's exact test). If most reference bases are on one strand and most alternate bases are on the same strand then there's no bias

From: https://sourceforge.net/p/lofreq/discussion/general/thread/ee151ab0/

It shouldn't matter that it's from stranded RNA-seq. The variant frequency should not differ by strand. If most reads are on one strand, most variant reads should be on that same strand.

You should check some higher frequency variants and see what the reported strand bias is for those.
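
To make the quoted test concrete, here is a minimal Python sketch of a two-sided Fisher's exact test on a 2×2 table of reference/alternate bases by strand. It is not LoFreq's or Geneious's actual implementation, the function name `fisher_exact_two_sided` is hypothetical, and the strand counts in the usage lines are invented for illustration, since the thread does not report them.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table.

    Rows are reference/alternate bases, columns are forward/reverse strand.
    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    total = row_ref + row_alt
    log_denom = _log_comb(total, col_fwd)

    def log_prob(x):
        # Probability that x of the forward-strand bases are reference bases.
        return (_log_comb(row_ref, x)
                + _log_comb(row_alt, col_fwd - x)
                - log_denom)

    lo = max(0, col_fwd - row_alt)
    hi = min(row_ref, col_fwd)
    observed = log_prob(ref_fwd)
    tolerance = 1e-7  # absorbs floating-point noise when probabilities tie
    p = sum(exp(lp)
            for lp in (log_prob(x) for x in range(lo, hi + 1))
            if lp <= observed + tolerance)
    return min(1.0, p)


# Invented counts: reference bases split evenly across strands,
# alternate bases almost entirely on the reverse strand.
biased = fisher_exact_two_sided(12000, 12450, 1, 880)
# Invented counts: alternate bases follow the same strand split as reference.
balanced = fisher_exact_two_sided(12000, 12450, 430, 451)
print(f"biased p-value:   {biased:.3g}")
print(f"balanced p-value: {balanced:.3g}")
```

A very small p-value in the first case reflects exactly the situation described in the quote: the alternate bases are distributed across strands very differently from the reference bases, regardless of how many reads come from each strand overall.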

modified 18 months ago • written 18 months ago by igor7.7k

The strange thing is that even with the option "Disable use of base-alignment quality (BAQ)", LoFreq does not report any SNP.

written 18 months ago by manekineko130

You could try without the automatic lofreq filtering:

--no-default-filter
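
As a rough sketch of how that flag might be passed, the Python snippet below assembles a `lofreq call` command line. The file names are placeholders, the helper `lofreq_call_command` is hypothetical, and the `-f` (reference) and `-o` (output VCF) options reflect my understanding of the usual `lofreq call` usage, so please check them against the help output of your installed version.

```python
import shlex


def lofreq_call_command(bam, reference, out_vcf, default_filter=True):
    """Build the argument list for a `lofreq call` run (hypothetical helper)."""
    cmd = ["lofreq", "call", "-f", reference, "-o", out_vcf]
    if not default_filter:
        # Skip LoFreq's automatic post-call filtering, as suggested above.
        cmd.append("--no-default-filter")
    cmd.append(bam)
    return cmd


# Placeholder file names; substitute your own BAM and reference.
cmd = lofreq_call_command("sample.bam", "reference.fasta", "unfiltered.vcf",
                          default_filter=False)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Comparing the unfiltered VCF with a default run should show whether the SNP is being called and then removed by the filter, or never called at all.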
written 17 months ago by Joseph Hughes2.7k