Problem detecting indels
5 weeks ago
Estrella • 0

I'm working with a list of consensus sequences in order to detect variants. I've mapped the consensus sequences to the reference sequence using Needleall or BWA-MEM and called variants with freebayes (v0.9.21). When I check the SAM file I can see indels, but in the VCF generated by freebayes only SNPs are reported.

Example: AACACACAACAAGAA 0 HXB2_reference 1 255 2D211M1D2M4D9M3I3M2I240M
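To make the indels in that alignment explicit, here is a minimal Python sketch that walks the CIGAR string from the example line and lists each insertion and deletion with its reference coordinate. The function name `cigar_indels` is made up for illustration; the sketch assumes the fields in the example are QNAME, FLAG, RNAME, POS, MAPQ and CIGAR, and it reports an insertion at the reference position of the base that follows it.

```python
import re

def cigar_indels(cigar, pos):
    """Return (op, length, ref_pos) for every I or D operation in a CIGAR string.

    pos is the 1-based leftmost reference position (SAM POS field).
    """
    ref_consuming = set("MDN=X")  # operations that advance along the reference
    indels = []
    ref_pos = pos
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "ID":
            indels.append((op, length, ref_pos))
        if op in ref_consuming:
            ref_pos += length
    return indels

# CIGAR and POS taken from the example alignment above
events = cigar_indels("2D211M1D2M4D9M3I3M2I240M", 1)
for op, length, ref_pos in events:
    print(op, length, ref_pos)
```

For this single alignment the sketch finds three deletions and two insertions, so the indels are present in the SAM record itself.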

Just because you found one read with a deletion doesn't mean that a variant will be generated. That said, it doesn't explain why freebayes reports no indels at all.

Thank you, Pierre. I agree. But, for example, at position 2 IGV shows 140 deletions (total consensus sequences: 56452). I've also tried the freebayes naive option, with no better results :S

What is your freebayes command? Did you check the default values? -F could be a start.

Hi gb, every time I try with -F 0, I get something like this:

HXB2_reference 2 . GGAATTTTC GTACTTGTC,GTACTTGGC,TGTCTTTTC,TGACTTGTC,TGACTTGGC,TGACTTTGC,TGACTTTTC,TGAAGTTTC,TGAATTGTC,TGAATTGGC,TGAATTTGC,GGCAGTGGC,TTACTTGTC,CTACTTTTC,TTACTTTTC,GGACTTCTC,GGACTTGTC,GGACTTGGC,GGACTTTGC,GC,GGC,TTAATTTTC,GTAATTTTC,AGAATTTTC,TGAATTTTC,GGCATTTTC,GGGATTTTC,GGTATTTTC,GGACTTTTC,GGAATTATC,GGAATTCTC,GGAATTGTC,GGAATTTCC,GGAATTTGC,GGAATTTTA

(and no more variants)

The default value for -F is 0.2 and, if I am right, that means that at a given position, if I have more than 13 deletions (out of a total of 56452 consensus sequences), they should be reported, shouldn't they?

Likewise, I've tried -C 1, with and without -F 0. In the first case I obtain a list of variants, but none of them is an indel.

One question: if freebayes detects both SNPs and indels at a position, can it return both of them?

Thank you so much

No. 0.2 = 20%, and 13 of 56452 is about 0.02%.
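The arithmetic can be checked with a short Python sketch. It assumes a simplified reading of -F, namely that an allele needs at least that fraction of the observations at a site; the function name `passes_min_fraction` is hypothetical and this is not freebayes code. The counts (13, 140, 56452) and the 0.2 threshold are the ones quoted in this thread.

```python
import math
from fractions import Fraction

def passes_min_fraction(alt_count, depth, min_fraction):
    """True if alt_count / depth reaches min_fraction (given as a string, e.g. "0.2")."""
    return Fraction(alt_count, depth) >= Fraction(min_fraction)

depth = 56452          # total consensus sequences reported in the thread
threshold = "0.2"      # -F value discussed above

for alt_count in (13, 140):
    pct = 100 * alt_count / depth
    print(f"{alt_count} of {depth} = {pct:.3f}% ->",
          passes_min_fraction(alt_count, depth, threshold))

# Smallest number of supporting observations that reaches the threshold
min_count = math.ceil(Fraction(threshold) * depth)
print("observations needed at this depth:", min_count)
```

Under this assumption, neither 13 nor the 140 deletions seen in IGV comes close to a 20% threshold at this depth.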

One question: if freebayes detects both SNPs and indels at a position, can it return both of them?

Yes

You can also check out vcfallelicprimitives https://github.com/vcflib/vcflib/blob/master/doc/vcfallelicprimitives.md
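As a rough way to see what is inside the multi-allelic record at position 2 quoted earlier, the following Python sketch compares each ALT allele's length with REF. It is a naive length-based classification, not a reimplementation of vcfallelicprimitives, and the function name `classify_alt` is made up for illustration. The REF and ALT strings are copied from the record in this thread.

```python
def classify_alt(ref, alt):
    """Classify an ALT allele against REF by length, then by mismatch count."""
    if len(alt) < len(ref):
        return "deletion"
    if len(alt) > len(ref):
        return "insertion"
    mismatches = sum(r != a for r, a in zip(ref, alt))
    return "SNP" if mismatches == 1 else "MNP"

ref = "GGAATTTTC"
alts = ("GTACTTGTC,GTACTTGGC,TGTCTTTTC,TGACTTGTC,TGACTTGGC,TGACTTTGC,TGACTTTTC,"
        "TGAAGTTTC,TGAATTGTC,TGAATTGGC,TGAATTTGC,GGCAGTGGC,TTACTTGTC,CTACTTTTC,"
        "TTACTTTTC,GGACTTCTC,GGACTTGTC,GGACTTGGC,GGACTTTGC,GC,GGC,TTAATTTTC,"
        "GTAATTTTC,AGAATTTTC,TGAATTTTC,GGCATTTTC,GGGATTTTC,GGTATTTTC,GGACTTTTC,"
        "GGAATTATC,GGAATTCTC,GGAATTGTC,GGAATTTCC,GGAATTTGC,GGAATTTTA").split(",")

# Alleles shorter than REF are deletions packed into the same record
deletions = [alt for alt in alts if classify_alt(ref, alt) == "deletion"]
print(deletions)
```

The alleles GC and GGC are shorter than the 9-base REF, so that record does carry deletions, just merged with the SNP and MNP alleles into one line rather than reported as separate indel records.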

For variant calling you usually map reads rather than consensus sequences, so I'm not sure whether that has an influence on the algorithm.