Question: Variant Calling In Samtools And Freebayes
1
gravatar for sorrymouse
5.5 years ago by
sorrymouse80
sorrymouse80 wrote:

I've used both on the same input files, with essentially default parameters. I am using multiple samples that are supposedly from two populations.

freebayes --fasta-reference reference.fa reads.bam > out.vcf

samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf
bcftools view var.raw.bcf | bcftools view var.raw.bcf > var.flt.vcf

Samtools identifies snps that freebayes does not, and its very strange. For example, one very strong snp (quality of 999, looks good in tview in all samples) does not show up in freebayes. What is weird is that it does show up if you only process one population into vcd, the population that is almost entirely homozygous reference. The second population, which is fixed homozygous alternate, if included in the analysis means the SNP doesn't show up. Which is really weird.

I have no idea what is different about these snp calling methods such that something like this would happen, it is not a snp that there is much doubt about. Does anyone know?

samtools • 5.6k views
ADD COMMENTlink modified 5.5 years ago • written 5.5 years ago by sorrymouse80
3
gravatar for Daniel Swan
5.5 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

First of all if you want help to assess the call posting the mpileup at that position might help along with the relevant lines from the VCF's

However it is a well-known fact that SNP and indel callers are far more discordant than you would first expect:

http://genomemedicine.com/content/5/3/28/abstract

You may also like this post from Brad Chapman on the topic:

http://bcbio.wordpress.com/2013/05/06/framework-for-evaluating-variant-detection-methods-comparison-of-aligners-and-callers/

ADD COMMENTlink modified 5.5 years ago • written 5.5 years ago by Daniel Swan13k

Hi Daniel,

I am just wondering you use freebayes to call variants from RNAseq data, if so, which aligner you found is the best companion for it? Thanks, 

ADD REPLYlink written 5.2 years ago by yyaobo30
0
gravatar for sorrymouse
5.5 years ago by
sorrymouse80
sorrymouse80 wrote:

Here is the variant I wrote about: chr . A C 999 . . GT:PL:GQ 1/1:255,81,0:78 1/1:255,148,0:99 ./.:0,0,0:3 1/1:255,90,0:87 ./.:0,0,0:3 1/1:119,24,0:21 1/1:255,181,0:99 0/1:255,0,255:99 ./.:0,0,0:3 0/0:0,66,255:63 0/0:0,66,255:63 0/0:0,42,255:39 0/0:0,196,255:99 0/0:0,42,232:39

And here is a similar variant with a much lower quality that freebayes always calls: chr . A C 7.15 . . GT:PL:GQ 0/0:0,187,255:99 0/0:0,250,255:99 0/0:0,96,255:99 0/0:0,99,255:99 0/0:0,0,0:12 0/0:0,126,255:99 0/0:0,90,255:99 0/0:0,255,255:99 0/0:0,0,0:12 0/0:0,163,255:99 0/0:0,151,255:99 0/0:0,36,255:48 0/0:0,255,255:99 0/1:47,0,149:35

I'm aware that there are discrepancies among snp callers, I thought there might be answer here related to how freebayes deals with its assumptions about population structures and haplotypes or something.

ADD COMMENTlink written 5.5 years ago by sorrymouse80
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1996 users visited in the last hour