False negative SNPs and depth of coverage with sam/bcftools
8.9 years ago

I'm developing a variant-calling pipeline for SNP detection in yeast. I have 3 biological samples from yeast WGS which are supposed to share 100% of their SNPs. The raw calls show only 80% SNP overlap, and filtering on read quality, mapping quality, read depth, etc. only makes things worse (~60-76% common SNPs, depending on the filter). When I look at the calls unique to each sample, I can see that the same variants are also present in the other samples, but at lower depth, and were not called because of that. The sequencing depth of the samples is similar (20x, 20x and 19x), but depth varies in some regions, which causes a lot of false negatives.
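For reference, an overlap figure like the one above can be computed roughly as follows. This is only a minimal sketch using standard Unix tools: the function names `snp_keys` and `shared_snp_fraction` are made up for illustration, a site is assumed to be identified by CHROM, POS, REF and ALT taken verbatim, and the VCFs are assumed to be tab-delimited (plain or gzip/bgzip-compressed). With indexed VCFs, `bcftools isec` is probably the more usual route.

```shell
# Hypothetical helper: emit one unique key (CHROM:POS:REF:ALT) per site in a VCF.
# Assumption: ALT is compared as written, so multi-allelic records only match
# records with the identical ALT string.
snp_keys() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;   # bgzipped files are valid gzip streams
        *)    cat "$1" ;;
    esac | awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' | sort -u
}

# Hypothetical helper: report how many sites occur in every VCF given,
# relative to the union of sites across all of them.
shared_snp_fraction() {
    n=$#
    for vcf in "$@"; do snp_keys "$vcf"; done |
        sort | uniq -c |
        awk -v n="$n" '
            { total++; if ($1 == n) shared++ }
            END {
                if (total)
                    printf "%d/%d sites shared (%.1f%%)\n", shared, total, 100 * shared / total
            }'
}
```

Usage would be something like `shared_snp_fraction sample1.vcf.gz sample2.vcf.gz sample3.vcf.gz`, where the second and third file names are assumed by analogy with the first. The denominator is the union of sites, so the percentage drops whenever any one sample misses a call.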

My command line is:

samtools mpileup -d 8000 -Euf yeast.fasta sample1.bam  | bcftools call -vcO z -o sample1.vcf.gz

I haven't done much variant detection analysis and don't know whether this is a well-known problem; Google didn't turn up anything. Is there a way to get around it? How common is it, and what do people do about it?

SNP bcftools samtools
