Question: Calling SNPs with low coverage
14 months ago by
devin.porter920 wrote:

Does anyone know of a program that can call SNPs at low coverage? I have been using samtools mpileup and bcftools/vcftools to find high-confidence SNPs. I am now trying to identify unknown samples based on the reference SNP panel I generated. I don't need high confidence in each SNP call in these unknown samples, since I already know the SNPs exist and I will be comparing calls at hundreds of these SNPs per unknown sample. Average coverage is about 3-5x. I would really appreciate any suggestions.


modified 7 months ago by pierre.peterlongo760 • written 14 months ago by devin.porter920

As long as you didn't prefilter your VCF file, it should contain all the SNPs, from high to low coverage. The DP sub-field of the INFO field gives the depth, and you can plot its distribution to get a better understanding of what your pipeline is calling (e.g. whether low-coverage SNPs are included).
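To make the DP check concrete, here is a minimal Python sketch that tallies the INFO/DP values of a VCF and prints a rough text histogram. It assumes a plain-text, tab-separated VCF with an integer `DP=` entry in the INFO column; the function names `info_depths` and `depth_histogram` and the example records are made up for illustration and are not part of samtools or bcftools.

```python
import re
from collections import Counter

# Match "DP=<int>" as a whole INFO key, so "DP4=..." is not picked up.
DP_PATTERN = re.compile(r"(?:^|;)DP=(\d+)")


def info_depths(vcf_lines):
    """Yield the INFO/DP value of every variant record that carries one."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        match = DP_PATTERN.search(fields[7])  # column 8 is INFO
        if match:
            yield int(match.group(1))


def depth_histogram(vcf_lines):
    """Count how many records were called at each depth."""
    return Counter(info_depths(vcf_lines))


# Hypothetical records, only to show the expected input shape.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t30\t.\tDP=3;AF=0.5",
    "chr1\t250\t.\tC\tT\t12\t.\tAF=0.5;DP=5",
    "chr2\t77\t.\tG\tA\t8\t.\tDP=3",
]

hist = depth_histogram(example)
for depth in sorted(hist):
    print(depth, "#" * hist[depth])
```

For a real file, pass an open file handle instead of the example list, e.g. `depth_histogram(open("calls.vcf"))`, where `calls.vcf` is a placeholder name.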

written 14 months ago by Macspider2.4k

Hello all. I want to use BBMap's callvariants to call variants, but where can I get the latest version of the software? Could you give me the link?

written 7 months ago by zhouyanghappy19890

You can download the BBMap suite here.

written 7 months ago by genomax49k

Thank you very much!

written 7 months ago by zhouyanghappy19890

Could you give us a little bit more background? Do you have several samples at 3-5X? Are they from the same population? Coding regions?

written 7 months ago by Gabriel R.2.4k
13 months ago by
Walnut Creek, USA
Brian Bushnell15k wrote:

BBMap has a variant caller, CallVariants, that is configurable to arbitrary depth or ploidy. You can use it like this: callvariants.sh in=mapped.sam out=vars.vcf ref=reference.fasta clearfilters

The "clearfilters" flag clears ALL filters and will thus report all variants seen in the reads, regardless of depth or quality. Alternatively, you could use the flags "minreads=1 minscore=15", which would simply reduce the minimum number of reads and score a bit, or set the filters manually after reading the documentation. But for very low-coverage samples like yours, since you have a set of known variants you're interested in, "clearfilters" is probably the best choice.

BBMap also has another tool, used like this: in=sample.vcf,trusted.vcf out=intersection.vcf intersection

That will yield the lines from sample.vcf for variations contained in trusted.vcf.
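For anyone who wants to see what that intersection amounts to, here is a rough Python sketch of the idea: keep the records from the sample VCF whose variant also appears in the trusted VCF. This is not BBMap's implementation. Matching on (CHROM, POS, REF, ALT), treating each comma-separated ALT allele separately, and the function names `variant_keys` and `intersect_vcf` are all assumptions made for illustration.

```python
def variant_keys(record_line):
    """Return one (chrom, pos, ref, alt) key per ALT allele of a VCF record."""
    fields = record_line.rstrip("\n").split("\t")
    chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
    return {(chrom, pos, ref, alt) for alt in alts.split(",")}


def intersect_vcf(sample_lines, trusted_lines):
    """Keep sample header lines plus sample records also found in trusted."""
    trusted = set()
    for line in trusted_lines:
        if line.strip() and not line.startswith("#"):
            trusted |= variant_keys(line)

    kept = []
    for line in sample_lines:
        if line.startswith("#"):
            kept.append(line)  # pass the sample header through unchanged
        elif line.strip() and variant_keys(line) & trusted:
            kept.append(line)  # at least one allele is in the trusted set
    return kept


# Hypothetical records, only to show the expected input shape.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
sample = [
    header,
    "chr1\t100\t.\tA\tG\t9\t.\tDP=3",
    "chr1\t300\t.\tC\tT\t7\t.\tDP=2",
]
trusted = [
    header,
    "chr1\t100\trs1\tA\tG,T\t99\t.\tDP=60",
    "chr2\t50\t.\tG\tA\t99\t.\tDP=55",
]

for line in intersect_vcf(sample, trusted):
    print(line)
```

With low-coverage samples, most of the value here comes from the trusted panel: a call supported by only a few reads is easier to believe when it lands on a variant that is already known.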

modified 13 months ago • written 13 months ago by Brian Bushnell15k

BBMap must have changed since this answer was provided. Do you know what the updated command line for this is in BBMap_36.28.tar.gz?

written 7 months ago by 141341254653464453.4k

Oh, that version is too old. CallVariants was not added until v36.55 (but I recommend the latest, 37.61).

written 7 months ago by Brian Bushnell15k
7 months ago by
pierre.peterlongo760 wrote:

Hi there,

Maybe it's a bit late, but I'd like to highlight the discoSnp approach which might answer this initial question.

Without a reference genome, discoSnp can predict SNPs and indels from raw NGS reads. It does not depend on a read alignment step and can find low-coverage variants. It removes all data seen fewer than c times, so calling discoSnp with -c 2 should meet the requirements (although it will miss variants seen only once).

Note that, in a final step, the de novo predicted variants can be mapped to a genome, producing a VCF file that can be used for downstream analyses.

Best, Pierre

modified 7 months ago • written 7 months ago by pierre.peterlongo760