Question: Calling SNPs with low coverage
3.3 years ago by
devin.porter92 wrote:

Does anyone know of a program that can call SNPs at low coverage? I have been using samtools mpileup and bcftools/vcftools to find high-confidence SNPs. I am now trying to identify unknown samples using the reference SNP panel I generated. I don't need high confidence for each SNP call in these unknown samples, since I already know the SNPs exist and I will compare calls at hundreds of these SNPs per unknown sample. Coverage is about 3-5x on average. I'd really appreciate any suggestions.


modified 2.7 years ago by pierre.peterlongo • written 3.3 years ago by devin.porter92

As long as you didn't prefilter your VCF file, it should contain all the SNPs, from high to low coverage. The DP sub-field of the INFO field gives the depth for each one, and you can plot its distribution to better understand what your pipeline is calling (e.g. whether low-coverage SNPs are included).
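To make the DP suggestion concrete, here is a minimal sketch in plain Python that tallies the INFO/DP value of each record and prints a rough text histogram. The function names and the toy records are made up for illustration, and it assumes an uncompressed, tab-separated VCF whose INFO column carries a DP entry; it is not part of samtools or bcftools.

```python
from collections import Counter

def info_depth(info):
    """Return the integer DP value from a VCF INFO string, or None if absent."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP" and value.isdigit():
            return int(value)
    return None

def depth_histogram(vcf_lines):
    """Count how many variant records fall at each INFO/DP depth."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a full VCF record (INFO is the 8th column)
        depth = info_depth(fields[7])
        if depth is not None:
            counts[depth] += 1
    return counts

# Toy records, invented for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t30\t.\tDP=3;AF=0.5",
    "chr1\t250\t.\tC\tT\t50\t.\tDP=12",
    "chr2\t75\t.\tG\tA\t20\t.\tAF=1.0;DP=3",
]

hist = depth_histogram(example)
for depth in sorted(hist):
    print(depth, "#" * hist[depth])
```

For a real file, passing an open file handle (`depth_histogram(open("calls.vcf"))`, where `calls.vcf` is a placeholder name) should work the same way, since the function only iterates over lines.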

written 3.3 years ago by Macspider

Hello all. I want to use BBMap's callvariants to call variants, but where can I get the latest version of the software? Could you give me the link?

written 2.7 years ago by zhouyanghappy19890

You can download BBMap suite here.

written 2.7 years ago by genomax

Thank you very much!

written 2.7 years ago by zhouyanghappy19890

Could you give us a little bit more background? Do you have several samples at 3-5X? Are they from the same population? Coding regions?

written 2.7 years ago by Gabriel R.
2.7 years ago by
pierre.peterlongo wrote:

Hi there,

Maybe it's a bit late, but I'd like to highlight the discoSnp approach which might answer this initial question.

discoSnp can predict SNPs and indels from raw NGS reads without a reference genome. It does not depend on read alignment and can find variants with low coverage. It removes all data seen fewer than c times, so calling discoSnp with -c 2 should meet the requirements (although it will miss variants seen only once).
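As a rough illustration of what a "seen at least c times" threshold does, here is a toy Python sketch that counts k-mers across reads and keeps only those at or above c. This is not discoSnp's code: counting k-mers is my assumption about what "data" means here, and the function names, reads, and k value are invented for the example.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def kept_kmers(reads, k, c):
    """Return the k-mers seen at least c times (assumed meaning of the threshold)."""
    return {kmer for kmer, n in count_kmers(reads, k).items() if n >= c}

# Toy reads: the third read carries a variant (T -> A) seen only once.
reads = ["ACGTAC", "ACGTAC", "ACGAAC"]

print(sorted(kept_kmers(reads, k=4, c=2)))  # k-mers from the singleton read are dropped
print(sorted(kept_kmers(reads, k=4, c=1)))  # everything is kept
```

With c=2 the k-mers unique to the third read disappear, which matches the caveat that variants seen only once will be missed.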

Note that, during a final step, de novo predicted variants can be mapped on a genome, thus providing a VCF file that can be used for downstream analyses.

Best, Pierre

modified 2.7 years ago • written 2.7 years ago by pierre.peterlongo
3.3 years ago by
Walnut Creek, USA
Brian Bushnell wrote:

BBMap has a variant caller that is configurable to arbitrary depth or ploidy. You can use it like this:

callvariants.sh in=mapped.sam out=vars.vcf ref=reference.fasta clearfilters

The "clearfilters" flag clears ALL filters, so every variant seen in the reads is reported regardless of depth or quality. Alternatively, you could use the flags "minreads=1 minscore=15", which simply lower the minimum number of reads and score a bit, or set the filters manually after reading the documentation. But for very low coverage samples like yours, since you have a set of known variants you're interested in, "clearfilters" is probably the best choice.

BBMap also has another tool, used like this:

comparevcf.sh in=sample.vcf,trusted.vcf out=intersection.vcf intersection

That will yield the lines from sample.vcf for variations contained in trusted.vcf.
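For readers who want to see what such an intersection amounts to, here is a small plain-Python sketch that keeps the records of a sample VCF whose variant also appears in a trusted VCF. It is not how the BBMap tool is implemented; matching on CHROM, POS, REF and ALT is my assumption, and the function names and toy records are invented for illustration.

```python
def variant_key(line):
    """Identify a variant by (CHROM, POS, REF, ALT); an assumed matching rule."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], int(fields[1]), fields[3], fields[4])

def intersect_vcf(sample_lines, trusted_lines):
    """Return header lines plus sample records whose variant is in the trusted set."""
    trusted = {
        variant_key(line)
        for line in trusted_lines
        if line.strip() and not line.startswith("#")
    }
    kept = []
    for line in sample_lines:
        if line.startswith("#"):
            kept.append(line)  # keep the sample's header lines
        elif line.strip() and variant_key(line) in trusted:
            kept.append(line)  # keep the sample's own record, not the trusted one
    return kept

header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
# Toy records, invented for illustration only.
sample = [
    header,
    "chr1\t100\t.\tA\tG\t12\t.\tDP=3",
    "chr1\t300\t.\tT\tC\t8\t.\tDP=2",
    "chr2\t75\t.\tG\tA\t15\t.\tDP=4",
]
trusted = [
    header,
    "chr1\t100\t.\tA\tG\t99\t.\tDP=60",
    "chr2\t75\t.\tG\tT\t99\t.\tDP=55",
]

result = intersect_vcf(sample, trusted)
for line in result:
    print(line)
```

Note that the chr2 record is dropped in this sketch because its ALT allele differs from the trusted one; whether that is the right behaviour for sample identification depends on how strictly you want to match.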

modified 3.3 years ago • written 3.3 years ago by Brian Bushnell

BBMap must have changed since this answer was provided. Do you know the updated command line for this in BBMap_36.28.tar.gz?

written 2.8 years ago by 141341254653464453.5k

Oh, that version is too old. CallVariants was not added until v36.55 (but I recommend the latest, 37.61).

written 2.8 years ago by Brian Bushnell