Variant Calling From Single Cell Sequencing Data?
10.6 years ago
Sangwoo Kim ▴ 440

Are there any tools for variant calling from single-cell sequencing data? I am interested in SNPs, somatic mutations, CNVs, and SVs. I assumed the data would look much like any other once generated and processed into BAM files, but it may be necessary to account for systematic bias introduced by single-cell protocols (e.g. MDA, multiple displacement amplification).

variant sequencing • 4.0k views
10.6 years ago
Erik Garrison ★ 2.4k

I'm currently working on a method to do so, based on freebayes. Please bear with me.

The good news is that you can probably get a large part of the way there already with freebayes and a single option. Adding --pooled-discrete will remove any assumptions about genotype frequencies that might bias your results away from the patterns expected in a clonally-evolving population.

If you are concerned about non-independence of reads due to the limited (max 2x!) input to the amplification, you can also adjust --read-dependence-factor. Set it lower (e.g. 0.8 or 0.7?) to model less independence between reads. I don't know how effective this will prove in this context, and I'd like to implement a better solution by estimating the parameter directly from the sequence data. Still, it should provide a basic way to correct for assumptions of independence, and it is already used by default in practice (at 0.9). If you're interested in this issue, please see this paper on the topic: "The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process".
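As a rough sketch of how the two options above could be combined, the Python snippet below only assembles a freebayes command line and prints it; it does not run anything. The file names (ref.fa, cells.bam), the ploidy of 2, and the 0.8 value are placeholders chosen for illustration, not recommendations.

```python
import shlex


def freebayes_command(reference, bams, read_dependence_factor=0.9, ploidy=2):
    """Assemble (but do not run) a freebayes call for clonal or single-cell data.

    All argument values are illustrative assumptions; tune them for your data.
    """
    cmd = [
        "freebayes",
        "-f", reference,            # reference FASTA
        "--ploidy", str(ploidy),    # per-sample ploidy, assumed diploid here
        "--pooled-discrete",        # drop genotype-frequency assumptions
        "--read-dependence-factor", str(read_dependence_factor),
    ]
    cmd.extend(bams)                # BAM files as positional arguments
    return cmd


# Hypothetical inputs: ref.fa and cells.bam are placeholder file names.
cmd = freebayes_command("ref.fa", ["cells.bam"], read_dependence_factor=0.8)
print(shlex.join(cmd))
```

Lowering the factor is a blunt correction, so it is probably worth comparing calls at a few values (0.9, 0.8, 0.7) before settling on one.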

If you include your germline sample in the analysis, you can use a tool in vcflib (vcfsamplediff) to tag putative somatic variants and add a somatic score (SSC), provided you have genotype qualities (--genotype-qualities in freebayes).
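A minimal sketch of that two-step workflow, again only building the command lines rather than running them. It assumes vcfsamplediff takes a tag name, the germline sample, the somatic sample, and a VCF file, in that order; please check the usage text of your vcflib version. The sample names, the SOMATIC tag, and all file names are hypothetical.

```python
import shlex


def somatic_tagging_commands(reference, germline_bam, cell_bam,
                             germline_sample, cell_sample,
                             tag="SOMATIC", vcf="calls.vcf"):
    """Build a joint freebayes call plus a vcfsamplediff call (not executed)."""
    call = [
        "freebayes",
        "-f", reference,
        "--pooled-discrete",
        "--genotype-qualities",     # vcfsamplediff needs genotype qualities
        germline_bam,
        cell_bam,
    ]
    # Assumed argument order: <tag> <germline sample> <somatic sample> <vcf>
    tag_cmd = ["vcfsamplediff", tag, germline_sample, cell_sample, vcf]
    return call, tag_cmd


# Placeholder inputs; sample names must match the read groups in the BAMs.
call, tag_cmd = somatic_tagging_commands(
    "ref.fa", "germline.bam", "cell1.bam", "germline", "cell1")

# freebayes writes VCF to stdout, so redirect it to the file tagged next.
print(shlex.join(call) + " > " + shlex.quote(tag_cmd[-1]))
print(shlex.join(tag_cmd))
```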

Feel free to contact me by email to further discuss. I'm curious what you come up with.

Hi Erik, could I prod you gently about this? I've been trying DELLY on single cell PE libraries, and not getting very much. My guess is that the frequent chimeric or otherwise weird pairs are helping to cover up real deletions and translocations. But I could be wrong. I'll try these suggestions and report back, but if you've added anything specifically for single cell analysis to freebayes, could you post here to let us know?

