I'm using tools such as Pindel to call structural variants from exome data. Because exome data cover only sparse regions with limited information, I'm looking only for large indels (say, around 200 bp), which are small enough to have breakpoints within exons yet big enough to be missed by SNP-centric callers such as GATK. Because BWA handles multi-mapping reads in low-complexity regions poorly, I'm running de novo assembly around all called breakpoints with ABySS in order to exclude possible false positives.
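To make the local-assembly step concrete, here is a minimal Python sketch of how the regions around called breakpoints could be defined before pulling reads for assembly. The function name `breakpoint_windows`, the 500 bp flank, and the `(chrom, pos)` input format are assumptions made for illustration; they are not Pindel or ABySS conventions.

```python
from collections import defaultdict


def breakpoint_windows(breakpoints, flank=500):
    """Return merged (chrom, start, end) windows of +/- flank around breakpoints.

    breakpoints: iterable of (chrom, pos) tuples (hypothetical input format).
    flank: number of bases to extend on each side (illustrative default).
    """
    by_chrom = defaultdict(list)
    for chrom, pos in breakpoints:
        # Clamp the left edge at 0 so windows never start before the contig.
        by_chrom[chrom].append((max(0, pos - flank), pos + flank))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Overlapping or touching windows are assembled together.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


if __name__ == "__main__":
    calls = [("chr1", 1000), ("chr1", 1200), ("chr2", 100)]
    for chrom, start, end in breakpoint_windows(calls):
        print(f"{chrom}:{start}-{end}")
```

Each window could then be used to extract the reads mapped there (plus their unmapped mates, if wanted) and hand them to the assembler. Merging nearby windows avoids assembling the same reads twice when two calls sit close together.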
This prompts me to ask: why not simply assemble the whole exome?
- I know of assemblers such as ABySS and Velvet, but is there any algorithm that specifically calls variants from a de novo assembly?
- How much RAM do I need to assemble the whole exome?
- Are there any tricks for assembling exome sequences that differ from whole-genome assembly? That is, is it advisable to feed exome reads to algorithms designed for whole-genome assembly?