I have a VCF file from a single-cell experiment. I got the basic statistics using the vcf-stats utility in VCFtools. I am interested in the distribution of indels. How can I get a barplot of the genome-wide indel distribution (number of indels per chromosome), as well as a plot of the number of insertions/deletions versus insertion/deletion size? Is there a tool that I can use?
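The VCF has all of this information. One way to summarize your data is to thin the VCF down to just the indel records (with grep, or with a dedicated filter such as `bcftools view -v indels` or `vcftools --keep-only-indels`), then read the tab-delimited result into R and plot whatever you want. Note that each record gives you a chromosome (CHROM), a start position (POS), and the REF and ALT alleles. There is no separate stop-position column, but the indel size is just the difference in length between ALT and REF (positive for insertions, negative for deletions), so that is everything you need. Other tools might be able to handle this, but it's just basic text parsing to get what you want.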
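If you would rather do the whole thing in one script instead of grep plus R, here is a minimal Python sketch of the same text-parsing idea. It assumes a standard tab-delimited VCF with CHROM in column 1, REF in column 4 and ALT in column 5, and it treats any ALT allele whose length differs from REF as an indel (so complex substitutions with a length change are counted too). The function names `summarize_indels` and `plot_indels` and the output file names are made up for illustration, and the plotting part assumes matplotlib is installed.

```python
from collections import Counter


def summarize_indels(lines):
    """Count indels per chromosome and per size from an iterable of VCF lines.

    Size is len(ALT) - len(REF): positive = insertion, negative = deletion.
    """
    per_chrom = Counter()
    size_counts = Counter()
    for line in lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:  # skip blank or malformed lines
            continue
        chrom, ref, alts = fields[0], fields[3], fields[4]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            # Skip missing, spanning-deletion, symbolic (<DEL>) and breakend alleles
            if alt in (".", "*") or alt.startswith("<") or "[" in alt or "]" in alt:
                continue
            size = len(alt) - len(ref)
            if size != 0:  # equal lengths means SNV/MNP, not an indel
                per_chrom[chrom] += 1
                size_counts[size] += 1
    return per_chrom, size_counts


def plot_indels(per_chrom, size_counts, prefix="indels"):
    """Write two bar plots as PNG files (assumes matplotlib is available)."""
    import matplotlib
    matplotlib.use("Agg")  # render to files, no display needed
    import matplotlib.pyplot as plt

    # Indels per chromosome, in the order chromosomes first appear in the file
    chroms = list(per_chrom)
    fig, ax = plt.subplots()
    ax.bar(chroms, [per_chrom[c] for c in chroms])
    ax.set_xlabel("Chromosome")
    ax.set_ylabel("Number of indels")
    ax.tick_params(axis="x", rotation=90)
    fig.tight_layout()
    fig.savefig(f"{prefix}_per_chromosome.png")
    plt.close(fig)

    # Indel count versus size (negative = deletions, positive = insertions)
    sizes = sorted(size_counts)
    fig, ax = plt.subplots()
    ax.bar(sizes, [size_counts[s] for s in sizes])
    ax.set_xlabel("Indel size in bp (negative = deletion)")
    ax.set_ylabel("Number of indels")
    fig.tight_layout()
    fig.savefig(f"{prefix}_size_distribution.png")
    plt.close(fig)
```

To use it on an uncompressed file, something like `with open("calls.vcf") as fh: per_chrom, size_counts = summarize_indels(fh)` followed by `plot_indels(per_chrom, size_counts)` should do it (`calls.vcf` is a placeholder name). For a bgzipped VCF, open it with `gzip.open(path, "rt")` instead.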