Tool: all2vcf: a tool to convert non-standard variant outputs (mummer, bcftools isec) to VCF
11 months ago
Macspider ★ 3.5k

Variant calling is a common strategy for analyzing differences between genomes, but many of the tools involved produce output that is not in the standard VCF format. Two examples are bcftools isec and the show-snps utility from the MUMmer package.

all2vcf is a toolkit that converts these non-standard outputs to VCF. For now, it can convert the output of bcftools isec or the output of show-snps -T. I plan to add other non-standard formats in the future, based on needs and suggestions.

all2vcf isec

This utility processes the output of bcftools isec. That command intersects the VCF files given as input and writes a series of output files, including one called sites.txt that lists the sites in the intersection of the input VCFs. This file is not in VCF format, so it is hard to load into a genome browser, even though one sometimes wants to look at the intersected variants in a graphical interface. With all2vcf isec you can convert sites.txt to a VCF file that retains some of the most relevant information. You will lose all sample-specific information such as genotype, coverage, MQ0F and other VCF fields, but you can always look those up in the original files. What you gain is a view of the variants shared between multiple VCF files in standard VCF format.
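To make the idea concrete, here is a minimal Python sketch of this kind of conversion. It is not all2vcf's own code. It assumes each sites.txt row is tab-separated with chromosome, position, REF, ALT and a per-file presence bitmask (e.g. `11`); the function name `sites_to_vcf` and the `FILES` INFO key are hypothetical names picked for illustration.

```python
def sites_to_vcf(sites_lines):
    """Turn sites.txt-style rows into minimal, sample-free VCF text.

    Assumed row layout (tab-separated): CHROM, POS, REF, ALT, BITMASK.
    """
    out = [
        "##fileformat=VCFv4.2",
        # FILES is a made-up INFO key used here to carry the bitmask along.
        '##INFO=<ID=FILES,Number=1,Type=String,'
        'Description="Bitmask of input files that contain the site">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    for line in sites_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank or comment lines
        chrom, pos, ref, alt, mask = line.split("\t")[:5]
        # No sample columns: genotype, depth, etc. are not in sites.txt.
        out.append("\t".join([chrom, pos, ".", ref, alt, ".", ".", f"FILES={mask}"]))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = ["chr1\t100\tA\tG\t11\n", "chr2\t5\tGTT\tG\t01\n"]
    print(sites_to_vcf(example), end="")
```

The sketch only shows why sample-level fields cannot survive the conversion: they are simply not present in the input. For real data, use all2vcf isec itself.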

all2vcf mummer

This utility processes the output of nucmer | show-snps -T. show-snps is run on the delta file produced by nucmer (part of the MUMmer package). By running show-snps -T *.delta one can obtain the SNPs from the delta mapping file. However, the result is a tab-separated file that follows no standard and requires manual handling to extract information from it. With all2vcf mummer you can convert this format directly to a standard VCF file and then analyze it with existing VCF tools. As with the isec utility, you will lose all sample-specific information, but you can always look it up in the original files.
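As a rough illustration of what such a conversion involves (again, not all2vcf's own code), the Python sketch below parses show-snps -T style rows. It assumes the first three columns are reference position, reference base and query base, and that the last two columns are the reference and query sequence IDs; check this against your own file before relying on it. The function name `show_snps_to_vcf` is hypothetical, and rows with a `.` in either base column (indels) are skipped, because VCF needs an anchoring base that the row alone does not provide.

```python
def show_snps_to_vcf(lines):
    """Convert show-snps -T style rows into minimal VCF text (SNPs only).

    Assumed layout: P1, REF base, QUERY base, ..., REF id, QUERY id.
    """
    records = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Data rows start with an integer position; header lines do not.
        if len(fields) < 6 or not fields[0].strip().isdigit():
            continue
        pos, ref, alt = int(fields[0]), fields[1], fields[2]
        if ref == "." or alt == ".":
            continue  # indel: would need an anchor base from the reference
        chrom = fields[-2]  # second-to-last column: reference sequence ID
        records.append((chrom, pos, ref, alt))

    records.sort()  # order by sequence ID, then position
    out = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
    for chrom, pos, ref, alt in records:
        out.append("\t".join([chrom, str(pos), ".", ref, alt, ".", ".", "."]))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = [
        "/tmp/ref.fa /tmp/qry.fa\n",
        "NUCMER\n",
        "\n",
        "[P1]\t[SUB]\t[SUB]\t[P2]\t[BUFF]\t[DIST]\t[FRM]\t[TAGS]\n",
        "150\tA\tG\t148\t20\t148\t1\t1\tchr1\tctg7\n",
        "300\t.\tT\t297\t5\t297\t1\t1\tchr1\tctg7\n",
    ]
    print(show_snps_to_vcf(example), end="")
```

Skipping indels keeps the sketch short. A full converter has to read the reference FASTA to recover the anchoring base, which is one reason to use a dedicated tool rather than a one-off script.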

You can clone the tool from its GitHub repository: https://github.com/MatteoSchiavinato/all2vcf

I hope it will be helpful for many people! Since it's relatively new, if you find issues with it please report them so I can improve it :)
