I have called variants using a pipeline consisting of samtools mpileup, bcftools call and bcftools filter to obtain a VCF file containing SNPs and short INDELS.
I would like to annotate the SNPs and INDELs in my VCF file to predict their functional effects. From my understanding, most annotation programs require that the chromosome (contig) names in the VCF file, i.e. the values in the CHROM column, match those used in the annotation file or database.
I'm working with SNPs and INDELs called from a de novo transcriptome assembled by Trinity, so the records in my VCF file look like this:
TRINITY_DN165715_c0_g1_i1 349 . A G 91 PASS DP=11;VDB=0.746774;SGB=-0.676189;MQSB=0.0297172;MQ0F=0;AC=2;AN=2;DP4=0,0,6,5;MQ=17 GT:PL 1/1:121,33,0
Is there a script that I can use to rename the contig names in my VCF file (the CHROM column and any matching header lines) to a more generic format accepted by variant annotation programs?
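To illustrate the kind of script I have in mind, here is a minimal Python sketch. It assumes a tab-delimited VCF and a lookup table from Trinity IDs to new names. The function name `rename_contigs`, the target name `contig1`, the sample column `sample1` and the shortened INFO field are all made up for the example, and I don't know yet what naming scheme an annotation tool would actually expect.

```python
import re

# Matches the ID inside a "##contig=<ID=...>" header line.
CONTIG_HEADER = re.compile(r"^(##contig=<ID=)([^,>]+)")

def rename_contigs(lines, mapping):
    """Yield VCF lines with contig names replaced according to `mapping`.

    Names missing from `mapping` are left unchanged.
    """
    for line in lines:
        if line.startswith("##contig=<ID="):
            # Header line: swap the ID, keep the rest (length=..., etc.).
            yield CONTIG_HEADER.sub(
                lambda m: m.group(1) + mapping.get(m.group(2), m.group(2)),
                line,
            )
        elif line.startswith("#"):
            # Other header lines pass through untouched.
            yield line
        else:
            # Data line: CHROM is the first tab-separated field.
            chrom, sep, rest = line.partition("\t")
            yield mapping.get(chrom, chrom) + sep + rest

# Hypothetical input and mapping, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "##contig=<ID=TRINITY_DN165715_c0_g1_i1>\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n",
    "TRINITY_DN165715_c0_g1_i1\t349\t.\tA\tG\t91\tPASS\tDP=11\tGT:PL\t1/1:121,33,0\n",
]
mapping = {"TRINITY_DN165715_c0_g1_i1": "contig1"}

renamed = list(rename_contigs(example, mapping))
print("".join(renamed), end="")
```

On a real file I would read the lines from the VCF and write the renamed ones to a new file, but I'm unsure whether renaming alone is enough or whether the annotation database itself has to be built from the Trinity assembly.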
Any info would be greatly appreciated.