I'm currently updating my variant calling pipeline by switching the VCF annotation software from Annovar to VEP. I have a variety of reasons for this, not least how easy it is in VEP to annotate with HGVS notation and keep datasets up to date.
For the most part everything is running smoothly, except that some of the data in the VCFs is lost during annotation (and conversion to TSV). The VCFs are created with GATK's UnifiedGenotyper and include a 'FORMAT' column, where each value is 'GT:AD:DP:GQ:PL', and a column named after the individual, which contains colon-separated data corresponding to the FORMAT column (i.e. Genotype:Allele Depth:Depth:Genotype Quality:Phred-scaled Likelihoods). When I annotate with VEP, none of this data is carried over to the output file as it would be in Annovar. This leaves me with an annotated file that has no information on read depth, genotype or any of the other data in the two lost columns.
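To make the two columns concrete, here is a minimal Python sketch of how a FORMAT value and the matching sample value pair up. In VCF the sample fields are colon-separated and follow the order of the FORMAT keys. The function name `parse_sample` and the example values are made up for illustration and are not taken from my data.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with the values from one sample column.

    Both fields are colon-separated. zip() stops at the shorter list,
    so a sample that omits trailing fields is still handled.
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    return dict(zip(keys, values))


# Illustrative values only (hypothetical heterozygous call).
example = parse_sample("GT:AD:DP:GQ:PL", "0/1:12,9:21:99:255,0,301")
print(example["GT"], example["DP"])
```

These are the per-sample values (genotype, depths, qualities) that are missing from the VEP output.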
I've included the command I'm currently using for annotation:
./vep -i RM0108.vcf --cache --force_overwrite --tab --merged --variant_class --sift b --polyphen b --hgvs --symbol --canonical --check_existing --af_1kg --af_gnomad --humdiv --pick -o RM0108.tsv
I can't find information on this in the VEP documentation or elsewhere online. I could write something to take the relevant information from the VCF and add it to the tsv after VEP has finished running, but it seems like there may be an easier solution that I'm missing, so any help would be appreciated.
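For reference, the post-processing workaround I have in mind would look roughly like the Python sketch below. It rests on an assumption I have not verified for every case: that the first column of the VEP tab output (Uploaded_variation) matches the ID column of the input VCF. That would require every record in the VCF to have a unique ID rather than '.'. The function names and the '-' placeholder for missing values are my own choices for this sketch.

```python
def sample_fields_by_id(vcf_lines):
    """Map each VCF ID to a dict of FORMAT key -> first-sample value."""
    by_id = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the header line and blank lines
        cols = line.rstrip("\n").split("\t")
        # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE...
        by_id[cols[2]] = dict(zip(cols[8].split(":"), cols[9].split(":")))
    return by_id


def add_sample_columns(vep_lines, by_id, keys=("GT", "AD", "DP", "GQ", "PL")):
    """Append one column per FORMAT key to each line of VEP tab output.

    ASSUMPTION: the first tab-separated column of each data line is an
    identifier that also appears in the VCF ID column.
    """
    out = []
    for line in vep_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            out.append(line)  # meta lines pass through unchanged
        elif line.startswith("#"):
            out.append(line + "\t" + "\t".join(keys))  # extend column header
        else:
            variant_id = line.split("\t", 1)[0]
            sample = by_id.get(variant_id, {})
            extra = "\t".join(sample.get(k, "-") for k in keys)
            out.append(line + "\t" + extra)
    return out
```

With the files from my command this would be called with the lines of RM0108.vcf and RM0108.tsv (e.g. `add_sample_columns(open("RM0108.tsv"), sample_fields_by_id(open("RM0108.vcf")))`), and the result written to a new file. It only reads the first sample column, so a multi-sample VCF would need more work.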
I've also posted this question on Bioinformatics Stack Exchange.