I annotated a VCF file using VEP and noticed that it reports several variant IDs for each input variant. For example, this is an excerpt of one of the variant lines (I removed the annotation info that is not relevant to the question):
12 25398284 . C A . . rs121913529&COSV55497369&COSV55497419&COSV55497479
As you can see, three different COSMIC IDs are reported for this variant, although only one of them (COSV55497419) corresponds to the alternate allele that is found (C>A). The other IDs refer to other alternate alleles that can also be found at that position.
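To make the structure of that field explicit, here is a minimal Python sketch of how the `&`-joined IDs can be split and grouped by source. The function name `group_ids` and the prefix rules (`rs` for dbSNP, `COSV` for COSMIC) are my own assumptions for illustration, not something VEP provides.

```python
from collections import defaultdict


def group_ids(existing_variation: str) -> dict:
    """Split an '&'-joined ID string and group the IDs by source prefix.

    The prefix-to-source mapping below is an assumption made for this example.
    """
    groups = defaultdict(list)
    for variant_id in existing_variation.split("&"):
        if not variant_id:
            continue  # skip empty entries, e.g. from an empty field
        if variant_id.startswith("rs"):
            groups["dbSNP"].append(variant_id)
        elif variant_id.startswith("COSV"):
            groups["COSMIC"].append(variant_id)
        else:
            groups["other"].append(variant_id)
    return dict(groups)


# ID string copied from the variant line above
ids = "rs121913529&COSV55497369&COSV55497419&COSV55497479"
grouped = group_ids(ids)
print(grouped)
```

This only separates the IDs by source. It cannot tell which COSMIC ID belongs to which alternate allele, which is exactly the part I am missing.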
After reading VEP's documentation I know this is the expected behavior, but I am confused by the following lines. I am not sure I understand what they mean by "variants with unknown alleles":
For some data sources (COSMIC, HGMD), Ensembl is not licensed to redistribute allele-specific data, so VEP will report the existence of co-located variants with unknown alleles without carrying out allele matching. To disable this behaviour and exclude these variants, use the --exclude_null_alleles flag.
Just in case, I repeated the annotation using the --exclude_null_alleles flag, but the output for the IDs is now blank for COSMIC; only the dbSNP ID is reported.
So basically I would like to get only the specific COSMIC ID for my variant. Does anyone know how I can run the annotation with VEP so that it reports only the COSMIC ID of the alternate allele that is present in my VCF?
Thanks a lot for reading!!!