I am trying to annotate both coding and non-coding variants with information from the COSMIC database using ANNOVAR. ANNOVAR does not directly support the latest COSMIC release because of licensing issues. Instead, users are directed to build their own ANNOVAR database for COSMIC following these guidelines: http://annovar.openbioinformatics.org/en/latest/user-guide/filter/#cosmic-annotations
I was able to build the database for coding variants by following the guidelines, but not the one for non-coding variants. The wording in the manual seems to suggest that this is not possible for non-coding variants:
COSMIC changed their data formats so non-coding mutations are no longer in the MutantExport file, so we can no longer calculate their occurrences in various tumors. COSMIC now provides a CosmicNCV.tsv file, but it is not really that informative as the cancer tissue information is missing from this file.
Is there a way to annotate non-coding COSMIC variants using ANNOVAR?
My failed attempt:
~/utils/annovar/prepare_annovar_user.pl --buildver hg19 -dbtype cosmic <(zcat CosmicNCV.tsv.gz) -vcf <(zcat CosmicNonCodingVariants.vcf.gz) > hg19_cosmicNonCoding80.txt 2> hg19_cosmicNonCoding80.log
Error: COSMIC MutantExport format error: column 17 should be 'Mutation ID'
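To see which columns the script is actually receiving, it may help to list the header of the file with column numbers and compare column 17 against the 'Mutation ID' name the error asks for. This is only a minimal sketch: it assumes CosmicNCV.tsv.gz is tab-delimited with a single header line, and number_columns is a hypothetical helper name, not part of ANNOVAR.

```shell
# Hypothetical helper (not part of ANNOVAR): reads a tab-delimited table on
# stdin and prints each header field as "<column number><TAB><column name>".
number_columns() {
    awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) printf "%d\t%s\n", i, $i; exit }'
}

# Usage, assuming the file is in the current directory:
#   zcat CosmicNCV.tsv.gz | number_columns
```

If column 17 is not 'Mutation ID' in this listing, the error simply reflects that the NCV file does not have the MutantExport layout the script checks for.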
EDIT: Cross-posted on the ANNOVAR discussion board. I will update this post if there is any lead. http://annovar.openbioinformatics.org/en/latest/user-guide/filter