I have been trying to work out, from the ANNOVAR documentation and other sites, the steps needed to make these files from NCBI available to ANNOVAR. I admit to being new to bioinformatics, but I have been a software developer for 30+ years. My goal is to annotate my hg38-based VCF file, but I do not even have a lead on what to do with the downloaded files. From what I can tell, convert2annovar doesn't support these.
I downloaded them because all of the files on the ANNOVAR download site are almost a year old, and I am trying to help determine the gene(s) behind a genetic condition for which the genes are not yet known, so the latest and greatest information seemed appropriate. The files are:
- GCF_000001405.40_GRCh38.p14_genomic.fna
- genomic.gff
- protein.faa
- genomic.gtf
- rna.fna
- cds_from_genomic.fna
Thanks for any help. References to other resources would be most welcome. I am happy to learn, but have come up empty using web searches.
The documentation is clear on this: convert2annovar.pl is used to convert input (genotype) files to ANNOVAR format. It is not used on dictionary files or annotation sources.
Do you absolutely need to use ANNOVAR? For best results, I recommend using VEP.
OK, I guess I got confused because there is support for gff3-solid, for example, but not for GFF files directly. I will have a look at VEP, but the issue may simply be that I am not familiar enough with the files, their purposes, and their formats. Thank you.
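On the question of what the files are: genomic.gtf and genomic.gff are annotation sources, not genotype input. A GTF file is tab-separated text with nine columns, the last of which holds key/value attributes such as gene_id. The following is a minimal Python sketch for peeking inside one, assuming the standard nine-column layout. The sample line, the function names, and the values `chrDemo` and `GENE1` are made up for illustration and are not taken from the NCBI download.

```python
import io

# Standard GTF column names, in file order.
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]


def parse_gtf_attributes(text):
    """Turn 'gene_id "X"; transcript_id "Y";' into a dict.

    Simplification: a repeated key keeps only its last value, and a
    semicolon inside a quoted value is not handled.
    """
    attrs = {}
    for part in text.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def parse_gtf(handle):
    """Yield one dict per feature line, skipping comments and blank lines."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GTF feature line
        record = dict(zip(GTF_COLUMNS, fields))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        record["attributes"] = parse_gtf_attributes(record["attributes"])
        yield record


# Made-up sample content; replace with open("genomic.gtf") for a real file.
SAMPLE = (
    "#!made-up header line\n"
    'chrDemo\tExampleSource\texon\t100\t250\t.\t+\t.\t'
    'gene_id "GENE1"; transcript_id "TX1";\n'
)

records = list(parse_gtf(io.StringIO(SAMPLE)))
for rec in records:
    print(rec["seqname"], rec["feature"], rec["start"], rec["end"],
          rec["attributes"].get("gene_id"))
```

Each record describes where a feature (gene, transcript, exon, CDS) sits on a sequence, which is the kind of information an annotator looks up for each variant in a VCF file. It is not something to be converted as if it were variant input.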
You may also find tools such as vcfanno and bcftools easier to work with, if you don't want to spend too much time reformatting dictionary files.