3.6 years ago
spiral01 ▴ 100

I am trying to download the Drosophila database to annotate variants with ANNOVAR. I followed the instructions found here: http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ and ran the following commands:

annotate_variation.pl -downdb -buildver dm6 gene drosdb

annotate_variation.pl --buildver dm6 --downdb seq drosdb/dm6_seq

retrieve_seq_from_fasta.pl drosdb/dm6_refGene.txt -seqdir drosdb/dm6_seq -format refGene -outfile drosdb/dm6_refGeneMrna.fa


However, the second command fails to download some of the annotation databases. Here is the output:

NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done


And consequently the third command gives me these errors for each position:

WARNING: Cannot identify sequence for NR_124579 (starting from chr2R:16812244)


This results in a FASTA file with 0 genomic regions.

I tried going directly to the FTP site to manually download the files, but as their path is different, I don't know how to address this problem. Many thanks.

3.6 years ago

I think that this is an issue with ANNOVAR. The files that it tries to download do not exist (except for dm6.fa.gz). Take a look here: ftp://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/

I would contact Kai Wang to explain this issue. He may have to fix it in the next release of ANNOVAR. http://annovar.openbioinformatics.org/en/latest/ http://wglab.org/extra/contact
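
In the meantime, a possible workaround is to fetch dm6.fa.gz by hand and point retrieve_seq_from_fasta.pl at that single FASTA file instead of a sequence directory. The sketch below is untested against the current ANNOVAR release and makes a few assumptions: that wget and gunzip are installed, that your copy of retrieve_seq_from_fasta.pl accepts the -seqfile option (check its usage message first), and that the file still lives at the UCSC path above. The variable and function names are only illustrative.

```shell
# Hypothetical helper names; adjust paths to match your own setup.
build="dm6"
db_dir="drosdb"
seq_dir="${db_dir}/${build}_seq"
# Path taken from the UCSC listing linked above (assumed still valid).
genome_url="ftp://hgdownload.cse.ucsc.edu/goldenPath/${build}/bigZips/${build}.fa.gz"

# Download the whole-genome FASTA and decompress it in place.
fetch_genome() {
    mkdir -p "${seq_dir}" &&
    wget -O "${seq_dir}/${build}.fa.gz" "${genome_url}" &&
    gunzip -f "${seq_dir}/${build}.fa.gz"
}

# Build the mRNA FASTA from the single genome file.
# Assumption: -seqfile is supported by your ANNOVAR version.
build_mrna_fasta() {
    retrieve_seq_from_fasta.pl "${db_dir}/${build}_refGene.txt" \
        -seqfile "${seq_dir}/${build}.fa" \
        -format refGene \
        -outfile "${db_dir}/${build}_refGeneMrna.fa"
}
```

After running the first command from the question (the gene download), you would call `fetch_genome && build_mrna_fasta`. If the warnings persist, compare the chromosome names in dm6_refGene.txt with the headers in dm6.fa.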

An update to this for anyone else with this problem:

Kevin