Hi everybody,
I need to download sequences (FASTA) together with their annotation data (GFF3) from NCBI, based on their accession numbers. I have used Entrez efetch for that job: I retrieved the data as ASN.1 and converted it to FASTA and GFF3 with asn2fasta and annotwriter from the NCBI C++ Toolkit. However, for some RefSeq records the raw sequence is not part of the ASN.1 record, so asn2fasta has to download it from an NCBI web service, and that takes ages compared to a plain efetch.
For example, efetch takes 1.3 seconds to download the FASTA sequences for these two RefSeq accessions: "NW_003726435.1, NW_003729148.1". asn2fasta, with the ASN.1 records already saved in a file, takes about 40 seconds (about 37 seconds for one of the sequences).
Does anybody have any idea why asn2fasta is so slow, and/or how to make it run faster?
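For reference, a minimal sketch of the kind of plain efetch request being compared here, written against the public E-utilities HTTP endpoint. The database name `nuccore`, the helper names `build_efetch_url` and `fetch_fasta`, and the use of Python are assumptions for illustration, not the exact command used above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accessions, rettype="fasta", db="nuccore"):
    """Build an efetch URL for a list of accession numbers.

    db="nuccore" is an assumption; it is the nucleotide database that
    RefSeq accessions such as NW_003726435.1 are normally served from.
    """
    params = {
        "db": db,
        "id": ",".join(accessions),  # several IDs go in one comma-separated list
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accessions):
    """Download FASTA text for the given accessions (needs network access)."""
    with urlopen(build_efetch_url(accessions)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    accessions = ["NW_003726435.1", "NW_003729148.1"]
    print(build_efetch_url(accessions))
    # Uncomment to actually download the sequences:
    # print(fetch_fasta(accessions)[:200])
```

Requesting FASTA this way skips the ASN.1 to FASTA conversion for the sequence part entirely; it does not cover the GFF3 annotation, which would still need annotwriter.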
Best regards
This really is a question for the NCBI help desk. It may take 2-3 business days to get an answer from them, so be patient. Please come back and post the official response here when you get one.