Question: Retrieve multiple RefSeq genomes in separate FASTA files
8 months ago, Biogeek wrote:

I've searched and can't seem to find any related questions.

My task is as follows:

  1. I have a list of RefSeq accessions in a .txt file.
  2. I want to download all the associated genomes to separate .fasta files in a local directory.

I note that I can use Entrez or the NCBI assembly downloader, but this puts all genomes into one .fasta file, which isn't ideal.

Can anyone help?

Thanks.

8 months ago, GenoMax (United States) wrote:

"I note that I can use Entrez or the NCBI assembly downloader, but this puts all genomes into one .fasta file, which isn't ideal."

How about using a loop that makes a separate call to one of those programs for each accession? That should give you separate files.
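
A minimal sketch of such a loop, written in Python around the EDirect `efetch` command. It assumes the accessions are nucleotide records (for example `NC_...`) that can be fetched from the `nuccore` database, one per line in the list file; assembly accessions (`GCF_...`) would need a different route. The function names, the `.fasta` naming scheme, and the injectable `fetch` argument are illustrative choices, not part of EDirect.

```python
import subprocess
from pathlib import Path


def read_accessions(list_path):
    """Return the accessions in the list file, skipping blanks and '#' lines."""
    lines = Path(list_path).read_text().splitlines()
    stripped = (ln.strip() for ln in lines)
    return [ln for ln in stripped if ln and not ln.startswith("#")]


def run_efetch(accession):
    """Call the EDirect `efetch` CLI for one accession and return FASTA text.

    Assumes `efetch` is on PATH and the accession lives in `nuccore`.
    """
    result = subprocess.run(
        ["efetch", "-db", "nuccore", "-id", accession, "-format", "fasta"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def fetch_to_separate_files(list_path, out_dir, fetch=run_efetch):
    """Write one <accession>.fasta per accession and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for acc in read_accessions(list_path):
        fasta = fetch(acc)
        if not fasta.startswith(">"):
            # Empty or unexpected reply: skip rather than write a bad file.
            continue
        path = out_dir / f"{acc}.fasta"
        path.write_text(fasta)
        written.append(path)
    return written
```

Calling `fetch_to_separate_files("list.txt", "genomes")` would then leave one file per accession in `genomes/`. For long lists it is probably worth pausing between calls so as not to exceed NCBI's request rate limits.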


Good call, GenoMax, thanks! I've now installed the Entrez utilities and can obtain my record with efetch. I'll write a loop over my 'list.txt' file, which contains the accession numbers.

One more question, and apologies for my ignorance: is there a way to obtain my .fasta files with the TaxID in the headers as well?

Thanks!

(Reply written 8 months ago by Biogeek)

If you use Entrez Direct, then use the epost method instead of a loop. It will do the same thing. You will need to post-process the files to add the TaxID to the headers. I don't think there is a way to do this automatically.
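
A rough sketch of that post-processing step in Python. It assumes the batch download arrives as one combined FASTA text, that the accession is the first token of each header, and that the TaxID for each record has already been looked up by some other means. The helper names and the ` taxid=<id>` suffix are hypothetical conventions chosen for illustration, not an NCBI format.

```python
def split_fasta(text):
    """Split multi-record FASTA text into {accession: record_text}."""
    records = {}
    current = None
    for line in text.splitlines():
        if line.startswith(">"):
            # Assumption: the accession is the first token of the header.
            tokens = line[1:].split()
            current = tokens[0] if tokens else "unnamed"
            records[current] = []
        if current is not None:
            records[current].append(line)
    return {acc: "\n".join(lines) + "\n" for acc, lines in records.items()}


def add_taxid_to_headers(fasta_text, taxid):
    """Append ' taxid=<taxid>' to every header line of a FASTA string."""
    lines = fasta_text.splitlines()
    if not lines:
        return ""
    tagged = [
        f"{line} taxid={taxid}" if line.startswith(">") else line
        for line in lines
    ]
    return "\n".join(tagged) + "\n"
```

Each value returned by `split_fasta` can be written to its own `<accession>.fasta` file, with `add_taxid_to_headers` applied first wherever a TaxID is known for that accession.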

(Reply written 8 months ago by GenoMax)