Question: Get all completely sequenced genomes from one genus
bird77 wrote, 7 months ago:

Is there an automatic way to get the FASTA sequences of all sequenced (preferably completely sequenced) genomes within a taxonomic group?

And how can I get the taxid for all of these organisms as well?

Thank you.

Tags: genome

For Ensembl, there is no dedicated API for this that I know of. If you are specifically interested in bacteria from Ensembl Genomes, here is a hackish script you can adapt.

(comment written 7 months ago by kloetzl)
tdmurphy wrote, 6 months ago:

This is easily accomplished from NCBI's Assembly resource:

https://www.ncbi.nlm.nih.gov/assembly/?term=bacteria%5Borgn%5D+latest_refseq%5Bfilter%5D+complete_genome%5Bfilter%5D

The example query covers all bacteria; to restrict it to one genus, put the genus name in front of [orgn] instead. You can download FASTA, annotation, or other files using the big blue "Download Assemblies" button.

Note that "complete genome" is a useful filter for bacteria, but only a handful of eukaryote assemblies (mostly fungi) are sequenced to completion. If you're interested in eukaryotes, you may want either to focus on assemblies at the "chromosome" level (to exclude WGS assemblies that are just bags of scaffolds) or to use the "exclude partial" filter, which drops the small number of assemblies that cover only a subset of the genome (e.g. just one chromosome).
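For the "automatic" part of the question, and for the taxids, the same Assembly query can probably be run through NCBI's E-utilities instead of the web page. The Python below is only a rough sketch, not an official NCBI recipe. It assumes the esummary JSON for the assembly database exposes the fields `assemblyaccession`, `taxid`, `organism`, `ftppath_refseq` and `ftppath_genbank`, and that the genomic FASTA sits in the assembly directory as `<directory name>_genomic.fna.gz`. The function names and the genus `Escherichia` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_term(taxon):
    """Same filters as the Assembly link above, restricted to one taxon."""
    return f"{taxon}[orgn] AND latest_refseq[filter] AND complete_genome[filter]"


def eutils_url(tool, **params):
    """Build an E-utilities URL for tool 'esearch' or 'esummary'."""
    params.setdefault("db", "assembly")
    params.setdefault("retmode", "json")
    return f"{EUTILS}/{tool}.fcgi?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Download a URL and decode the JSON reply."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def parse_summaries(esummary):
    """Pull accession, taxid, organism and FASTA URL out of an esummary reply.

    The field names used here are assumptions about the JSON layout.
    """
    result = esummary.get("result", {})
    records = []
    for uid in result.get("uids", []):
        doc = result[uid]
        path = doc.get("ftppath_refseq") or doc.get("ftppath_genbank") or ""
        # The FTP paths are also served over HTTPS on the same host.
        path = path.replace("ftp://", "https://", 1)
        fasta = f"{path}/{path.rsplit('/', 1)[-1]}_genomic.fna.gz" if path else ""
        records.append({
            "accession": doc.get("assemblyaccession", ""),
            "taxid": doc.get("taxid", ""),
            "organism": doc.get("organism", ""),
            "fasta_url": fasta,
        })
    return records


def list_assemblies(taxon, retmax=200):
    """Search the Assembly database, then summarise every hit (up to retmax)."""
    search = fetch_json(eutils_url("esearch", term=build_term(taxon), retmax=retmax))
    ids = search["esearchresult"]["idlist"]
    if not ids:
        return []
    return parse_summaries(fetch_json(eutils_url("esummary", id=",".join(ids))))


if __name__ == "__main__":
    # Hypothetical example genus; replace with the one you need.
    for rec in list_assemblies("Escherichia"):
        print(rec["accession"], rec["taxid"], rec["organism"], rec["fasta_url"], sep="\t")
```

Each printed line gives one assembly with its taxid and a URL that should point at the gzipped genomic FASTA, which you can then fetch with any downloader. For large result sets you would need to page through the search results and send the summary requests in smaller batches; the sketch does neither.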
