Question: Download fungal genomes
4.3 years ago, d.pinili wrote:

Hi. I need to download all available fungal genomes for a community analysis using Kraken (a sequence classification tool). Kraken does not provide a ready-made fungal database, so I have to download the genomes myself. For a custom database, the program needs genome sequences in FASTA files whose headers contain a GI number. I first looked at NCBI, but the fungal genomes on the FTP site (the refseq and genbank folders under genomes/) do not contain GI numbers. I have also tried websites other than NCBI, but to no avail. Can someone help me? Thank you very much!

fungal genomes fungi ncbi • 2.9k views

NCBI is phasing out sequence GIs - use Accession.Version instead!

written 4.3 years ago by Tanvir Ahamed

See for accession - gi mapping info.

You can parse the accessions from the downloaded data and use e.g. sed to replace them with GIs.
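Since the rest of this thread uses R, here is a minimal sketch of the same idea in R rather than sed. It assumes you have already built an accession-to-GI lookup from the mapping info; the function name add_gi_to_headers, the lookup vector acc2gi, and the accession and GI values are all made up for illustration. It also assumes the accession is the first token of each FASTA header and writes the old-style `gi|<gi>|ref|<accession>|` form, so adjust the `ref` tag if your records are from GenBank rather than RefSeq.

```r
# Rewrite FASTA headers so that each one carries a GI number.
# `fasta_lines`: character vector, one element per line of a FASTA file.
# `acc2gi`: named character vector (hypothetical lookup you build yourself);
#           names are Accession.Version strings, values are GI numbers.
add_gi_to_headers <- function(fasta_lines, acc2gi) {
  is_header <- startsWith(fasta_lines, ">")
  headers <- fasta_lines[is_header]

  # Assumption: the accession is the first whitespace-delimited token after ">"
  acc <- sub("^>(\\S+).*$", "\\1", headers, perl = TRUE)
  # Everything after that first token (may be an empty string)
  rest <- sub("^>\\S+\\s*", "", headers, perl = TRUE)

  # Look up each accession; accessions missing from the lookup give NA
  gi <- unname(acc2gi[acc])
  found <- !is.na(gi)

  # Headers without a mapping are left untouched
  headers[found] <- trimws(
    paste0(">gi|", gi[found], "|ref|", acc[found], "| ", rest[found]),
    which = "right"
  )

  fasta_lines[is_header] <- headers
  fasta_lines
}

# Toy input with made-up accessions and a made-up GI
fasta <- c(
  ">NC_000000.1 Example fungus chromosome I",
  "ACGTACGT",
  ">XX_999999.1 record without a mapping",
  "TTGGCCAA"
)
acc2gi <- c("NC_000000.1" = "123456789")

out <- add_gi_to_headers(fasta, acc2gi)
print(out)
```

For real files you would read the sequences with readLines(), pass them through the function, and write the result back with writeLines(). Sequence lines are never modified, only lines starting with ">".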

modified 4.3 years ago • written 4.3 years ago by 5heikki
3.4 years ago, Hajk-Georg Drost wrote:

I know this question is quite old, but I have since implemented an R package named biomartr that can perform bulk retrieval of genomes, proteomes, CDS, GFF files, etc. Since the actual question is "Download fungal genomes", I will provide some biomartr-based examples as a reference for people who search in the future for a way to bulk download all fungal genomes from NCBI RefSeq or GenBank.

To download all fungi genomes from NCBI RefSeq, one can simply type:

# download all fungi genomes from NCBI RefSeq
biomartr::meta.retrieval(kingdom = "fungi", db = "refseq", type = "genome")

Alternatively, genomes from NCBI Genbank can be retrieved by typing:

# download all fungi genomes from NCBI Genbank
biomartr::meta.retrieval(kingdom = "fungi", db = "genbank", type = "genome")

However, you are not limited to genomes. You can also download proteomes (type = "proteome"), coding sequences (type = "CDS"), and annotation files (type = "gff").

If you wish to download only specific subgroups of fungal genomes, you can use the getGroups() function to list the available subgroups:

# retrieve available subgroups for the fungi kingdom
biomartr::getGroups(db = "refseq", kingdom = "fungi")

"Ascomycetes" "Basidiomycetes" "Other Fungi"

We can now choose the group "Ascomycetes" and download the genomes of all fungi species that correspond to that group by typing:

# download all fungi genomes from NCBI RefSeq that belong to the subgroup Ascomycetes
biomartr::meta.retrieval(kingdom = "fungi", group = "Ascomycetes", db = "refseq", type = "genome")

For more information please consult the Metagenome Retrieval Vignette. I hope it helps.
