6.2 years ago
d.pinili • 0

Hi. I need to download all available fungal genomes for a community analysis using Kraken (a sequence classification tool). Kraken does not provide a helper for building a fungal database, so I have to download the genomes myself. For a custom database, the program needs the genome sequences as FASTA files, and each header must contain a GI number. I first looked at NCBI, but the fungal genomes on the FTP site (the refseq and genbank folders under genomes/) do not contain GI numbers. I have also tried websites other than NCBI, but to no avail. Can someone help me? Thank you very much!

fungi fungal genomes ncbi • 4.3k views

See ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/README for accession - gi mapping info.

You can parse the accessions from the headers of the downloaded FASTA files and use, e.g., sed to replace them with the corresponding GI numbers.
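As a rough sketch of that accession-to-GI replacement (not a tested pipeline), the function below reads an uncompressed accession2taxid table and rewrites each FASTA header. It uses awk rather than sed, because awk can hold the lookup table in memory. Several things here are assumptions: the table is tab-separated with a header row and the columns accession, accession.version, taxid, gi, as described in the README; each FASTA header starts with the accession.version; and the output layout `gi|<gi>|<accession.version>` should be checked against what your Kraken version expects. The function name `add_gi_to_headers` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite ">ACC.VER description" headers as
# ">gi|GI|ACC.VER description" using an accession2taxid table.
add_gi_to_headers() {
    map_file=$1    # uncompressed accession2taxid table (tab-separated)
    fasta_file=$2  # FASTA file whose headers start with accession.version
    awk -F '\t' '
        # First file: build accession.version -> gi lookup, skipping the
        # header row and rows without a usable numeric GI.
        NR == FNR {
            if (FNR > 1 && $4 ~ /^[1-9][0-9]*$/) gi[$2] = $4
            next
        }
        # Second file: rewrite header lines, pass sequence lines through.
        /^>/ {
            header = substr($0, 2)
            acc = header
            sub(/[ \t].*/, "", acc)              # keep only the first token
            rest = substr(header, length(acc) + 1)
            if (acc in gi) {
                print ">gi|" gi[acc] "|" acc rest
            } else {
                print                            # leave unmapped headers as-is
                print "no gi found for " acc > "/dev/stderr"
            }
            next
        }
        { print }
    ' "$map_file" "$fasta_file"
}
```

It could be called as `add_gi_to_headers nucl_gb.accession2taxid fungi.fna > fungi.gi.fna`, where both file names are placeholders. The full mapping tables are very large, so it is probably worth filtering the table down to the accessions that actually occur in your FASTA files before loading it this way.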

5.3 years ago

I know this question is quite old, but I have since implemented an R package named biomartr that can perform bulk retrieval of genomes, proteomes, CDS, GFF files, etc. Since the actual question is "Download fungal genomes", I will provide some biomartr-based examples as a reference for people who search in the future for a way to bulk download all fungal genomes from NCBI RefSeq or GenBank.

To download all fungal genomes from NCBI RefSeq, one can simply type:

# download all fungi genomes from NCBI RefSeq
biomartr::meta.retrieval(kingdom = "fungi", db = "refseq", type = "genome")


Alternatively, genomes from NCBI GenBank can be retrieved by typing:

# download all fungi genomes from NCBI Genbank
biomartr::meta.retrieval(kingdom = "fungi", db = "genbank", type = "genome")


However, you are not limited to genomes. You can also download proteomes (type = "proteome"), coding sequences (type = "CDS"), and annotation files (type = "gff").

If you wish to download only specific subgroups of fungal genomes, you can use the getGroups() function to list the available subgroups:

# retrieve available subgroups for the fungi kingdom
biomartr::getGroups(db = "refseq", kingdom = "fungi")


"Ascomycetes" "Basidiomycetes" "Other Fungi"

We can now choose the group "Ascomycetes" and download the genomes of all fungal species that belong to that group by typing:

# download all fungi genomes from NCBI RefSeq that belong to the subgroup Ascomycetes
biomartr::meta.retrieval(kingdom = "fungi", group = "Ascomycetes", db = "refseq", type = "genome")