Hi. I need to download all available fungal genomes for my community analysis using kraken (a sequence classifier tool). It doesn't offer any assistance for acquiring a fungal database, so I have to download the genomes myself. For a custom database, the program needs genome sequences in FASTA files, and each header should contain a gi number. I looked at NCBI first, but the fungal genomes on the FTP site (the refseq and genbank folders in genomes/) do not contain gi numbers. I have also tried websites other than NCBI, but to no avail. Can someone help me? Thank you very much!
I know that this question is already quite old, but I have now implemented a new package named biomartr that can perform bulk retrieval of genomes, proteomes, CDS, GFF files, etc. Since the actual question is "Download fungal genomes", I will provide some biomartr-based examples as a reference for people who, in the future, search for a way to bulk download all fungal genomes from NCBI RefSeq or Genbank.
To download all fungi genomes from NCBI RefSeq, one can simply type:
# download all fungi genomes from NCBI RefSeq
biomartr::meta.retrieval(kingdom = "fungi", db = "refseq", type = "genome")
Alternatively, genomes from NCBI Genbank can be retrieved by typing:
# download all fungi genomes from NCBI Genbank
biomartr::meta.retrieval(kingdom = "fungi", db = "genbank", type = "genome")
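Because the original question needs a gi number in every FASTA header, it may be worth checking the downloaded files before building the kraken database. The following is only a minimal base R sketch: the function name count_gi_headers is made up for illustration, the header layout ">gi|<number>|..." is an assumption about what the classifier expects, and the example records are toy data, not real accessions.

```r
# Count FASTA header lines and how many of them start with a gi field.
# Assumed header layout: ">gi|<digits>|..." (an assumption, adjust as needed).
count_gi_headers <- function(lines) {
  headers <- lines[startsWith(lines, ">")]         # keep only header lines
  has_gi  <- grepl("^>gi\\|[0-9]+\\|", headers)    # header begins with gi|<digits>|
  c(headers = length(headers), with_gi = sum(has_gi))
}

# Toy records (made up): one header with a gi field, one without.
example <- c(
  ">gi|12345|ref|XX_000001.1| toy sequence with gi",
  "ACGTACGT",
  ">XX_000002.1 toy sequence without gi",
  "TTGACA"
)
print(count_gi_headers(example))

# Hypothetical use on a downloaded file (the path is a placeholder):
# count_gi_headers(readLines(gzfile("path/to/genome.fna.gz")))
```

If with_gi is smaller than headers, the files would need their headers adjusted before kraken can use them in the way the question describes.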
You are also not limited to genomes: proteomes (type = "proteome"), coding sequences (type = "CDS"), and annotation files (type = "gff") can be downloaded in the same way.
In case you wish to download only specific subgroups of fungi genomes, you can consult the getGroups() function to obtain available subgroups:
# retrieve available subgroups for the fungi kingdom
getGroups(db = "refseq", kingdom = "fungi")
"Ascomycetes" "Basidiomycetes" "Other Fungi"
We can now choose the group "Ascomycetes" and download the genomes of all fungi species that correspond to that group by typing:
# download all fungi genomes from NCBI RefSeq that belong to the subgroup Ascomycetes
meta.retrieval(kingdom = "fungi", group = "Ascomycetes", db = "refseq", type = "genome")
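If several subgroups are wanted, the same call can be repeated in a loop. This is only a sketch: download_fungi_groups is a hypothetical helper, not part of biomartr, and its retrieve argument exists solely so the loop can be tried with a stand-in function before starting a large download. It assumes meta.retrieval accepts the arguments exactly as in the call above.

```r
# Sketch: call the retrieval function once per subgroup name.
# `retrieve` defaults to biomartr::meta.retrieval; the default is only
# evaluated when no stand-in is supplied (hypothetical helper).
download_fungi_groups <- function(groups, db = "refseq",
                                  retrieve = biomartr::meta.retrieval) {
  for (g in groups) {
    message("Retrieving group: ", g)
    retrieve(kingdom = "fungi", group = g, db = db, type = "genome")
  }
  invisible(groups)
}

# Dry run with a stand-in that only reports what would be requested.
dry_run <- function(kingdom, group, db, type) {
  message("  would request ", type, " files for ", group, " from ", db)
}
download_fungi_groups(c("Ascomycetes", "Basidiomycetes"), retrieve = dry_run)

# Real use (needs biomartr and a network connection):
# download_fungi_groups(getGroups(db = "refseq", kingdom = "fungi"))
```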
For more information please consult the Metagenome Retrieval Vignette. I hope it helps.