Question: Download fungal genomes
3.5 years ago, d.pinili wrote:

Hi. I need to download all available fungal genomes for a community analysis using Kraken (a sequence classification tool). Kraken does not provide a ready-made fungal database, so I have to download the genomes myself. For a custom database, the program needs the genome sequences in FASTA format, and each header must contain a GI number. I looked at NCBI first, but the fungal genomes on the FTP site (the refseq and genbank folders under genomes/) do not contain GI numbers. I have also tried websites other than NCBI, but to no avail. Can someone help me? Thank you very much!

fungal genomes fungi ncbi • 2.4k views

NCBI is phasing out sequence GIs - use Accession.Version instead!

Comment written 3.5 years ago by Tanvir Ahamed

See ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/README for accession-to-GI mapping info.

You can parse the accessions from the downloaded data and use e.g. sed to replace them with GIs.
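
A minimal sketch of that replacement step, written in base R rather than sed so that it matches the other examples in this thread. It assumes a mapping table with the columns accession.version and gi (as in the accession2taxid files) and assumes the classic `>gi|<gi>|ref|<accession>|` header layout is what Kraken should see. The function name add_gi_to_headers and the toy accessions and GI values are made up for illustration.

```r
# Hypothetical helper: prefix FASTA headers with the GI that belongs to
# their accession.version. Headers without a mapping are left unchanged.
add_gi_to_headers <- function(fasta_lines, acc2gi) {
  is_header <- startsWith(fasta_lines, ">")
  headers <- fasta_lines[is_header]

  # Assumption: the accession.version is the first whitespace-delimited
  # token after ">"; everything after it is the description.
  acc  <- sub("^>([^[:space:]]+).*$", "\\1", headers)
  rest <- sub("^>[^[:space:]]+", "", headers)

  # Look up the GI for each accession (NA when there is no match).
  gi <- acc2gi$gi[match(acc, acc2gi$accession.version)]

  # Assumed target layout: >gi|<gi>|ref|<accession>| <description>
  new_headers <- ifelse(
    is.na(gi),
    headers,
    paste0(">gi|", gi, "|ref|", acc, "|", rest)
  )

  fasta_lines[is_header] <- new_headers
  fasta_lines
}

# Toy mapping table and FASTA records (made-up values, not real NCBI data).
acc2gi <- data.frame(
  accession.version = "XX_000001.1",
  gi = "12345",
  stringsAsFactors = FALSE
)
fasta <- c(">XX_000001.1 toy sequence", "ACGT",
           ">YY_000002.1 unmapped", "GGCC")

out <- add_gi_to_headers(fasta, acc2gi)
print(out)
```

For real data, the mapping table could be read with something like read.delim("nucl_gb.accession2taxid", colClasses = "character"), and the FASTA file with readLines(). Reading the GI column as character avoids large numbers being printed in scientific notation. These mapping files are very large, so filtering them down to the accessions of interest first is probably necessary.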

Comment written 3.5 years ago by 5heikki
2.6 years ago, Hajk-Georg Drost (Cambridge) wrote:

I know this question is already quite old, but I have since implemented a new R package named biomartr that can perform bulk retrieval of genomes, proteomes, CDS, GFF files, etc. Since the actual question is "Download fungal genomes", I will provide some biomartr-based examples as a reference for people who search in the future for a way to bulk download all fungal genomes from NCBI RefSeq or GenBank.

To download all fungi genomes from NCBI RefSeq, one can simply type:

# download all fungi genomes from NCBI RefSeq
biomartr::meta.retrieval(kingdom = "fungi", db = "refseq", type = "genome")

Alternatively, genomes from NCBI Genbank can be retrieved by typing:

# download all fungi genomes from NCBI Genbank
biomartr::meta.retrieval(kingdom = "fungi", db = "genbank", type = "genome")

However, you are not limited to genomes. You can also download proteomes (type = "proteome"), coding sequences (type = "CDS"), and annotation files (type = "gff").
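
As a rough sketch, the same call can be repeated for several data types in one loop. The type values are copied from the text above and may differ between biomartr versions. The names types, requests, and run_download are illustrative and not part of biomartr. The download itself is switched off by default because it is large and needs network access.

```r
# Data types mentioned in this answer (copied verbatim from the text).
types <- c("genome", "proteome", "CDS", "gff")

# Build one argument list per data type for meta.retrieval().
requests <- lapply(types, function(t) {
  list(kingdom = "fungi", db = "refseq", type = t)
})
names(requests) <- types

# Illustrative switch: set to TRUE to actually start the downloads.
run_download <- FALSE

if (run_download && requireNamespace("biomartr", quietly = TRUE)) {
  for (args in requests) {
    # Equivalent to biomartr::meta.retrieval(kingdom = ..., db = ..., type = ...)
    do.call(biomartr::meta.retrieval, args)
  }
}
```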

If you wish to download only specific subgroups of fungal genomes, you can call the getGroups() function to list the available subgroups:

# retrieve available subgroups for the fungi kingdom
biomartr::getGroups(db = "refseq", kingdom = "fungi")

"Ascomycetes" "Basidiomycetes" "Other Fungi"

We can now choose the group "Ascomycetes" and download the genomes of all fungal species that belong to that group by typing:

# download all fungi genomes from NCBI RefSeq that belong to the subgroup Ascomycetes
biomartr::meta.retrieval(kingdom = "fungi", group = "Ascomycetes", db = "refseq", type = "genome")

For more information please consult the Metagenome Retrieval Vignette. I hope it helps.
