Question: Database for fungi?
4.9 years ago, Picasa wrote:

I would like to identify fungi in my dataset.

Can you recommend a comprehensive fungal database?

Thanks

fungus database • 2.2k views
4.9 years ago, 5heikki (Finland) wrote:

ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/fungi/

4.9 years ago, Emily_Ensembl (EMBL-EBI) wrote:

http://fungi.ensembl.org/index.html

4.9 years ago, Picasa wrote:

Thanks, but do I need to download each species separately, or is there a global database? (I'm actually looking for FASTA files.)


In future, use the "Add comment" button when you are providing supplementary information or comments.
You would need to download the FASTA genome files from the link @5heikki provided. A similar link for Ensembl is here. It may also be possible to recover them from the refseq_genome BLAST database (if you have it handy) using the blastdbcmd tool.

modified 4.9 years ago • written 4.9 years ago by GenoMax

Something like the command below should download the latest fungal assemblies from GenBank into the current directory:

wget -q -O- ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/fungi/assembly_summary.txt | cut -f 20 -d $'\t' | awk '{FS="/"}{print $0"/"$6"_genomic.fna.gz"}' | xargs -n 1 wget
modified 4.9 years ago • written 4.9 years ago by 5heikki

Thanks for your help. Can you explain your command?

written 4.8 years ago by Picasa
wget -q -O- ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/fungi/assembly_summary.txt

Download ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/fungi/assembly_summary.txt and output to STDOUT.

cut -f 20 -d $'\t'

Parse the 20th column (columns separated by tabs) from STDIN and output to STDOUT (20th column of the assembly_summary.txt file has the base ftp urls for the genome assemblies).
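One caveat with this step: assembly_summary.txt normally begins with comment lines starting with #, and cut passes any line without a tab through unchanged, so header text can leak into the URL list. Below is a minimal sketch that drops those lines first. The helper name extract_ftp_paths and the sample row are made up for illustration.

```shell
# Hypothetical helper: read assembly_summary.txt content on STDIN and
# print only column 20 (the base FTP path) of the data rows.
extract_ftp_paths() {
    # Drop comment/header lines, then take the 20th tab-separated column.
    # -s tells cut to skip any line that contains no tab at all.
    grep -v '^#' | cut -s -f 20
}

# Made-up demo input: one comment line plus one 20-column data row.
row=$(printf 'col%s\t' $(seq 1 19))
printf '# header comment\n%sftp://ftp.example.org/genomes/all/GCA_000000000.1_demo\n' "$row" | extract_ftp_paths
```

Tab is already the default delimiter for cut, so the -d $'\t' in the original command is optional.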

awk '{FS="/"}{print $0"/"$6"_genomic.fna.gz"}'

Print the base ftp url and append: a slash, the 6th field of the base ftp url (fields are separated by slashes), and _genomic.fna.gz. Output to STDOUT. For example, ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_001600535.1_JCM_30696_assembly_v001 becomes ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_001600535.1_JCM_30696_assembly_v001/GCA_001600535.1_JCM_30696_assembly_v001_genomic.fna.gz. As far as I know, this is a valid way to construct the url for all of the latest GenBank assemblies.
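Two details of this step may be worth guarding against. In most awk implementations, assigning FS inside the main block only affects records read after the first one, and a fixed field index depends on how many directory levels the path has. Below is a minimal sketch that sets the separator up front with -F and uses the last field ($NF) instead of a fixed index. The helper name build_fna_url is hypothetical.

```shell
# Hypothetical helper: turn each base FTP path on STDIN into the URL of
# its *_genomic.fna.gz file.
build_fna_url() {
    # -F/ sets the field separator before the first record is read.
    # $NF is the last slash-separated field, i.e. the assembly directory name.
    awk -F/ '{ print $0 "/" $NF "_genomic.fna.gz" }'
}

# Example path taken from the explanation above.
echo "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_001600535.1_JCM_30696_assembly_v001" | build_fna_url
```

For the example path this should give the same URL as the 6th-field version, since the directory name is both the 6th and the last field there.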

xargs -n 1 wget

Pass each constructed ftp url from STDIN to wget, one at a time.

Commands are glued together by pipes, i.e. "|".
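Before starting a large download, it can help to preview what xargs would run. Below is a minimal sketch that puts echo in front of wget so the commands are printed rather than executed. The helper name preview_downloads and the example.org URLs are made up for illustration.

```shell
# Hypothetical helper: print the wget command that would run for each URL
# on STDIN instead of executing it.
preview_downloads() {
    # -n 1 passes one URL per invocation; echo prints the command line only.
    xargs -n 1 echo wget
}

printf '%s\n' \
    "ftp://ftp.example.org/a_genomic.fna.gz" \
    "ftp://ftp.example.org/b_genomic.fna.gz" | preview_downloads
```

Once the printed commands look right, dropping the echo gives the real download step.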

modified 4.8 years ago • written 4.8 years ago by 5heikki
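On the earlier question about a single global FASTA: once the per-assembly files are downloaded, they can be merged locally. Below is a minimal sketch, assuming the files sit in one directory and end in _genomic.fna.gz. The helper name merge_fasta, the output name all_fungi.fna, and the demo sequences are made up for illustration.

```shell
# Hypothetical helper: decompress every *_genomic.fna.gz file in a directory
# and write all sequences to one FASTA file.
merge_fasta() {
    src_dir=$1
    out_file=$2
    # gzip -dc decompresses to STDOUT and leaves the .gz files in place.
    gzip -dc "$src_dir"/*_genomic.fna.gz > "$out_file"
}

# Demo with two tiny made-up assemblies in a temporary directory.
demo_dir=$(mktemp -d)
printf '>seq1\nACGT\n' | gzip > "$demo_dir/a_genomic.fna.gz"
printf '>seq2\nTTGA\n' | gzip > "$demo_dir/b_genomic.fna.gz"
merge_fasta "$demo_dir" "$demo_dir/all_fungi.fna"

# Count the FASTA header lines in the merged file.
grep -c '>' "$demo_dir/all_fungi.fna"
```

With several thousand assemblies the glob may exceed the shell's argument limit; in that case something like find with -exec gzip -dc {} + is an alternative.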