Convert GTDB database to blast database
22 months ago
emi • 0

I am checking for homologs of a specific gene across a representative tree of bacteria. I was given a list of representative bacteria to use; however, the list was from GTDB. Is there a way for me to convert the GTDB to a taxid database I can use to run blastp, or is there a better way for me to search for the presence of homologs in each of these species?

GTDB blast • 1.2k views

I was given a list of representative bacteria to use; however, the list was from GTDB.

I think there is a specific reason the list came from GTDB rather than NCBI. GTDB assigns taxonomy from a genome-based phylogeny with standardized ranks, whereas NCBI Taxonomy largely follows published names and submitter-provided classifications. As a result, the taxonomic lineage of a genome in GTDB does not necessarily match its lineage in NCBI.

Is there a way for me to convert the GTDB to a taxid database

Can you give an example of the GTDB list you have?


RS_GCF_005380545.1 d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia;s__Escherichia flexneri

Here's an example row from the list. Thanks so much for your help!
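
For anyone working with rows in this format, here is a minimal Python sketch of how one might split a row into the assembly accession and the per-rank lineage. It assumes the layout shown in the example row (a GTDB genome ID, whitespace, then a semicolon-separated lineage with d__/p__/.../s__ prefixes). The names parse_gtdb_row, GtdbRow and RANK_NAMES are made up for illustration and are not part of any GTDB tool.

```python
from typing import Dict, NamedTuple

# Rank prefixes as they appear in the example row (d__, p__, ... s__).
RANK_NAMES = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


class GtdbRow(NamedTuple):
    accession: str            # NCBI assembly accession, e.g. GCF_...
    source: str               # "RefSeq", "GenBank" or "unknown"
    lineage: Dict[str, str]   # rank name -> taxon name


def parse_gtdb_row(line: str) -> GtdbRow:
    """Split one row into accession, source database and lineage."""
    # Split on the first run of whitespace only, because species
    # names contain a space.
    gtdb_id, lineage_str = line.strip().split(None, 1)

    # GTDB prefixes RefSeq assemblies with RS_ and GenBank ones with GB_.
    prefix, sep, rest = gtdb_id.partition("_")
    if sep and prefix == "RS":
        accession, source = rest, "RefSeq"
    elif sep and prefix == "GB":
        accession, source = rest, "GenBank"
    else:
        accession, source = gtdb_id, "unknown"

    lineage = {}
    for field in lineage_str.split(";"):
        code, _, name = field.strip().partition("__")
        lineage[RANK_NAMES.get(code, code)] = name
    return GtdbRow(accession, source, lineage)


row = parse_gtdb_row(
    "RS_GCF_005380545.1 d__Bacteria;p__Proteobacteria;"
    "c__Gammaproteobacteria;o__Enterobacterales;"
    "f__Enterobacteriaceae;g__Escherichia;s__Escherichia flexneri"
)
print(row.accession, row.source, row.lineage["species"])
```

Keeping the GTDB species name next to the accession makes it easy to label BLAST hits with the GTDB taxonomy later, even where NCBI uses a different name for the same genome.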


GCF_005380545.1 is the NCBI Assembly accession of that Escherichia flexneri genome in your GTDB list (NCBI identifies it as Escherichia coli!). The RS_ prefix is GTDB's marker for a RefSeq assembly (GB_ marks a GenBank assembly), so strip it to get the NCBI accession. You can use these accessions to download the protein FASTA file (.faa) of each genome in the list and build your database with makeblastdb.
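
A rough sketch of that workflow in Python, under a few assumptions. The download step uses the NCBI Datasets command-line tool, which is my suggestion and is not mentioned in this thread; any other way of fetching the .faa files works equally well. The file names (accessions.txt, all_proteins.faa, query.faa, hits.tsv), the database name and the E-value cutoff are placeholders, and the helper functions are hypothetical. The script only builds and prints the commands; it does not run them.

```python
import re


def to_ncbi_accession(gtdb_id):
    """Strip GTDB's RS_/GB_ prefix, leaving the NCBI assembly accession."""
    return re.sub(r"^(RS|GB)_", "", gtdb_id)


def accessions_from_list(lines):
    """Take the first column of each non-empty row and drop the prefix."""
    return [to_ncbi_accession(line.split()[0]) for line in lines if line.strip()]


def tag_headers(accession, fasta_text):
    """Prepend the genome accession to every FASTA header.

    The same RefSeq protein ID can occur in several genomes, so tagging
    keeps sequence IDs distinct and lets each hit be traced to its genome.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            line = f">{accession}|{line[1:]}"
        out.append(line)
    return "\n".join(out) + "\n"


def build_commands(accession_file, combined_faa, db_name, query_faa):
    """Return argv lists for the download, makeblastdb and blastp steps."""
    download = [
        "datasets", "download", "genome", "accession",
        "--inputfile", accession_file,   # one accession per line
        "--include", "protein",          # request protein FASTA files
        "--filename", "genomes.zip",
    ]
    make_db = ["makeblastdb", "-in", combined_faa,
               "-dbtype", "prot", "-out", db_name]
    search = ["blastp", "-query", query_faa, "-db", db_name,
              "-outfmt", "6", "-evalue", "1e-5", "-out", "hits.tsv"]
    return download, make_db, search


rows = ["RS_GCF_005380545.1 d__Bacteria;p__Proteobacteria;g__Escherichia"]
print("\n".join(accessions_from_list(rows)))
for cmd in build_commands("accessions.txt", "all_proteins.faa",
                          "gtdb_reps", "query.faa"):
    print(" ".join(cmd))
```

With the headers tagged by accession, the second column of the tabular blastp output starts with the genome accession, so a presence/absence table per species is a matter of collecting the accessions that have at least one hit.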

22 months ago
Mensur Dlakic ★ 27k

This could help:

https://gtdb.ecogenomic.org/tools
