Which database to select for bacterial ortholog and singleton detection with ncbi-blast+?
10.0 years ago
Naren ▴ 1000

Which database from ftp://ftp.ncbi.nlm.nih.gov/blast/db should I select for use with ncbi-blast-2.2.29+-x64-linux?

I want to run an all-vs-all gene BLAST on the protein or nucleotide sequences of bacterial genomes, in order to find the orthologs shared between species and the genes unique to each species.

ncbi-blast-plus • 2.9k views

Please rephrase this question in complete sentences. It is not at all clear what you want to do, so please provide additional information about your project. In principle you can use any BLAST database; which one you choose depends on your question.


I saw your edit, but I still have some questions:

Is your goal perhaps to identify the pan- and core genomes of bacterial species?

  • Are all or some of these sequences already in the database, or are at least some of them newly sequenced?
  • How many comparisons do you want to make? (Hopefully not all bacteria in NR vs. all.)
  • Do you want to use a restricted subset based on tax IDs (e.g. all Bacteroidetes)?
  • How closely related are the species? (This determines whether to use nucleotide or amino acid sequences.)

In any case, if you need more than a few bacterial genomes, you can most likely start by downloading the complete NR and NT databases. Those can then be filtered by taxon using the NCBI gi<->taxid mapping file.
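
To make the filtering step concrete, here is a minimal Python sketch. It assumes the mapping file has two whitespace-separated columns (GI, then taxid) and that the FASTA headers look like `>gi|<number>|...`; the function names, GI numbers and sequences are made up for illustration, so check them against the files you actually download.

```python
def load_wanted_gis(mapping_lines, wanted_taxids):
    """Collect the GIs whose taxid is in wanted_taxids.

    Assumes each mapping line is 'gi<whitespace>taxid'. Only matching GIs
    are stored, so the full mapping never has to sit in memory.
    """
    wanted_gis = set()
    for line in mapping_lines:
        parts = line.split()
        if len(parts) == 2 and parts[1] in wanted_taxids:
            wanted_gis.add(parts[0])
    return wanted_gis


def filter_fasta_by_gi(fasta_lines, wanted_gis):
    """Yield the lines of FASTA records whose header GI is in wanted_gis.

    Assumes headers of the form '>gi|<number>|...'; records with any other
    header layout are dropped.
    """
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split("|")
            keep = len(fields) > 1 and fields[0] == "gi" and fields[1] in wanted_gis
        if keep:
            yield line


# Toy data standing in for the real mapping file and FASTA dump.
mapping_text = "111\t562\n222\t1423\n333\t562\n"
fasta_text = (
    ">gi|111|ref|X1| protein A\nMKT\n"
    ">gi|222|ref|X2| protein B\nMSS\n"
    ">gi|333|ref|X3| protein C\nMAA\n"
)

wanted_gis = load_wanted_gis(mapping_text.splitlines(), {"562"})
filtered = list(filter_fasta_by_gi(fasta_text.splitlines(), wanted_gis))
print("\n".join(filtered))
```

With real data you would pass open file handles instead of `splitlines()` output, and the taxid set would hold every taxid below the clade you are interested in.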

10.0 years ago

This depends on what you want to align and why. I suggest you have a look at the BLAST databases documentation for a list of all the options.

Most people use the nr (non-redundant) database, which contains the known protein sequences from the major databases.
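
Whichever database you choose, the shared-ortholog part of the question is often approximated with reciprocal best hits. Below is a rough Python sketch of that idea, not a definitive method. It assumes you have run BLAST in both directions between two genomes and saved the default 12-column tabular output (`-outfmt 6`, bitscore in the last column); the gene IDs and function names are invented for the example.

```python
def best_hits(tabular_lines):
    """Map each query ID to the subject ID with the highest bitscore.

    Assumes 12 tab-separated columns per line, with the query in column 1,
    the subject in column 2 and the bitscore in column 12.
    """
    best = {}
    for line in tabular_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # skip blank or truncated lines
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b_lines, b_vs_a_lines):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b_lines)
    reverse = best_hits(b_vs_a_lines)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy tabular output; only the IDs and the final bitscore column matter here.
a_vs_b = [
    "a1\tb1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t200",
    "a1\tb2\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t80",
    "a2\tb2\t70.0\t100\t30\t0\t1\t100\t1\t100\t1e-20\t110",
]
b_vs_a = [
    "b1\ta1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t200",
    "b2\ta1\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t80",
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
# Genes of genome A without a reciprocal partner are candidate unique genes.
unique_in_a = {"a1", "a2"} - {a for a, _ in pairs}
print(pairs, unique_in_a)
```

Treat the "unique" set only as a list of candidates: a gene can lack a reciprocal best hit because of paralogs or strict cutoffs, not only because it is truly absent from the other species.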
