Standalone BLAST annotation with multiple databases
6.1 years ago
Biogeek ▴ 420

Dear users,

Apologies for asking a broad question that may be obvious to some, but I want to annotate a new transcriptome against several databases (inc. NCBI, Swissprot, Uniprot, custom databases) using my standalone BlastX set-up on a cluster.

I am comfortable blasting against one database using BlastX, but how can I annotate an assembly against multiple databases and then collate all the information and pick the best hits? Can this be run in one process, or do I need specialised scripts to combine and sort all the results?

The answer may be obvious to many, but I am still learning the command line and scripting.

Thanks for the help!
annotation blast • 1.8k views
6.1 years ago

Regarding "inc. NCBI, Swissprot, Uniprot, custom databases": as far as I know, Swissprot, Uniprot, etc. are all contained in the NCBI NR database, which you can obtain using update_blastdb. Swissprot, for example, is a subset of NR and is provided as an alias, so you would still need to download NR to blast against Swissprot. If you need more options for generating compound databases, see: How To Blast A Sequence Against Multiple Databases

I recommend the following options:

  1. blastdb_aliastool: use this if you want to join several databases and use the resulting compound database multiple times.
  2. Use blast's -db option with multiple databases (space-separated names inside one quoted string) when there is a large number of possible combinations and little re-use of any compound database, e.g. when users may select some blast databases from a large number of custom databases on a web server. Using the -db parameter in this way seems to be an undocumented feature.
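
As a rough illustration of the two options, here is a minimal Python sketch that only builds the command lines and prints them, so nothing is run until you copy them to your shell. The database names (`swissprot`, `my_custom_db`), the alias name `combined_prot`, the file names, and the e-value cutoff are hypothetical placeholders; the flags are standard BLAST+ options, but check them against your installed version.

```python
import shlex


def alias_command(databases, out_name, title):
    """Option 1: build a blastdb_aliastool call joining protein databases under one alias."""
    return [
        "blastdb_aliastool",
        "-dblist", " ".join(databases),  # space-separated names passed as ONE argument
        "-dbtype", "prot",               # blastx searches protein databases
        "-out", out_name,
        "-title", title,
    ]


def blastx_command(query, databases, out_file, evalue="1e-5"):
    """Option 2 (or a search of the alias): build a blastx call over one or more databases."""
    return [
        "blastx",
        "-query", query,
        "-db", " ".join(databases),      # several names in one quoted string
        "-outfmt", "6",                  # tabular output, easy to post-process
        "-evalue", evalue,               # assumed cutoff, adjust to taste
        "-out", out_file,
    ]


if __name__ == "__main__":
    dbs = ["swissprot", "my_custom_db"]  # hypothetical database names
    # Option 1: create the alias once, then search it like a single database.
    print(shlex.join(alias_command(dbs, "combined_prot", "Combined protein databases")))
    print(shlex.join(blastx_command("transcripts.fasta", ["combined_prot"], "hits_alias.tsv")))
    # Option 2: name the databases directly on every search.
    print(shlex.join(blastx_command("transcripts.fasta", dbs, "hits_multi.tsv")))
```

With only a handful of databases that are always searched together, the alias keeps every later command short; the direct `-db` form avoids creating alias files for combinations you will use only once.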

Thanks Michael

So I can combine all databases using the blastdb_aliastool program of BLAST+ on the Linux command line, then start the blastX process from there. Assuming I do it this way, the best hit will come from the combined dataset, correct? Just trying to understand the process in my head.
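
To make the "best hit" idea concrete, here is a minimal sketch of how one hit per query could be picked from the results, assuming the search was written in tabular format (`-outfmt 6`) with the default column order, where column 1 is the query ID and column 12 is the bit score. The function name `best_hits` and the example rows are made up for illustration.

```python
def best_hits(lines):
    """Return {query_id: fields} holding the highest bit-score hit for each query.

    Assumes default -outfmt 6 columns: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore.
    """
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, bitscore = fields[0], float(fields[11])
        # Keep the first hit seen unless a later one has a strictly higher bit score.
        if query not in best or bitscore > float(best[query][11]):
            best[query] = fields
    return best


if __name__ == "__main__":
    # Made-up rows for illustration only; real input would be the blastx output file.
    rows = [
        "tx1\tsp|P00001|A\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-40\t150",
        "tx1\tcustom_7\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-60\t210",
        "tx2\tsp|P00002|B\t70.0\t90\t27\t0\t1\t270\t5\t94\t1e-20\t95.5",
    ]
    for query, fields in best_hits(rows).items():
        print(query, fields[1], fields[11])
```

Because the hits from all databases in the combined search land in the same file, this selection does not need to know which database a subject sequence came from.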

What is the difference between blastdb_aliastool and the -db option?

I only have 4 databases max, so I guess aliastool would be best for me?

Apologies for such simple questions.

Hello Michael Dondrup, how big is the nr database? I'm trying to blast against the SwissProt database only, but from your answer above it seems I also have to download the nr database.
