How To Blast A Sequence Against Multiple Databases
6
7
11.0 years ago
Manju ▴ 50

Hello,

I have downloaded all the chromosomes of Bos taurus and converted them to BLAST format using makeblastdb, so I now have 29 databases. I want to run a local BLAST search of my sequence against all of these chromosomes. Is there a way to BLAST my sequence against all 29 databases from my program?

What should I put in the 'database' parameter?

@params = ('database' => '????????', 'outfile' => 'blast2.out', '_READMETHOD' => 'Blast', 'prog'=> 'blastn');

Thanks,
Manju Rawat

blast • 15k views
11
10.8 years ago
Torst ▴ 970

There is no need to create custom single databases (real or alias).

I'm pretty sure the legacy BLAST "-d" option and the BLAST+ "-db" option accept multiple SPACE-SEPARATED databases on one command line. You just have to make sure the list is quoted in your shell, or held in a single Perl string, e.g.

blastall -p blastp -d "nr sprot trembl" -i q.fa
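
For the 29 chromosome databases in the question, the same idea with BLAST+ might look like the sketch below. The database names chr1 to chr29 and the file names query.fa and blast2.out are assumptions; substitute whatever names were given to makeblastdb. The script only runs blastn if it is installed and the query file exists, otherwise it prints the command it would run.

```shell
# Build the list "chr1 chr2 ... chr29" (assumed database names).
db_list=""
i=1
while [ "$i" -le 29 ]; do
    db_list="${db_list:+$db_list }chr$i"
    i=$((i + 1))
done

# The quotes make the shell pass the whole list to -db as one argument.
if command -v blastn >/dev/null 2>&1 && [ -f query.fa ]; then
    blastn -db "$db_list" -query query.fa -out blast2.out
else
    echo "blastn -db \"$db_list\" -query query.fa -out blast2.out"
fi
```

In a Perl parameter hash like the one in the question, the equivalent would presumably be the same space-separated list as a single string value for 'database'.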
9
11.0 years ago
Digiomics ▴ 170

You could also use the blastdb_aliastool included in the NCBI blast package to aggregate your BLAST databases to a single virtual database.
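
A hedged sketch of that approach for the 29 chromosome databases, assuming they are named chr1 to chr29 and sit in the current directory. The alias name bos_taurus_all and its title are made up for illustration. The tool is only invoked if it is installed and the first database is present; otherwise the command is just printed.

```shell
# Build the list "chr1 chr2 ... chr29" (assumed database names).
db_list=""
i=1
while [ "$i" -le 29 ]; do
    db_list="${db_list:+$db_list }chr$i"
    i=$((i + 1))
done

# chr1.nsq is one of the files makeblastdb writes for a nucleotide database named chr1.
if command -v blastdb_aliastool >/dev/null 2>&1 && [ -e chr1.nsq ]; then
    blastdb_aliastool -dblist "$db_list" -dbtype nucl \
        -out bos_taurus_all -title "Bos taurus chromosomes"
else
    echo "blastdb_aliastool -dblist \"$db_list\" -dbtype nucl -out bos_taurus_all -title \"Bos taurus chromosomes\""
fi
```

The resulting virtual database would then be searched by its alias name, for example with -db bos_taurus_all.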

8
11.0 years ago
Chris ★ 1.6k

Why not create a single database by merging the chromosomes into one big FASTA file and then formatting it?

Not sure what you mean by this: @params = ('database' => '????????', 'outfile' => 'blast2.out', '_READMETHOD' => 'Blast', 'prog'=> 'blastn');

2

This answer is bad. Why is everyone upvoting it? It is much more work and may not even be possible in some cases.

1

I don't see how this approach is either bad or much more work. The objective of the OP is to have a single database. Chris proposes to merge the fasta files and create one database while you suggest to combine existing databases. In this case, the OP had access to the original fasta files, which a simple 'cat' command would have joined and then he could have used a command he already knew about and had used to create his databases. Clearly, both solutions seem workable. Your solution also happens to be the second most upvoted solution, which has been there for over two years. It seems to me that there is no need to call any answer bad and to question the judgement of multiple users over the course of 2.3 years. Even less on the first day you joined this forum ;)

1

The answer is bad. The user asked how to blast against 29 databases. "Is there any method by which I am able to blast my sequence against all 29 databases in my program?"

The answer "go compile a new database" is an indirect workaround to the problem. It may even be good advice in this particular instance, but it did not answer the question asked.

Why don't I want to do that? Because it's moronic to create multiple permutations of databases when I already have them compiled. This means my usage of disk space balloons every time I do a search. Not to mention it is annoying and time consuming.

I came here with the exact same question. Thankfully there are several good options provided by others below.

0

If you have limited RAM (4 GB), it is sometimes not possible to concatenate all the chromosome FASTA files and then create the database in one go. I just tried this on a laptop with 4 GB of RAM using the wheat genome (17 Gb) and it stalled the laptop. Creating databases for individual chromosomes and then combining those databases with blastdb_aliastool is a much better option in these cases.

0

It looks suspiciously like he's trying to run BLAST via a Perl script.

0

Ah, right, now I remember. The '@'-construct is a hint... Well, since I'm in Python I do not come across such things very often. ;)

3
10.8 years ago

You can easily create the alias file yourself.

Example:

cd myblastdbs/
gedit myOwnblastDB.nal

Then fill it in following this template (the GenBank nt.nal file is a real-world example):

#
# myOwnblastDB.nal is an alias file for all my chromosomes
#
#
TITLE all my chromosomes
#
DBLIST chr1 chr2 chr3 chr4 chri chr24
#
#GILIST
#
#OIDLIST
#

It is very simple and works fine!
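
The same file can also be written from a script instead of an editor. This is only a sketch: the database names chr1 to chr29 are assumed, and the alias file is written to the current directory, which is assumed to be the one holding the database files.

```shell
# Build the list "chr1 chr2 ... chr29" (assumed database names).
db_list=""
i=1
while [ "$i" -le 29 ]; do
    db_list="${db_list:+$db_list }chr$i"
    i=$((i + 1))
done

# Write a minimal nucleotide alias file with a title and the database list.
{
    echo "# Alias file for all my chromosomes"
    echo "TITLE all my chromosomes"
    echo "DBLIST $db_list"
} > myOwnblastDB.nal

# Show the result for a quick visual check.
cat myOwnblastDB.nal
```

The alias is then searched by its name without the extension, for example -db myOwnblastDB.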

1
8.7 years ago
axa9070 ▴ 30

http://www.ncbi.nlm.nih.gov/books/NBK1763/#CmdLineAppsManual.Aggregate_existing_BLA

To combine the two nematode nucleotide databases, named "nematode_mrna" and "nematode_genomic", we use the following command line:

$ blastdb_aliastool -dblist "nematode_mrna nematode_genomic" -dbtype nucl \
    -out nematode_all -title "Nematode RefSeq mRNA + Genomic"

0
11.0 years ago

If you place your 29 chromosomes in the same FASTA file, you should be able to create a single database using the makeblastdb program. Then just use blastn to search the database containing all your chromosomes.
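
A minimal sketch of the merge-then-format route. The input names chr1.fa, chr2.fa and so on, and the output name bos_taurus_all, are hypothetical. The helper merge_fasta is just a thin wrapper around cat, and makeblastdb is only called if it is installed.

```shell
# merge_fasta OUT IN... : concatenate the input FASTA files into OUT.
merge_fasta() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Only act if the assumed per-chromosome files are present.
if [ -f chr1.fa ]; then
    merge_fasta bos_taurus_all.fa chr*.fa
    if command -v makeblastdb >/dev/null 2>&1; then
        makeblastdb -in bos_taurus_all.fa -dbtype nucl -out bos_taurus_all
    fi
fi
```

Each input file should end with a newline, otherwise the next file's header line gets glued onto the last sequence line. The order in which the chromosomes are concatenated does not matter for the search.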
