Blastn without nt database
8.3 years ago

Hey all!

I have a couple of questions about BLAST:

  1. I need to run blastn on some sequences, but I don't want to download the nt database. What should I put in -db?
  2. To run blastn, do I need to create a file containing only the sequences? Does it have to be a .fasta file?
8.3 years ago
pld 5.1k

You can use the -remote flag to run the query against the nt database on NCBI's servers without having to download it. Otherwise you have to download it, or at least a portion of it.

Yes, you should have a FASTA file of the sequences you intend to search with. BLAST will query with all of the sequences in the input file.
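For anyone unsure what such a query file looks like, here is a minimal Python sketch that writes a few sequences to a multi-record FASTA file and counts the records. The record names, the sequences, and the file name queries.fasta are made up for illustration; the helper functions are hypothetical and not part of BLAST.

```python
from pathlib import Path


def write_fasta(records, path):
    """Write (identifier, sequence) pairs to a multi-record FASTA file."""
    lines = []
    for identifier, sequence in records:
        # Each record starts with a header line beginning with '>'.
        lines.append(f">{identifier}")
        # Wrap sequence lines at 60 characters, a common FASTA convention.
        for start in range(0, len(sequence), 60):
            lines.append(sequence[start:start + 60])
    Path(path).write_text("\n".join(lines) + "\n")


def count_fasta_records(path):
    """Count records by counting header lines, which start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Hypothetical example data; real queries would be your own sequences.
example_records = [
    ("seq1", "ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTACGATCG"),
    ("seq2", "TTGACCGATAGCTAGCATCGATCGGCTAGCTAGCATCGACT"),
]
write_fasta(example_records, "queries.fasta")
print(count_fasta_records("queries.fasta"))
```

All records in the file are searched in one run, so there is no need to split 1000 sequences into 1000 files.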

You can also download subsets of nt, for instance if you only need certain species. Just be aware that this can affect your E-values, as they are based on database size. But if you are going to be doing a lot of high-throughput BLAST analyses, you really should just download the database. It will go much, much faster than using the remote option. I'm not sure what the size of nt is these days, but it has always been worth it in my experience. All through my PhD we maintained our own local versions of BLAST databases, including some custom subsets, because it was just so much faster than anything else.
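To make the E-value point concrete: under the Karlin-Altschul formula E = K·m·n·exp(-λS), the E-value for a fixed alignment score is proportional to the database length n. The sketch below rescales an E-value on that assumption alone. It ignores the effective-length corrections BLAST applies, so treat the result as a rough estimate; the function name and the numbers are hypothetical.

```python
def rescale_evalue(evalue, old_db_length, new_db_length):
    """Estimate the E-value of the same hit against a database of another size.

    Assumes E is directly proportional to total database length (in bases),
    as in E = K * m * n * exp(-lambda * S). BLAST's effective-length
    corrections are ignored, so this is only an approximation.
    """
    if old_db_length <= 0 or new_db_length <= 0:
        raise ValueError("database lengths must be positive")
    return evalue * (new_db_length / old_db_length)


# Hypothetical numbers: a hit with E = 1e-5 in a 1 Gb subset,
# rescaled to a database 1000 times larger.
print(rescale_evalue(1e-5, 1_000_000_000, 1_000_000_000_000))
```

A hit that looks convincing in a small subset can therefore look much weaker when judged against the full database.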

You can also manually adjust the search space size in cases where the database size has changed.
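As a sketch of how that could look on the command line: blastn has a -dbsize option (effective length of the database) and a -searchsp option (effective length of the search space). The Python helper below only assembles the argument list. The file names, the database name my_subset, and the size value are placeholders, not recommendations.

```python
import shlex


def blastn_args_with_dbsize(query, db, dbsize, out):
    """Build a blastn command line that overrides the effective database size."""
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-dbsize", str(dbsize),  # effective database length used for E-values
        "-outfmt", "6",          # tabular output
        "-out", out,
    ]


# Hypothetical names and size, chosen only for illustration.
args = blastn_args_with_dbsize("queries.fasta", "my_subset", 10**12, "results.tsv")
print(shlex.join(args))
```

Using the same -dbsize value for every search keeps E-values comparable across databases of different sizes.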

Yes. We always found it useful to just supply an arbitrarily large database size (we all used the same number at the time), because we frequently needed to do cross-database comparisons.

A little note about using local FASTA files: the databases generated from them can live either in the working directory or in a separate directory that the BLASTDB environment variable is set to include.
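A minimal sketch of that setup, assuming a nucleotide FASTA file called subset.fasta and a database directory /data/blastdb (both hypothetical). It builds a makeblastdb command and an environment in which BLASTDB points at the directory. For simplicity the helper sets BLASTDB to a single directory and replaces any existing value in the copied environment.

```python
import os
import shlex


def makeblastdb_args(fasta_path, db_name):
    """Build a makeblastdb command for a nucleotide database."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_name]


def env_with_blastdb(db_dir, base_env=None):
    """Return a copy of the environment with BLASTDB set to db_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["BLASTDB"] = db_dir  # replaces any existing BLASTDB value in the copy
    return env


# Hypothetical file, database, and directory names.
print(shlex.join(makeblastdb_args("subset.fasta", "my_subset")))
env = env_with_blastdb("/data/blastdb")
print(env["BLASTDB"])

# To actually run the tools (requires BLAST+ to be installed), something like:
#   subprocess.run(makeblastdb_args("subset.fasta", "/data/blastdb/my_subset"), check=True)
#   subprocess.run(["blastn", "-query", "queries.fasta", "-db", "my_subset"], env=env, check=True)
```

With BLASTDB set this way, -db takes only the database name (my_subset), not the name of any individual database file.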

Sure, you can specify the path to the file, just like the vast majority of programs.

I just edited my comment. After -db you need to specify a database name, not a specific file name. The database name is the prefix shared by the several database files. These files are searched for in the location given by the environment variable BLASTDB. If BLASTDB is not set, they're searched for in the current directory that you ran the blast command from. You cannot provide the path to the database on the command line, as you can do with the majority of other programs.

Yes, you can. As you said, just the name of the database, not any specific file comprising the database.

For reference: I was wrong about this. I said you can't because I tried it and it gave me an error. Turns out the error was due to a space in the name of a directory (at least I know blast won't let you escape these). I tried changing the directory name and it worked.

Sorry, but I don't quite understand! This is my first time with BLAST and bioinformatics!

I have to blast about 1000 sequences, so it is not a big dataset, but I don't have a cluster or a server, so I don't know if the nt database is too heavy for my computer.

Can I put the name of a BLAST database in -db and have the blast tool automatically search it over the internet? Or do I have to put it in BLASTDB so that it runs the search against the whole nt database? Or is downloading the database indispensable?

Thank you very much

BLAST isn't that resource heavy, especially for 1000 input sequences. If you're only doing this once, then running with the remote option might be a good idea: it saves you work on your end, but it will be slower. If you are going to start doing BLAST analyses fairly routinely, you really should just download the database for local use. You'll find it much less of a headache. I don't know how big nt is these days in terms of file size, but you don't need a very fast machine to run BLAST.

I don't have to do this analysis routinely, so the remote option is best for me, but... how can I run BLAST remotely? That is the question!

Thanks

One of the command-line options is -remote
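Put together, a remote search against nt could look roughly like the sketch below. The flags -query, -db, -remote, -outfmt and -out are standard blastn options; the file names queries.fasta and results.tsv and the helper functions are hypothetical. The script only prints the command, and the function that actually runs it is left for you to call.

```python
import shlex
import subprocess


def remote_blastn_args(query_fasta, out_path, db="nt"):
    """Build a blastn command that searches a database on NCBI's servers."""
    return [
        "blastn",
        "-query", query_fasta,  # FASTA file with all query sequences
        "-db", db,              # database name, resolved remotely with -remote
        "-remote",              # run the search at NCBI, no local database
        "-outfmt", "6",         # tabular output, one hit per line
        "-out", out_path,
    ]


def run_remote_blastn(query_fasta, out_path, db="nt"):
    """Run the search; needs BLAST+ installed and an internet connection."""
    subprocess.run(remote_blastn_args(query_fasta, out_path, db), check=True)


# Print the equivalent shell command without running anything.
print(shlex.join(remote_blastn_args("queries.fasta", "results.tsv")))
```

The printed line can be pasted straight into a terminal, or run_remote_blastn("queries.fasta", "results.tsv") can be called from Python.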

Thank you very much! I'll try it.
