Hi,
I want to exclude certain sequences from a BLAST search. The search takes longer when the database is huge, and I suspect a bigger database might also inflate the hit scores. My solution was to use the -entrez_query option with the nt database available on our server. But -entrez_query requires the -remote option, which is incompatible with using the database on the server. To get around this, I query the NCBI nt database remotely instead of the copy downloaded to our server.
Here is the command. It fails on the server, although it should work, because the same command works on my laptop:
nohup tblastn -query NP_040593.1.fa \
-db nt \
-remote \
-entrez_query "Viruses[ORGN] NOT (SYNTHETIC[TI] OR ENVIRONMENTAL[TI] OR PATENT[TI]) NOT (UNVERIFIED[KYWD] OR STANDARD_DRAFT[KYWD] OR VIRUS_LOW_COVERAGE[KYWD] OR VIRUS_AMBIGUITY[KYWD])" \
-outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qframe sframe qcovs qcovhsp' \
-out tblastn_allFiltered.out \
-export_search_strategy export.txt &> nohup.out &
When I run this exact command on our server, I get this error:
Error: NCBI C++ Exception:
T0 "/home/ross/ncbi-blast-2.10.0+-src/c++/src/serial/rpcbase.cpp", line 233: Error: ncbi::CRPCClient_Base::x_Ask() - Failed to receive reply after 3 tries
I use an NCBI API key, which is supposed to allow 10 requests/second. I obtained the key and exported it in ~/.bash_profile.
So, why is this not running on the cluster? I hope someone can help me here!
Thanks a million,
Asuman
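One thing that may be worth checking on the server: ~/.bash_profile is read only by bash login shells, so a key exported there might not be visible in every session. A minimal sketch, assuming the key was exported under the name NCBI_API_KEY (the post does not name the variable, and whether the remote BLAST call reads it at all is not confirmed here):

```shell
# Report whether an environment variable is visible to the current shell.
# NCBI_API_KEY is an assumed name; substitute whatever was actually exported.
check_env_var() {
    var_name="$1"
    # Indirect lookup of the variable named in $var_name (POSIX sh compatible)
    eval "value=\${$var_name:-}"
    if [ -n "$value" ]; then
        echo "$var_name is set"
    else
        echo "$var_name is NOT set in this shell"
    fi
}

check_env_var NCBI_API_KEY
```

Running this in the same session that launches tblastn, on both the laptop and the server, would show whether the two environments differ in this respect.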
Do you get the error right away or after some time?
Here it is. I ran it on the server; BLAST exits with the error above, and nohup reports Exit 255:
real 1m39.167s
user 0m0.192s
sys 0m0.014s
When I run the same command on my local machine, I get this:
real 4m12.736s
user 0m0.307s
sys 0m0.059s
Hence, it is likely a connection problem on the server. If there is a way to filter the database search without -entrez_query, I guess that would solve the problem. But how else can I filter?
If you are running on a cluster, is it possible to submit this as a job to a job scheduler without the nohup?
Correction: I meant server, not cluster (old habits). Filtering a local database should be much easier than going through Entrez, which requires remote access and produces the connection error. But what are the alternatives?
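One possible alternative, sketched here under assumptions rather than tested against this server: BLAST+ can restrict a local search by taxonomy with -taxidlist, provided the local nt is a version 5 database that carries taxonomy information. The get_species_taxids.sh helper shipped with BLAST+ expands a higher-level taxid (10239 is Viruses) into species-level taxids; it queries NCBI itself, so the list may need to be generated on a machine with working network access and copied over. The function name and the file names viruses.txids and tblastn_virusesOnly.out are hypothetical. Note that this reproduces only the Viruses[ORGN] part of the Entrez query; the title and keyword exclusions (SYNTHETIC, PATENT, UNVERIFIED and so on) are not covered by a taxid filter.

```shell
# Sketch: limit a local tblastn search to viral sequences via a taxid list.
VIRUS_TAXID=10239            # NCBI Taxonomy ID for Viruses
TAXID_FILE=viruses.txids     # hypothetical file name

filtered_tblastn() {
    # Both tools come with BLAST+; stop early if either is not on PATH.
    for tool in get_species_taxids.sh tblastn; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            return 1
        fi
    done
    # Expand the Viruses taxid into species-level taxids (contacts NCBI).
    get_species_taxids.sh -t "$VIRUS_TAXID" > "$TAXID_FILE" || return 1
    # Local search (no -remote), restricted to the taxids in the list.
    tblastn -query NP_040593.1.fa \
        -db nt \
        -taxidlist "$TAXID_FILE" \
        -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qframe sframe qcovs qcovhsp' \
        -out tblastn_virusesOnly.out
}
```

Calling filtered_tblastn runs the search locally, so it avoids the remote connection that fails in the post above; the remaining title and keyword filters would have to be applied to the hits afterwards.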