4.7 years ago
eli_bayat
I am using the below command line to run blast on my sequences:
blastn -db nt -query $inputfile -outfmt "6 qseqid evalue bitscore stitle" -out $outputfile -remote
However, when I run BLAST on multiple files at the same time, some of the sequences are skipped and I end up with fewer sequences in the output than there are in my original file. I assume there is some CPU limit or maximum-requests limit. Is there any way to find out what the limits are? How can I make sure I stay under them? Thanks.
I don't know if the new limits put in place for web BLAST apply to the -remote option (they likely do). You can see them here: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastNews
You can try setting the e-value explicitly to a higher value (the default is now 0.05).
Thank you for the link. It is very helpful. Do you mean that if I set the e-value higher, BLAST is less likely to skip sequences?
How many sequences were you passing to blast? And incidentally, what version of blast are you using?
I have multiple FASTA files and I BLAST each of them in parallel. Overall I passed 6845 sequences, each around 6000 positions long. The version is 2.9.0. When I run BLAST on the files one at a time, meaning I BLAST one file, wait until it is done, and then BLAST the next one, far fewer sequences are skipped than when I run all of them in parallel.
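Since running the files one at a time skipped fewer sequences, one option is to serialize the remote searches in a loop with a pause between submissions. The sketch below assumes that this is enough to stay under the server-side limits, which is not confirmed. `BLASTN`, `PAUSE_SECONDS`, the 10-second value, and the `.blast.tsv` output suffix are illustrative choices, not BLAST+ settings or documented limits.

```shell
# Illustrative variables (not part of BLAST+): the program to call and
# the number of seconds to wait between submissions.
BLASTN=${BLASTN:-blastn}
PAUSE_SECONDS=${PAUSE_SECONDS:-10}

# Run one remote search per input file, strictly one after another.
blast_files_in_turn() {
    for inputfile in "$@"; do
        # Output name is a made-up convention: input name plus a suffix.
        outputfile="$inputfile.blast.tsv"
        "$BLASTN" -db nt -query "$inputfile" \
            -outfmt "6 qseqid evalue bitscore stitle" \
            -out "$outputfile" -remote \
            || echo "blastn failed for $inputfile" >&2
        # Wait before sending the next file to the server.
        sleep "$PAUSE_SECONDS"
    done
}
```

It could be called as `blast_files_in_turn sample1.fasta sample2.fasta`, where the file names are placeholders.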
That's really strange that anything's getting skipped at all. These sequences that get skipped, do they have matches when they're blasted individually? What database are you searching against, by the way?
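To see exactly which queries are missing, the IDs in the FASTA file can be compared with the first column of the tabular report. Note that with `-outfmt 6` a query with no hits under the e-value threshold produces no rows at all, so a missing ID is not necessarily a dropped request. This is a minimal sketch, assuming `qseqid` equals the first whitespace-delimited word of each FASTA header (BLAST may rewrite some identifier formats); `list_missing_queries` is a hypothetical helper name.

```shell
# Print query IDs that appear in the FASTA file but have no row in the
# tab-separated BLAST report (qseqid is expected in column 1).
list_missing_queries() {
    fasta=$1
    report=$2
    tmpdir=$(mktemp -d) || return 1
    # Header lines start with '>'; the ID is the first word after it.
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$fasta" | sort -u > "$tmpdir/expected"
    # Unique query IDs that produced at least one hit row.
    cut -f1 "$report" | sort -u > "$tmpdir/found"
    # Lines only in the first file: queries with no rows in the report.
    comm -23 "$tmpdir/expected" "$tmpdir/found"
    rm -rf "$tmpdir"
}
```

For example, `list_missing_queries "$inputfile" "$outputfile"` prints one missing ID per line; those queries can then be searched again individually to check whether they really have no matches.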