Hi there!
I'm trying to automate some BLAST searches with Biopython, but instead of writing the results to the .txt file, it creates an empty file. I've also tried running the search directly in the Linux terminal, but the result was the same.
In Biopython:
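from Bio.Blast.Applications import NcbiblastnCommandline
# input() already returns a string, so no str() conversion is needed
path = input("Type the path to fasta file: ")
# Remote blastn search against nt, tabular output with selected columns
comando_blastn = NcbiblastnCommandline(query=path, task="blastn-short", remote=True, db="nt",
                                       outfmt='6 qseqid qcovs sscinames pident evalue', out="out.txt")
print(comando_blastn)
stdout, stderr = comando_blastn()
# Read the results back; the context manager closes the file afterwards
with open("out.txt", "r") as blast_result:
    lines = blast_result.read()
print(lines)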
In the terminal:
blastn -out outComandLine.txt -outfmt "6 qseqid qcovs sscinames pident evalue" -query Ensaio_toxo.fasta -db nt -remote -task blastn-short
I'm using -task blastn-short because my sequences are short, but I've already tried without it.
I'm quite new to this, so I wouldn't be surprised if there's some mistake in my code.
Does anyone know what could be going wrong here?
EDIT:
I tried to see if there were any hits, so I ran:
blastn -query Ensaio_toxo.fasta -db nt -remote -task blastn-short -out outComandLine.txt
And the result was the following:
BLASTN 2.12.0+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Database: Nucleotide collection (nt)
81,664,709 sequences; 723,646,484,289 total letters
Query= FwToxo
Length=18
RID: 6N1SGV3V013
***** No hits found *****
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Effective search space used: 1444679697890
Database: Nucleotide collection (nt)
Posted date: Apr 23, 2022 07:59 AM
Number of letters in database: 723,646,484,289
Number of sequences in database: 81,664,709
Matrix: blastn matrix 1 -3
Gap Penalties: Existence: 5, Extension: 2
On the command line it returns no hits, but when I run it on the web BLASTn platform it returns hits normally. It seems the problem is with the BLAST command line, not with the code itself...
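To tell "the search ran but found nothing" apart from "the search failed", one option is to scan the default report for the no-hits marker shown above. This is a minimal sketch, assuming the report was saved in the default pairwise format; the function name `queries_without_hits` is hypothetical and not part of BLAST+ or Biopython.

```python
NO_HITS_MARKER = "***** No hits found *****"


def queries_without_hits(report_text):
    """Return the names of queries whose report section has the no-hits marker."""
    missing = []
    current = None
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Query="):
            # Remember which query the following lines belong to
            current = line[len("Query="):].strip()
        elif NO_HITS_MARKER in line and current is not None:
            missing.append(current)
    return missing
```

With -outfmt 6 there is no such marker: a query without hits simply contributes no rows, so an empty output file can mean "no hits" rather than an error.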
That would indicate that there is a problem with your command line. Can you show us the command line you used for the direct search (not via Biopython)?
I used this:
and ran it in the terminal.
Add a
2> cmd.err
to your command and show us the contents of that file once you run it.
The result was the following:
Then I installed the taxdb and ran it again. The result remains empty... The cmd.err file was empty after that too.
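The same check (capture stderr, then see whether the output file is empty) can be done from Python with the standard `subprocess` module. This is a minimal sketch, not the poster's code; the helper name `run_and_capture` is hypothetical, and the blastn arguments in the comment are copied from the command in the question.

```python
import subprocess
from pathlib import Path


def run_and_capture(cmd, out_path):
    """Run cmd (a list of arguments) and summarise exit code, stderr and output."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    out_file = Path(out_path)
    return {
        "returncode": result.returncode,
        # Same information that `2> cmd.err` would have written to a file
        "stderr": result.stderr.strip(),
        "output_empty": (not out_file.exists()) or out_file.stat().st_size == 0,
    }


# Example call (requires BLAST+ on the PATH and network access):
# info = run_and_capture(
#     ["blastn", "-query", "Ensaio_toxo.fasta", "-db", "nt", "-remote",
#      "-task", "blastn-short",
#      "-outfmt", "6 qseqid qcovs sscinames pident evalue",
#      "-out", "out.txt"],
#     "out.txt",
# )
# print(info)
```

An empty stderr together with a zero exit code and an empty output file suggests the search completed without hits rather than crashing.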
There is something wrong with your query file. Can you show us the output of
grep -A 2 "^>" Ensaio_toxo.fasta
? If you take some sequences from that file and run a search directly at NCBI using the web interface, do you get results?
Since you are running the search remotely at NCBI, there is no point in installing the taxonomy database locally since it will not be used.
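For anyone who prefers to inspect the query file from Python rather than with grep, here is a minimal sketch of a plain FASTA reader that lists each record and its length. It assumes a simple text FASTA file; the function names `read_fasta` and `summarize` are hypothetical.

```python
def read_fasta(path):
    """Return a list of (header, sequence) pairs from a FASTA file."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(records):
    """Return (header, sequence length) for each record."""
    return [(header, len(seq)) for header, seq in records]
```

Very short records, such as the 18-nucleotide query in the report above, are the case that -task blastn-short is meant for.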
grep -A 2 "^>" Ensaio_toxo.fasta
returns my sequences inside the file normally. When I run it directly at the NCBI web interface it returns results normally. The problem is just with the command line application. If I run without -outfmt 6
it returns me the following:
Then I suggest that you make a note of the parameters (you will find them under
Search Summary
) that the web application is using and try to replicate them exactly on the local command line.
Thanks GenoMax! It worked! Maybe this
-task blastn-short
isn't working properly.
I used the following command in the shell:
But to use the
sscinames
option in -outfmt 6
you will have to download the taxid database.
I assume the query file contains one or more FASTA-formatted sequences? If not, that could be one issue. Can you try the variation below?
Yes, it has primers and probes in FASTA format. I tried this variation and the result was the same... an empty file.