NcbiblastnCommandline not detecting hits
2
0
Entering edit mode
9.1 years ago

Hi guys, for some reason Biopython's Bio.Blast.Applications.NcbiblastnCommandline() is not returning any hits - even for sequences with 100% overlap.

Here is my self-contained example (though this problem is not limited to these specific sequences):

from Bio.SeqRecord import SeqRecord
from Bio import Entrez
Entrez.email = "A.N.Other@example.com"     # Always tell NCBI who you are
from Bio.Blast.Applications import NcbiblastnCommandline
from Bio.Blast.Applications import NcbiblastpCommandline
from Bio.Blast import NCBIXML
from StringIO import StringIO
from Bio.Seq import Seq
from Bio import SeqIO
from Bio.Alphabet import generic_dna

handle = Entrez.efetch(db="nucleotide", id="aj627603", rettype="gb", retmode="text")
record = SeqIO.read(handle, "genbank")
SeqIO.write(record, ".cache/aj627603.fasta", "fasta")

cre_fw=SeqRecord(Seq("aaatttgcctgcattaccg", generic_dna), id="cre_fw")
SeqIO.write(cre_fw, ".cache/cre_fw.fasta", "fasta")

output = NcbiblastnCommandline(query=".cache/cre_fw.fasta", subject=".cache/aj627603.fasta", outfmt=5)()[0]
print(output)
output = NcbiblastpCommandline(query=".cache/cre_fw.fasta", subject=".cache/aj627603.fasta", outfmt=5)()[0]
print(output)

You can check the subject online and that see that it contains the query precisely. Not least of all I find it curious that NcbiblastpCommandline (p for proteins) gives me a correct hit while NcbiblastpCommandline (n for nucleotides) does not.

alignment sequence blast biopython • 3.1k views
ADD COMMENT
1
Entering edit mode
9.1 years ago

This particular issue was solved by adding task="blatsn" to the NcbiblastnCommandline arguments. In the absence of this argument, the function uses mega-blast instead of blastn, which for some reason seems unsuitable for all queries which I have tried to this point.

The solution was provided here: https://github.com/biopython/biopython/issues/489

ADD COMMENT
0
Entering edit mode
9.1 years ago
wpwupingwp ▴ 120

Here is example from biopython.org:

>>> from Bio.Blast.Applications import NcbiblastnCommandline
>>> cline = NcbiblastnCommandline(query="m_cold.fasta", db="nt", strand="plus",

I don't know if it works when you use the parameter subject=....

And usually, you need to run makeblastdb to get a db file.

ADD COMMENT

Login before adding your answer.

Traffic: 2006 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6