Question

Getting Untranslated DNA Subject String as Output from tblastn via Biopython

0

11.9 years ago

agatorano ▴ 50

When running an alignment with tblastn via Biopython, the subject string that is returned is the translated version of the DNA sequence we BLASTed against. Is there a way to return the original DNA string instead of the translated version?

biopython • 3.2k views

1

Could you explain in a little more depth what problem you are experiencing? What were your inputs, and what output did you expect from Biopython?

ADD REPLY • link 11.9 years ago by Josh Herr 5.8k

score 5 · Answer 1 · 2012-12-02

5

11.9 years ago

bow ▴ 790

This is a limitation of the BLAST XML output itself: it doesn't keep the original sequence. Biopython only parses this output into a user-friendly data structure. Since the BLAST XML file contains no information about the original sequence, Biopython cannot return it.

You could reverse-translate the given protein sequence, but because of codon redundancy (most amino acids are encoded by more than one codon), it may be impossible to figure out the original DNA sequence from the BLAST XML file alone.
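To give a feel for how quickly codon redundancy blows up, here is a minimal sketch that counts how many distinct DNA sequences translate to a given peptide. It assumes the standard genetic code; the table and the function name `count_possible_dna` are only for illustration and are not part of Biopython.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code ("*" = stop).
# Assumption: the subject uses the standard code; other tables differ slightly.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4, "*": 3,
}


def count_possible_dna(protein):
    """Count the DNA sequences that translate to `protein`.

    Raises KeyError for symbols outside the table (e.g. gaps or "X").
    """
    return prod(CODONS_PER_RESIDUE[aa] for aa in protein.upper())


print(count_possible_dna("MLSR"))  # 1 * 6 * 6 * 6 = 216
```

Even a four-residue peptide can have hundreds of candidate source sequences, so for realistic HSP lengths reverse translation cannot pin down the original DNA.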

1

+1. Going beyond the XML output, one option, depending on your setup, is to use Bio.SeqIO.to_dict() to create an in-memory dictionary of the sequences that make up the BLAST database. Then you can use the subject ID to retrieve the original sequence.
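A rough sketch of that idea, assuming you still have the FASTA file the database was built from. The helper names `load_database` and `subject_dna` are made up for this example. With Bio.Blast.NCBIXML the coordinates would typically come from `hsp.sbjct_start`, `hsp.sbjct_end` and the subject element of `hsp.frame`, but treat that mapping (and whether the ID you need sits in `hit_id` or `hit_def`) as something to check against your own output.

```python
# Complement table for reverse-complementing plain strings (A/C/G/T/N only).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def load_database(fasta_path):
    """Read every record of the database FASTA file into a dict keyed by ID."""
    # Biopython is imported here so subject_dna() can be tried without it.
    from Bio import SeqIO
    return SeqIO.to_dict(SeqIO.parse(fasta_path, "fasta"))


def subject_dna(records, subject_id, hit_from, hit_to, hit_frame):
    """Return the nucleotides behind one tblastn HSP, oriented as translated.

    `records` maps IDs to SeqRecord objects (or plain strings).
    Coordinates are 1-based and inclusive, as BLAST reports them.
    """
    record = records[subject_id]
    sequence = str(getattr(record, "seq", record))
    start, end = sorted((hit_from, hit_to))  # tolerate either coordinate order
    region = sequence[start - 1:end]
    if hit_frame < 0:  # minus-strand hit: reverse complement the slice
        region = region.translate(_COMPLEMENT)[::-1]
    return region


# Toy data standing in for load_database("db.fasta") (hypothetical file name).
demo = {"contig1": "ATGGCCAAATTTGGG"}
print(subject_dna(demo, "contig1", 4, 9, 1))   # GCCAAA
print(subject_dna(demo, "contig1", 4, 9, -1))  # TTTGGC
```

Sorting the two coordinates is a defensive choice, since I would not rely on which order a given BLAST output format uses for minus-strand hits.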

ADD REPLY • link 11.9 years ago by David W 4.9k

2

Unless your database FASTA file is quite small, Bio.SeqIO.index() or Bio.SeqIO.index_db() would probably be more sensible than an in-memory dictionary built with Bio.SeqIO.to_dict().
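For illustration, a small sketch of how the two indexing calls could be wrapped. The function name `open_database` and the file names in the comments are hypothetical. Both calls return a read-only, dict-like object that parses a record only when you ask for it.

```python
def open_database(fasta_path, sqlite_path=None):
    """Return a dict-like view of a FASTA file without loading it all.

    Without `sqlite_path` the record offsets are kept in memory
    (Bio.SeqIO.index); with it they are stored in an SQLite file that
    can be reused between runs (Bio.SeqIO.index_db).
    """
    from Bio import SeqIO  # requires Biopython
    if sqlite_path is None:
        return SeqIO.index(fasta_path, "fasta")
    return SeqIO.index_db(sqlite_path, [fasta_path], "fasta")


# Example use (hypothetical paths and ID):
# db = open_database("db.fasta", "db.idx")
# record = db["contig1"]          # parsed on demand
# print(record.seq[3:9])          # slice out the region of interest
# db.close()
```

Lookup by ID then works the same way as with the to_dict() dictionary, just without holding every sequence in memory.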

Or, and this is a good plan if you don't have a FASTA file of the database, you could use blastdbcmd, although that isn't always as easy as it should be: http://blastedbio.blogspot.co.uk/2012/10/my-ids-not-good-enough-for-ncbi-blast.html
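A minimal sketch of calling blastdbcmd from Python, assuming BLAST+ is installed and on your PATH. The helper names `blastdbcmd_args` and `fetch_region` are invented for this example, and looking a sequence up by `-entry` generally only works if the database was built so that your identifiers are searchable, which is the catch the linked post is about.

```python
import subprocess


def blastdbcmd_args(db, entry, start, end, minus_strand=False):
    """Build a blastdbcmd command line for one region of one database entry."""
    return [
        "blastdbcmd",
        "-db", db,
        "-entry", entry,
        "-range", f"{start}-{end}",  # 1-based, inclusive
        "-strand", "minus" if minus_strand else "plus",
        "-outfmt", "%s",             # sequence data only, no FASTA header
    ]


def fetch_region(db, entry, start, end, minus_strand=False):
    """Run blastdbcmd and return the requested nucleotides as one string."""
    result = subprocess.run(
        blastdbcmd_args(db, entry, start, end, minus_strand),
        capture_output=True, text=True, check=True,
    )
    return "".join(result.stdout.split())  # drop newlines and any wrapping


# Example use (hypothetical database name and ID):
# dna = fetch_region("my_nucl_db", "contig1", 4, 9, minus_strand=True)
```

Keeping the argument list in its own function makes it easy to print the exact command and try it in a shell when the lookup fails.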

ADD REPLY • link 11.9 years ago by Peter 6.0k