Parsing <Iteration_query-def> and <Hit_def> from a BLAST XML file
7.9 years ago
biotech ▴ 570

I have results from a command-line BLASTp run and would like to convert them to a table. It seems that blastp -outfmt 6 cannot report the <Hit_def> field that appears in the XML output, right?

I'm thinking about using Biopython. Thanks.

blast • 1.9k views
blastp -outfmt n is a way to define the output format. That program is not going to parse results from a previous run of blastp. What format did you actually use for the output?

I know what -outfmt is, genomax2. I'm using -outfmt 5 to get XML output and am trying to build a table of <Iteration_query-def> and <Hit_def>. It seems -outfmt 6 can't report the hit description.

Since you had outfmt 6 in your original post I was not sure.

So do you already have some code you are using? If not, take a look at the examples in the Biopython tutorial here.
A practical example is in this thread: Extracting Accession Numbers With Ncbixml.Parse
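For anyone who does want to go the XML route, here is a minimal sketch of pulling <Iteration_query-def> and <Hit_def> out of -outfmt 5 output. Note that it uses Python's standard-library xml.etree.ElementTree rather than Biopython's NCBIXML, so it is not the code from the linked thread. The function names and the tiny SAMPLE document are made up for illustration, and the sample only contains the elements the sketch reads.

```python
import csv
import io
import sys
import xml.etree.ElementTree as ET


def query_hit_rows(xml_source):
    """Yield (query_def, hit_def) pairs from BLAST XML (-outfmt 5).

    xml_source may be a file name or a binary file object.
    """
    # iterparse streams the document, so a large result file is not
    # held in memory all at once.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        query_def = elem.findtext("Iteration_query-def", default="")
        for hit in elem.iterfind("Iteration_hits/Hit"):
            yield query_def, hit.findtext("Hit_def", default="")
        elem.clear()  # drop the finished iteration's children


def write_table(xml_source, out=sys.stdout):
    """Write a two-column, tab-separated table of query and hit descriptions."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["query_def", "hit_def"])
    writer.writerows(query_hit_rows(xml_source))


# Hypothetical, heavily trimmed stand-in for real blastp XML output.
SAMPLE = b"""<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>query1 hypothetical protein</Iteration_query-def>
      <Iteration_hits>
        <Hit><Hit_def>kinase A [Homo sapiens]</Hit_def></Hit>
        <Hit><Hit_def>kinase B [Mus musculus]</Hit_def></Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""

if __name__ == "__main__":
    write_table(io.BytesIO(SAMPLE))
```

On a real run you would pass the path of your XML file to write_table instead of the in-memory sample. Queries with no hits produce no rows in this sketch.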

Did you mean the subject title for outfmt 6?

stitle means Subject Title

Yes. Thank you very much @genomax2. I just saw the request here: http://blastedbio.blogspot.co.il/2012/05/blast-tabular-missing-descriptions.html . The outfmt specifiers seem a bit disorganized in the manual. But yes, it's possible to get the description directly, without parsing the XML file. I'm now able to annotate eukaryotic proteins with just a simple BLASTp.
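To make the direct route concrete, here is a rough sketch of requesting stitle as a column in tabular output and reading the result back. The column choice (qseqid, sseqid, stitle, evalue, bitscore) is just one possible selection, and the file and database names in the usage note are placeholders, not anything from this thread. The sketch only builds the command; it does not run blastp itself.

```python
import csv
import io

# Columns requested from blastp; stitle is the subject title (hit description).
FIELDS = ["qseqid", "sseqid", "stitle", "evalue", "bitscore"]


def blastp_command(query_fasta, db_name, out_path):
    """Build an argument list for blastp with a custom tabular format."""
    outfmt = "6 " + " ".join(FIELDS)  # e.g. "6 qseqid sseqid stitle ..."
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db_name,
        "-outfmt", outfmt,
        "-out", out_path,
    ]


def read_hits(handle):
    """Read tab-separated blastp output into a list of dicts keyed by FIELDS."""
    reader = csv.DictReader(
        handle,
        fieldnames=FIELDS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # descriptions may contain quote characters
    )
    return list(reader)


if __name__ == "__main__":
    # Made-up example line standing in for real blastp output.
    example = "q1\tsp|P12345|X\tKinase [Homo sapiens]\t1e-50\t200\n"
    for hit in read_hits(io.StringIO(example)):
        print(hit["qseqid"], hit["stitle"], sep="\t")
```

With BLAST+ installed, the list returned by something like blastp_command("proteins.fasta", "my_db", "hits.tsv") can be handed to subprocess.run(..., check=True), and the output file can then be opened and passed to read_hits.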
