Legacy Blastx Output (Karlin-Altschul Statistics) From Xml Output
11.4 years ago

I have to generate legacy blastx output from BLAST XML output (why do people still require this!!!?). The legacy output has a block starting with:

                                                             Score    E
Sequences producing significant alignments:                  (bits) Value  N

dbseq_1                                                           399   e-109  2
dbseq_2                                                           389   e-106  3

The parameter N is documented as follows:

Ungapped alignments and results from blastx and tblastn will have an additional column ('N'), displaying the number of different segment pairs used to produce the alignment, according to the Karlin-Altschul statistics.

I have no idea how to get this value out of the BLAST XML file. Can anybody help me? If not, this post is intended to document the issue for others in the future.

blast xml • 2.5k views

Am I wrong? I cannot find this parameter "N" anywhere in the blastx XML output.

<Hit>
  <Hit_num>1</Hit_num>
  <Hit_id>gi|348569288|ref|XP_003470430.1|</Hit_id>
  <Hit_def>PREDICTED: zinc finger CCCH domain-containing protein 7B-like [Cavia porcellus]</Hit_def>
  <Hit_accession>XP_003470430</Hit_accession>
  <Hit_len>1345</Hit_len>
  <Hit_hsps>
    <Hsp>
      <Hsp_num>1</Hsp_num>
      <Hsp_bit-score>44.669</Hsp_bit-score>
      <Hsp_score>104</Hsp_score>
      <Hsp_evalue>0.0149519</Hsp_evalue>
      <Hsp_query-from>231</Hsp_query-from>
      <Hsp_query-to>368</Hsp_query-to>
      <Hsp_hit-from>322</Hsp_hit-from>
      <Hsp_hit-to>373</Hsp_hit-to>
      <Hsp_query-frame>3</Hsp_query-frame>
      <Hsp_hit-frame>0</Hsp_hit-frame>
      <Hsp_identity>27</Hsp_identity>
      <Hsp_positive>30</Hsp_positive>
      <Hsp_gaps>6</Hsp_gaps>
      <Hsp_align-len>52</Hsp_align-len>
      <Hsp_qseq>VSREWGEAERTATH------LATA*WTVFSPQRLMERQRRKADIEKGLQFIQ</Hsp_qseq>
      <Hsp_hseq>LDEPWGKGWGRAGHRPALAVLLTDGLHALSLQRLMERQKRKADIEKGLQFIQ</Hsp_hseq>
      <Hsp_midline>+   WG+    A H      L T      S QRLMERQ+RKADIEKGLQFIQ</Hsp_midline>
    </Hsp>
  </Hit_hsps>
</Hit>
<Hit>

I cannot find it either. I just wanted to make sure it is really not there.
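
One way to double-check this is to list every field name that appears under an `<Hsp>` element. The following is only a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded sample is a trimmed copy of the hit quoted above (sequence fields left out), and the helper names `hsp_field_names` and `hsp_count` are made up for illustration. Note that the number of `<Hsp>` elements per hit is printed only for reference; nothing in this thread confirms that it equals the legacy "N" column.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <Hit> element quoted earlier in the thread
# (alignment strings and coordinates omitted for brevity).
SAMPLE_HIT = """
<Hit>
  <Hit_num>1</Hit_num>
  <Hit_id>gi|348569288|ref|XP_003470430.1|</Hit_id>
  <Hit_len>1345</Hit_len>
  <Hit_hsps>
    <Hsp>
      <Hsp_num>1</Hsp_num>
      <Hsp_bit-score>44.669</Hsp_bit-score>
      <Hsp_score>104</Hsp_score>
      <Hsp_evalue>0.0149519</Hsp_evalue>
      <Hsp_query-frame>3</Hsp_query-frame>
      <Hsp_hit-frame>0</Hsp_hit-frame>
    </Hsp>
  </Hit_hsps>
</Hit>
"""


def hsp_field_names(hit_xml):
    """Return the sorted tag names found directly under any <Hsp> of a hit."""
    hit = ET.fromstring(hit_xml)
    names = set()
    for hsp in hit.iter("Hsp"):
        names.update(child.tag for child in hsp)
    return sorted(names)


def hsp_count(hit_xml):
    """Count the <Hsp> elements of a hit (NOT confirmed to be the legacy N)."""
    hit = ET.fromstring(hit_xml)
    return len(hit.findall("./Hit_hsps/Hsp"))


fields = hsp_field_names(SAMPLE_HIT)
print(fields)                 # every per-HSP field present in the sample
print(hsp_count(SAMPLE_HIT))  # number of HSPs reported for this hit
```

Running the same two helpers over each `<Hit>` of a real result file would show whether any field beyond the ones quoted above is ever written.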


"why do people still require this!!!?" => ?


I mean: why do programs still depend on unstable text output instead of a structured format such as XML or ASN.1?
