Question: how to get the strand from a blast xml file
0
gravatar for bio90029
14 months ago by
bio9002910
bio9002910 wrote:

Hi, I am working with a blast xml as the one below:

 <Hit_num>2</Hit_num>
  <Hit_id>gnl|BL_ORD_ID|79</Hit_id>
  <Hit_def>NODE_82_length_56137_cov_22.281775</Hit_def>
  <Hit_accession>79</Hit_accession>
  <Hit_len>56195</Hit_len>
  <Hit_hsps>
    <Hsp>
      <Hsp_num>1</Hsp_num>
      <Hsp_bit-score>80.5295</Hsp_bit-score>
      <Hsp_score>46</Hsp_score>
      <Hsp_evalue>3.65762e-15</Hsp_evalue>
      <Hsp_query-from>274</Hsp_query-from>
      <Hsp_query-to>515</Hsp_query-to>
      <Hsp_hit-from>17046</Hsp_hit-from>
      <Hsp_hit-to>17290</Hsp_hit-to>
      <Hsp_query-frame>1</Hsp_query-frame>
      <Hsp_hit-frame>1</Hsp_hit-frame>
      <Hsp_identity>182</Hsp_identity>
      <Hsp_positive>182</Hsp_positive>
      <Hsp_gaps>13</Hsp_gaps>
      <Hsp_align-len>250</Hsp_align-len>
      <Hsp_qseq>ATGCGCGAAAAAGACGAGAAAATTCCGCTGCGTCACGGTGTGCTGGGTCAGGAAACCTGCGGCCCGGGTGGGATCTCTTACGGCATGCGCTCCAT-CGGCGGCGTACTGG-A-A-CTGGTGGATTATATGGAGCAGTACTC-CCCG--AACGCGTGGATGCTGAACTACTCCAACCCGGCGGCGATTGTGGCGGAAG-CGACGCGCCGCCTGCGTCCGCATGCCAAAATTCTCAACATCTGCGATATGCC</Hsp_qseq>
      <Hsp_hseq>ATGCGCGAGCAGGATGAAAAGATCCCGCTGCGCCACGGGGTGGTCGGCCAGGAAACCTGCGGACCCGGCGGGCTGGCATATGGCCTGCGTACTATCCTGC--CG-A-TGGTAGAGCTGATCGATCGGGTGGAACGCTACGCGCACGAAAAGGCGTGGATCGTGAACTACTCCAACCCGGCGGCGATTGTCGCTGAGGGCGT-GCGTCGCCTGCGCCCCAACGCACGCGTGCTGAACATCTGCGATATGCC</Hsp_hseq>
      <Hsp_midline>||||||||  | || || || || |||||||| ||||| ||| | || |||||||||||||| || || ||| |  | || ||| ||||  | || | ||  || | ||| | | ||| | |||    |||| |  ||| | | ||  || ||||||||  |||||||||||||||||||||||||||| || || | ||  ||| |||||||| ||  | ||     | || |||||||||||||||||</Hsp_midline>
    </Hsp>

I am using a script based on the one provide by biopython Cook's book (below) but this is what I get;

(None, None)

from Bio.Blast import NCBIXML
 blast_record = NCBIXML.read(result_handle)
for alignment in blast_record.alignments:
... for hsp in alignment.hsps:
        print hsp.strand

Can anyone help me to extract in which strand the subj.sequence is, please? I would like to do this so I can work with the sequence. What I would like to do is to get to the contigs and add the missing bases but if it is in the strand -1 I need to reversed_complement() the sequence. Thanks

biopython python • 559 views
ADD COMMENTlink modified 12 months ago by toralmanvar730 • written 14 months ago by bio9002910
2
gravatar for toralmanvar
12 months ago by
toralmanvar730
toralmanvar730 wrote:

you may try greping <hsp_qseq> and <hsp_hseq> lines showing alignment of query with that of subject (irrepective of the strand in which it belong) and can replace the missing base in contigs.

ADD COMMENTlink written 12 months ago by toralmanvar730

Thanks, Unluckily, I cannot used grep as I wanted it as part of a python programme I was working on. By now I had finished it. I worked out the strand by the start and end positions.

ADD REPLYlink written 11 months ago by bio9002910
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1731 users visited in the last hour