Question: how to get the strand from a blast xml file
0
gravatar for bio90029
16 months ago by
bio9002910
bio9002910 wrote:

Hi, I am working with a blast xml as the one below:

 <Hit_num>2</Hit_num>
  <Hit_id>gnl|BL_ORD_ID|79</Hit_id>
  <Hit_def>NODE_82_length_56137_cov_22.281775</Hit_def>
  <Hit_accession>79</Hit_accession>
  <Hit_len>56195</Hit_len>
  <Hit_hsps>
    <Hsp>
      <Hsp_num>1</Hsp_num>
      <Hsp_bit-score>80.5295</Hsp_bit-score>
      <Hsp_score>46</Hsp_score>
      <Hsp_evalue>3.65762e-15</Hsp_evalue>
      <Hsp_query-from>274</Hsp_query-from>
      <Hsp_query-to>515</Hsp_query-to>
      <Hsp_hit-from>17046</Hsp_hit-from>
      <Hsp_hit-to>17290</Hsp_hit-to>
      <Hsp_query-frame>1</Hsp_query-frame>
      <Hsp_hit-frame>1</Hsp_hit-frame>
      <Hsp_identity>182</Hsp_identity>
      <Hsp_positive>182</Hsp_positive>
      <Hsp_gaps>13</Hsp_gaps>
      <Hsp_align-len>250</Hsp_align-len>
      <Hsp_qseq>ATGCGCGAAAAAGACGAGAAAATTCCGCTGCGTCACGGTGTGCTGGGTCAGGAAACCTGCGGCCCGGGTGGGATCTCTTACGGCATGCGCTCCAT-CGGCGGCGTACTGG-A-A-CTGGTGGATTATATGGAGCAGTACTC-CCCG--AACGCGTGGATGCTGAACTACTCCAACCCGGCGGCGATTGTGGCGGAAG-CGACGCGCCGCCTGCGTCCGCATGCCAAAATTCTCAACATCTGCGATATGCC</Hsp_qseq>
      <Hsp_hseq>ATGCGCGAGCAGGATGAAAAGATCCCGCTGCGCCACGGGGTGGTCGGCCAGGAAACCTGCGGACCCGGCGGGCTGGCATATGGCCTGCGTACTATCCTGC--CG-A-TGGTAGAGCTGATCGATCGGGTGGAACGCTACGCGCACGAAAAGGCGTGGATCGTGAACTACTCCAACCCGGCGGCGATTGTCGCTGAGGGCGT-GCGTCGCCTGCGCCCCAACGCACGCGTGCTGAACATCTGCGATATGCC</Hsp_hseq>
      <Hsp_midline>||||||||  | || || || || |||||||| ||||| ||| | || |||||||||||||| || || ||| |  | || ||| ||||  | || | ||  || | ||| | | ||| | |||    |||| |  ||| | | ||  || ||||||||  |||||||||||||||||||||||||||| || || | ||  ||| |||||||| ||  | ||     | || |||||||||||||||||</Hsp_midline>
    </Hsp>

I am using a script based on the one provide by biopython Cook's book (below) but this is what I get;

(None, None)

from Bio.Blast import NCBIXML
 blast_record = NCBIXML.read(result_handle)
for alignment in blast_record.alignments:
... for hsp in alignment.hsps:
        print hsp.strand

Can anyone help me to extract in which strand the subj.sequence is, please? I would like to do this so I can work with the sequence. What I would like to do is to get to the contigs and add the missing bases but if it is in the strand -1 I need to reversed_complement() the sequence. Thanks

biopython python • 615 views
ADD COMMENTlink modified 14 months ago by toralmanvar750 • written 16 months ago by bio9002910
2
gravatar for toralmanvar
14 months ago by
toralmanvar750
toralmanvar750 wrote:

you may try greping <hsp_qseq> and <hsp_hseq> lines showing alignment of query with that of subject (irrepective of the strand in which it belong) and can replace the missing base in contigs.

ADD COMMENTlink written 14 months ago by toralmanvar750

Thanks, Unluckily, I cannot used grep as I wanted it as part of a python programme I was working on. By now I had finished it. I worked out the strand by the start and end positions.

ADD REPLYlink written 13 months ago by bio9002910
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 911 users visited in the last hour