Question: Extracting the <Hit_def> from Blast xml output using Biopython and saving in .csv
3.8 years ago by
Anushka20 wrote:

I have BLAST output in .xml format and I want to retrieve a few attributes such as <Hit_def>. I found this parser in Biopython:

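from Bio.Blast import NCBIXML

# Open in default text mode; the 'rU' mode was removed in Python 3.11.
with open('output.xml') as handle:
    for record in NCBIXML.parse(handle):
        for align in record.alignments:
            for hsp in align.hsps:
                # Raw HSP score followed by the hit description
                print(hsp.score, align.hit_def)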

Q: The code above only prints the output to the terminal. Could anyone help me store the output in a .csv file?

Specifically, I need output.csv with the attributes <Iteration_query-def>, <Hit_def>, <Hsp_score>, and <Hsp_evalue> as columns.

Q2: How can I get the result for only the best hit of each query? Would setting -max_target_seqs to 1 when running blastp do the same?

The following is a segment of my input XML:

          <Hit_def>low-density lipoprotein receptor-related protein 6 precursor [Homo sapiens] &gt;gi|578822872|ref|XP_006719141.1| PREDICTED: low-density lipoprotein receptor-related protein 6 isoform X1 [Homo sapiens]</Hit_def>
              <Hsp_midline>+N C   +  C H+CL R  G   C C  GF L+S  K C+   V + ++L +     R   L    +        V +  A+D D VTD+RIY   +  KT   A+ + SA E V  +G       D    +      K +YW   TG    + VS    +   V  + D    R + +D     +YW E+</Hsp_midline>
              <Hsp_midline>NEC  S   C H+CLA   GGFVC C   ++L +  +  S   T            +V D  Q     LPI  S RNV    AID D + D ++Y</Hsp_midline>

I would really appreciate your help.



Using xsltproc rather than Python would be straightforward.

Comment written 3.8 years ago by Pierre Lindenbaum
3.8 years ago by
Houston, TX
RamRS17k wrote:

You could redirect output to a CSV file using File IO. Open a file in write mode and modify the print so it writes into the file. Here's one of many resources:

Google away for more. This link should help you get the attributes you require.
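
To make the file I/O suggestion concrete, here is a minimal sketch using Python's standard csv module. The function names write_hits_csv and blast_xml_to_csv are made up for illustration, and the sketch assumes a Biopython version that still ships Bio.Blast.NCBIXML. The attribute mapping it relies on is record.query for <Iteration_query-def>, align.hit_def for <Hit_def>, hsp.score for <Hsp_score>, and hsp.expect for <Hsp_evalue>.

```python
import csv


def write_hits_csv(records, out_handle):
    """Write one CSV row per HSP and return the number of rows written.

    `records` is any iterable of objects shaped like the records
    yielded by NCBIXML.parse (hypothetical helper, not part of Biopython).
    """
    writer = csv.writer(out_handle)
    # Header row named after the XML tags requested in the question
    writer.writerow(["Iteration_query-def", "Hit_def", "Hsp_score", "Hsp_evalue"])
    n_rows = 0
    for record in records:
        for align in record.alignments:
            for hsp in align.hsps:
                writer.writerow([record.query, align.hit_def, hsp.score, hsp.expect])
                n_rows += 1
    return n_rows


def blast_xml_to_csv(xml_path, csv_path):
    """Parse a BLAST XML file and save the selected fields as CSV."""
    # Imported here so the helper above works without Biopython installed
    from Bio.Blast import NCBIXML

    # newline="" lets the csv module control line endings itself
    with open(xml_path) as xml_handle, open(csv_path, "w", newline="") as csv_handle:
        return write_hits_csv(NCBIXML.parse(xml_handle), csv_handle)
```

Calling blast_xml_to_csv('output.xml', 'output.csv') should then produce the file described in the question. Using csv.writer rather than joining strings by hand matters here because hit descriptions can contain commas, which the writer quotes automatically.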

Q2: Best hit is an ambiguous term. Each hit can have multiple HSPs and you'd need to average or sum across HSP scores to find the "best" alignment.
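
As a rough sketch of one way to pin down "best", the hypothetical helper below scores each hit by combining its HSP scores and keeps the top one per query. The name best_hit and the combine parameter are assumptions for illustration; passing max ranks hits by their single strongest HSP, while passing sum ranks them by the total across HSPs. Which definition is appropriate depends on your analysis.

```python
def best_hit(record, combine=max):
    """Return (alignment, score) for the top-scoring hit of one query.

    `record` is shaped like a record from NCBIXML.parse. `combine`
    reduces the HSP scores of a hit to one number (for example max or sum).
    Returns None when the query has no hits with HSPs.
    """
    best = None
    for align in record.alignments:
        if not align.hsps:
            continue  # skip hits without any HSP
        score = combine(hsp.score for hsp in align.hsps)
        # Strict '>' keeps the earliest hit when scores tie
        if best is None or score > best[1]:
            best = (align, score)
    return best
```

Inside the loop over records you could call best_hit(record) and write only that hit to the CSV instead of every HSP.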

