Question

Off topic: Parsing data from XML in Python

8.5 years ago

khanshbz1605 • 0

I have an XML file:

<swissprot created="2010-12-20">
 <entrylevel dataset="abc">
    <references id="1">
        <title>first references</title>
        <author>
            <person name="Mr. A"/>
            <person name="Mr. B"/>
            <person name="Mr. C"/>
        </author>
        <score> score1 for id 1 </score>
        <score> score2 for id 1 </score>
        <score> score3 for id 1 </score>
    </references>
    <references id="2">
        <title>Second references</title>
        <author>
            <person name="Mr. D"/>
            <person name="Mr. E"/>
            <person name="Mr. F"/>
        </author>
        <score> score1 for id 2 </score>
        <score> score2 for id 2 </score>
        <score> score3 for id 2 </score>
    </references>
    <references id="3">
        <title>third references</title>
        <author>
            <person name="Mr. G"/>
            <person name="Mr. H"/>
            <person name="Mr. I"/>
        </author>
        <score> score1 for id 3 </score>
        <score> score2 for id 3 </score>
        <score> score3 for id 3 </score>
    </references>
    <references id="4">
        <title>fourth references</title>
        <author>
            <person name="Mr. J"/>
            <person name="Mr. K"/>
            <person name="Mr. L"/>
        </author>
        <score> score 1 for id 4 </score>
        <score> score 2 for id 4 </score>
        <score> score 3 for id 4 </score>
    </references>
  </entrylevel>
</swissprot>

I want to extract all the references from this XML file in a specific format:

First Reference
Mr A, Mr B, Mr C
score 1 for id 1, score 2 for id 1, score 3 for id 1

Second Reference
Mr D, Mr E, Mr F
score 1 for id 2, score 2 for id 2, score 3 for id 2

Third Reference
Mr G, Mr H, Mr I
score 1 for id 3, score 2 for id 3, score 3 for id 3

Fourth Reference
Mr J, Mr K, Mr L
score 1 for id 4, score 2 for id 4, score 3 for id 4

I wrote the code below. I am able to get the title values in the correct format, but I am not able to get the author information that belongs to each entry.

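import xml.etree.ElementTree as ET

document = ET.parse("recipe.xml")
root = document.getroot()
title = []
author = []
score = []

# Each loop searches the whole tree from root, so the authors and scores
# collected here are not tied to the title currently being visited.
for i in root.iter('title'):
    title.append(i.text)
    for j in root.iter('author'):
        # <author> has no text of its own; the names are "name" attributes of <person>
        author.append(j.text)
        for k in root.iter('score'):
            score.append(k.text)

for i, j, k in zip(title, author, score):
    print(i, j, k)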
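
One way to keep the title, authors, and scores grouped per entry is to loop over each references element and search only inside it, rather than searching from the root each time. The following is a minimal sketch of that idea, not a definitive solution. The SAMPLE_XML string is a trimmed, hypothetical stand-in for the file in the question, and format_references is a helper name made up for illustration. It reads author names from the name attribute of each person element, since the author element itself contains no text.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample, trimmed from the XML shown in the question.
SAMPLE_XML = """<swissprot created="2010-12-20">
  <entrylevel dataset="abc">
    <references id="1">
      <title>first references</title>
      <author>
        <person name="Mr. A"/>
        <person name="Mr. B"/>
      </author>
      <score> score1 for id 1 </score>
      <score> score2 for id 1 </score>
    </references>
    <references id="2">
      <title>Second references</title>
      <author>
        <person name="Mr. D"/>
        <person name="Mr. E"/>
      </author>
      <score> score1 for id 2 </score>
      <score> score2 for id 2 </score>
    </references>
  </entrylevel>
</swissprot>"""


def format_references(root):
    """Return one three-line text block per <references> element.

    format_references is an illustrative helper, not part of any library.
    """
    blocks = []
    for ref in root.iter("references"):
        # Search relative to this <references> element only.
        title = ref.findtext("title", default="").strip()
        # Names are stored in the "name" attribute, not in element text.
        authors = [p.get("name", "") for p in ref.findall("author/person")]
        scores = [(s.text or "").strip() for s in ref.findall("score")]
        blocks.append("\n".join([title, ", ".join(authors), ", ".join(scores)]))
    return blocks


root = ET.fromstring(SAMPLE_XML)
# Separate entries with a blank line, as in the desired output.
print("\n\n".join(format_references(root)))
```

To read from a file instead, ET.parse("recipe.xml").getroot() can replace the ET.fromstring call. The sketch prints titles, names, and scores exactly as they appear in the XML, so any change in capitalization or punctuation (for example "Mr A" instead of "Mr. A") would need an extra formatting step.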

xml python • 1.9k views
