Question: (Closed) Parsing data from XML in Python
Asked 3.1 years ago by khanshbz1605 (United States):

I have an XML file:

    <swissprot created="2010-12-20">
     <entrylevel dataset="abc">
        <references id="1">
            <title>first references</title>
            <author>
                <person name="Mr. A"/>
                <person name="Mr. B"/>
                <person name="Mr. C"/>
            </author>
            <score> score1 for id 1 </score>
            <score> score2 for id 1 </score>
            <score> score3 for id 1 </score>
        </references>
        <references id="2">
            <title>Second references</title>
            <author>
                <person name="Mr. D"/>
                <person name="Mr. E"/>
                <person name="Mr. F"/>
            </author>
            <score> score1 for id 2 </score>
            <score> score2 for id 2 </score>
            <score> score3 for id 2 </score>
        </references>
        <references id="3">
            <title>third references</title>
            <author>
                <person name="Mr. G"/>
                <person name="Mr. H"/>
                <person name="Mr. I"/>
            </author>
            <score> score1 for id 3 </score>
            <score> score2 for id 3 </score>
            <score> score3 for id 3 </score>
        </references>
        <references id="4">
            <title>fourth references</title>
            <author>
                <person name="Mr. J"/>
                <person name="Mr. K"/>
                <person name="Mr. L"/>
            </author>
            <score> score 1 for id 4 </score>
            <score> score 2 for id 4 </score>
            <score> score 3 for id 4 </score>
        </references>
      </entrylevel>
    </swissprot>  

I want to extract all the references from this XML in the following format:
    Output:

    First Reference
    Mr A, Mr B, Mr C
    score 1 for id 1, score 2 for id 1, score 3 for id 1

    Second Reference
    Mr D, Mr E, Mr F
    score 1 for id 2, score 2 for id 2, score 3 for id 2

    Third Reference
    Mr G, Mr H, Mr I
    score 1 for id 3, score 2 for id 3, score 3 for id 3

    Fourth Reference
    Mr J, Mr K, Mr L
    score 1 for id 4, score 2 for id 4, score 3 for id 4

I wrote the code below. I can get the title values in the correct format, but I cannot get the author information for each individual entry.

    import xml.etree.ElementTree as ET
    document = ET.parse("recipe.xml")
    root = document.getroot()
    title = []
    author = []
    score = []

    for i in root.getiterator('title'):
        title.append(i.text)
        for j in root.getiterator('author'):
            author.append(j.text)
            for k in root.getiterator('score'):
                score.append(k.text)

    for i, j, k in zip(title, author, score):
        print i, j, k

   


Hello khanshbz1605!

We believe that this post does not fit the main topic of this site.

The author names aren't in author.text; they are in the name attribute of each person element, i.e. person.get("name"). You need to iterate over the children of each author element. Anyway, this isn't a bioinformatics question.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

Comment written 3.1 years ago by Devon Ryan
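
For anyone landing here later, the hint in the comment above (read person.get("name") while iterating over the children of each author element) could be sketched roughly as follows. This is only a sketch under some assumptions: it targets Python 3, it embeds a shortened two-entry copy of the XML from the question instead of reading recipe.xml, and format_references is a hypothetical helper name, not something from the original post. It walks one references element at a time, so titles, authors, and scores stay grouped per entry.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (two references only),
# embedded here so the sketch runs without an input file.
SAMPLE_XML = """
<swissprot created="2010-12-20">
  <entrylevel dataset="abc">
    <references id="1">
      <title>first references</title>
      <author>
        <person name="Mr. A"/>
        <person name="Mr. B"/>
        <person name="Mr. C"/>
      </author>
      <score> score1 for id 1 </score>
      <score> score2 for id 1 </score>
      <score> score3 for id 1 </score>
    </references>
    <references id="2">
      <title>Second references</title>
      <author>
        <person name="Mr. D"/>
        <person name="Mr. E"/>
        <person name="Mr. F"/>
      </author>
      <score> score1 for id 2 </score>
      <score> score2 for id 2 </score>
      <score> score3 for id 2 </score>
    </references>
  </entrylevel>
</swissprot>
"""


def format_references(root):
    """Return one text block per <references> element (hypothetical helper)."""
    blocks = []
    # Iterate per entry so each title is paired with its own authors and scores.
    for ref in root.iter("references"):
        title = ref.findtext("title", default="").strip()
        # Author names live in the "name" attribute of each <person> child.
        names = [p.get("name", "") for p in ref.findall("author/person")]
        # Score text is padded with spaces in the XML, so strip it.
        scores = [(s.text or "").strip() for s in ref.findall("score")]
        blocks.append("\n".join([title, ", ".join(names), ", ".join(scores)]))
    return blocks


root = ET.fromstring(SAMPLE_XML.strip())
print("\n\n".join(format_references(root)))
```

To read the real file instead, ET.parse("recipe.xml").getroot() could replace the ET.fromstring call. Note that the sketch prints the values as they appear in the XML (for example "Mr. A" and "first references"), so matching the exact capitalization and punctuation of the desired output would need some extra string handling.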