Question: Parsing error using Entrez.read (BioPython)
5.0 years ago by
bojingjia10
United States
bojingjia10 wrote:

After finding the gene IDs corresponding to a list of gene symbols, I am trying to grab the gene summary for each of those IDs. When I use Entrez.read to parse the summaries I have fetched (using Entrez.esummary), I get a strangely structured list/dictionary whose keys I can't make out. For example, below I try to print the values under "OtherDesignations", and I get a KeyError. Can someone help me out?


import sys
from Bio import Entrez
import xlrd

Entrez.email = "john.doe@mail.com"

wb = xlrd.open_workbook('C:/Users/user/geneSymbolsTest.xlsx')
sh = wb.sheet_by_index(0)
colA = sh.col_values(0)
colA.pop(0)

symbol_list = []
for x in colA:
    symbol_list.append(str(x))

id_list = []
summary = []
parsedSummary = []

for x in symbol_list:

    sterm = x + '[sym] "Mus musculus"[orgn]'
    handle = Entrez.esearch(db="gene", retmode = "xml", term = sterm )
    record = Entrez.read(handle)

    IDArray = record["IdList"]
    toString = str(IDArray[0])
    summary = Entrez.esummary(db="gene", retmode = "xml", id = toString)
    parsedSummary = Entrez.read(summary)
    entry = parsedSummary[0]["OtherDesignation"]
    print(entry)

 

biopython entrez gene • 1.3k views
modified 4.9 years ago by glihm620 • written 5.0 years ago by bojingjia10

Hi, could you please add an example value of "x", so that we can test your code the same way you ran it?

modified 4.9 years ago • written 4.9 years ago by glihm620
4.9 years ago by
glihm620
France
glihm620 wrote:

Bojingjia,

You made a small mistake in the path to the dictionary that contains "OtherDesignations". The lookup should be:

entry = parsedSummary["DocumentSummarySet"]["DocumentSummary"][0]["OtherDesignations"]
print(entry)

When you have a complicated output like this and want to extract one specific field, try to identify each data structure that contains it, from the outermost one inward. I agree with you, it is sometimes a little bit... tedious. It also helps to print the result at each step, so you can see how the data structure is nested and which keys or indices are available at that level.
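If printing each level by hand gets tedious, a small recursive helper can list the path to a given key for you. The sketch below is only an illustration: `parsed_summary` is a hand-written stand-in with invented values that mimics the nesting shown above (no request is sent to NCBI), and `find_key_paths` and `get_by_path` are hypothetical helper names, not part of Biopython. It assumes the object returned by Entrez.read behaves like nested dictionaries and lists.

```python
# Hypothetical stand-in for the result of Entrez.read(); the nesting mirrors
# the lookup path used above, but the values are invented for illustration.
parsed_summary = {
    "DocumentSummarySet": {
        "DocumentSummary": [
            {
                "Name": "ExampleGene",
                "OtherDesignations": "example protein|example alias",
            },
        ],
    },
}


def find_key_paths(obj, target, path=()):
    """Return every path (tuple of keys and list indices) ending at `target`."""
    paths = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                paths.append(path + (key,))
            # Keep descending: the same key may also appear deeper down.
            paths.extend(find_key_paths(value, target, path + (key,)))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            paths.extend(find_key_paths(value, target, path + (index,)))
    return paths


def get_by_path(obj, path):
    """Follow a path of keys/indices, one step at a time."""
    for step in path:
        obj = obj[step]
    return obj


paths = find_key_paths(parsed_summary, "OtherDesignations")
print(paths)

entry = get_by_path(parsed_summary, paths[0])
print(entry)
```

The printed path shows each key and index you need, in order, which is exactly the chain of brackets to write in your own lookup. An empty list means the key does not exist anywhere in the structure, which usually points to a spelling difference in the key name.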

modified 8 months ago by RamRS27k • written 4.9 years ago by glihm620