Question: How To Get "Journals" Information From Genbank Entry Using Python
fm271 wrote:
LOCUS       AAW04511                  12 aa            linear   PAT 15-DEC-2004
DEFINITION  Sequence 89 from patent US 6790938.
ACCESSION   AAW04511
VERSION     AAW04511.1  GI:56612658
DBSOURCE    accession AAW04511.1
KEYWORDS    .
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (residues 1 to 12)
  AUTHORS   Berchtold,P. and Escher,R.F.A.
  TITLE     Anti-GPIIb/IIIa recombinant antibodies
  JOURNAL   Patent: US 6790938-A 89 14-SEP-2004;
            ASAT AG Applied Science & Technology; Zug;
            DEX;
  REMARK    CAMBIA Patent Lens: US 6790938
FEATURES             Location/Qualifiers
     source          1..12
                     /organism="unknown"
ORIGIN      
        1 gsgsylgyyf dy
//

In the GenBank entry above, how can I get the "journal" and "remark" information listed under "REFERENCE"? I can access the authors and title, but not the journal or remark.

from Bio import Entrez, SeqIO
handle = Entrez.efetch(db="protein", id="AAW04511", rettype="gb")
seq_record = SeqIO.read(handle, "genbank")
seqAnn = seq_record.annotations
seqAnn['references'][0].title
seq_record.annotations['references'][0].authors

Any help will be appreciated.

Tags: python, biopython
modified 5.5 years ago • written 5.5 years ago by fm271

The line beginning "seqAnn" seems to be irrelevant to your problem.

written 5.5 years ago by Neilfws

Sorry, I forgot to include that line; I have edited the question. Peter has already answered it, though.

modified 5.5 years ago • written 5.5 years ago by fm271
Peter wrote:

Thank you for posting a self-contained example :)

>>> from Bio import Entrez, SeqIO
>>> Entrez.email = "Your.Name@example.org"
>>> handle = Entrez.efetch(db="protein", id="AAW04511", rettype="gb")
>>> seq_record = SeqIO.read(handle, "genbank")
>>> seq_record.annotations['references'][0].authors
'Berchtold,P. and Escher,R.F.A.'
>>> seq_record.annotations['references'][0].title
'Anti-GPIIb/IIIa recombinant antibodies'
>>> seq_record.annotations['references'][0].journal
'Patent: US 6790938-A 89 14-SEP-2004; ASAT AG Applied Science & Technology; Zug; DEX;'
>>> seq_record.annotations['references'][0].comment
'CAMBIA Patent Lens: US 6790938'
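
To pull the same four fields from every reference in a record, rather than only the first, a small helper along these lines may be useful. This is only a sketch: `reference_fields` and `FIELDS` are hypothetical names, and the stand-in record built with `SimpleNamespace` exists purely so the snippet runs without network access. With Biopython you would pass the `seq_record` from the session above instead.

```python
from types import SimpleNamespace

# Attribute names taken from the session above.
FIELDS = ("authors", "title", "journal", "comment")


def reference_fields(record):
    """Return one dict per reference, holding the attributes named in FIELDS."""
    rows = []
    for ref in record.annotations.get("references", []):
        # getattr with a default avoids an AttributeError if a field is absent.
        rows.append({name: getattr(ref, name, "") for name in FIELDS})
    return rows


# Stand-in for a parsed record (an assumption for illustration only);
# the values are copied from the GenBank entry in the question.
fake_ref = SimpleNamespace(
    authors="Berchtold,P. and Escher,R.F.A.",
    title="Anti-GPIIb/IIIa recombinant antibodies",
    journal="Patent: US 6790938-A 89 14-SEP-2004; "
            "ASAT AG Applied Science & Technology; Zug; DEX;",
    comment="CAMBIA Patent Lens: US 6790938",
)
fake_record = SimpleNamespace(annotations={"references": [fake_ref]})

for row in reference_fields(fake_record):
    for name, value in row.items():
        print(f"{name}: {value}")
```

Returning plain dictionaries keeps the result easy to write out as CSV or JSON later on.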

I'm surprised you didn't guess it was just .journal, given you'd found .title and .authors fine. Note that the REMARK line is stored as .comment rather than .remark. Here's a useful tip for exploring a new data structure in Python: the dir(...) function lists all of an object's attributes and methods (for now, ignore the ones starting with an underscore):

>>> dir(seq_record.annotations['references'][0])
[..., 'authors', 'comment', 'consrtm', 'journal', 'location', 'medline_id', 'pubmed_id', 'title']
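
The "ignore the ones starting with an underscore" step can be automated with a short filter. The sketch below is illustrative only: `public_attributes` is a hypothetical helper name, and `FakeReference` is a stand-in class so the snippet runs without Biopython. In a real session you would pass `seq_record.annotations['references'][0]` instead.

```python
def public_attributes(obj):
    """List attribute and method names that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeReference:
    """Hypothetical stand-in carrying a few of the attribute names shown above."""

    def __init__(self):
        self.authors = ""
        self.title = ""
        self.journal = ""
        self.comment = ""


# dir() returns names in alphabetical order, so the filtered list is sorted too.
print(public_attributes(FakeReference()))
# ['authors', 'comment', 'journal', 'title']
```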
modified 5.5 years ago • written 5.5 years ago by Peter

Thanks for the reply. I remember trying this, but I must have been doing something wrong.

written 5.5 years ago by fm271