Question: Biopython retrieve atom coordinates and residue name
underoath006 wrote, 19 months ago:

Hello,

I would like to use Biopython to retrieve the coordinates of the CA (C-alpha) atoms and the name of the residue each one belongs to.

The input is a .pdb file. The output should be a dictionary with key = (x, y, z) and value = residue name. I'm completely new to Biopython and have yet to read the tutorial, but for now I would appreciate it if someone could script this for me.

Sincerest regards.

biopython python pdb • 1.7k views
IP (University of Copenhagen, Denmark) wrote, 19 months ago:

As you said, you need to read the Biopython documentation. The Bio.PDB module is well documented, with examples; section 8.3 of the docs explains how to iterate through a structure object.

from Bio.PDB import PDBParser

p = PDBParser()
structure = p.get_structure('X', 'pdb1fat.ent')
for model in structure:
    for chain in model:
        for residue in chain:
            for atom in residue:
                print(atom)

Then, if you want to select the C-alpha atom of a residue, you can do so as follows:

residue['CA']

Finally, to get the x, y, z coordinates of the C-alpha atom, you can use the get_vector() method:

residue['CA'].get_vector()

Thank you. Two things. First, my protein contains a ZN atom, and the code below gives a KeyError: 'CA'. BTW, does that mean Biopython is still buggy? Second, the output contains a lot of weird formatting, such as "<Vector 1.34, 0.22, -0.72>"; I only want the numbers. I also want to retrieve the residue letter of the CA atom I'm operating on.

from Bio.PDB import *
import numpy as np
CA_coordinates = np.array([])
p=PDBParser()
structure=p.get_structure('name', '1dsq_n.pdb')
for model in structure:
    for chain in model:
        for residue in chain:
            CA_coordinates = np.append(CA_coordinates, residue['CA'].get_vector())
(comment by underoath006, 19 months ago)

I think you should read the documentation (if you have already read it, read it again) before asking how to get the residue; you will find the information there. As for your problem with ZN atoms, I would use a try/except block inside the residue loop:

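for residue in chain:
    try:
        # Raises KeyError when the residue has no CA atom (e.g. a ZN ion)
        ca_atom = residue['CA']
    except KeyError:
        # What to do when the exception is raised, for example skip the residue
        continue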
(reply by IP, 19 months ago)
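
Putting the pieces from this thread together, the requested dictionary could be built roughly as sketched below. This is only a minimal sketch, not a tested solution for the asker's file: the function names ca_coordinates_by_residue and ca_coordinates_from_pdb are made up for illustration, get_coord() is used in place of get_vector() so that plain numbers come back instead of a Vector object, and hetero residues (water, ions such as ZN) are skipped on the assumption that only CA atoms of standard residues are wanted.

```python
def ca_coordinates_by_residue(structure):
    """Map each C-alpha position (x, y, z) to the name of its residue.

    `structure` is expected to behave like a Bio.PDB Structure:
    iterating yields models, then chains, then residues.
    """
    ca_to_resname = {}
    for model in structure:
        for chain in model:
            for residue in chain:
                # residue.id[0] is the hetero flag: " " for standard residues,
                # "W" for water, "H_..." for ligands and ions such as ZN.
                if residue.id[0] != " ":
                    continue
                try:
                    ca_atom = residue["CA"]
                except KeyError:
                    # Incomplete residue without a C-alpha atom: skip it.
                    continue
                # get_coord() returns the three coordinates as numbers;
                # convert them to a tuple of floats so they can be a dict key.
                x, y, z = (float(v) for v in ca_atom.get_coord())
                ca_to_resname[(x, y, z)] = residue.get_resname()
    return ca_to_resname


def ca_coordinates_from_pdb(pdb_path):
    """Parse a PDB file and return the CA-coordinate dictionary (hypothetical helper)."""
    # Imported here so the function above does not itself require Biopython.
    from Bio.PDB import PDBParser

    parser = PDBParser(QUIET=True)  # QUIET suppresses parser warnings
    structure = parser.get_structure("name", pdb_path)
    return ca_coordinates_by_residue(structure)
```

With the file from the comment above, this would be called as ca_coordinates_from_pdb('1dsq_n.pdb'). The values are three-letter residue names such as 'ALA'; if one-letter codes are wanted, Bio.SeqUtils.seq1 can convert them. The KeyError itself is expected behavior rather than a bug: a ZN residue simply contains no atom named CA.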