How to obtain side chain atoms from a PDB file?
6.4 years ago
m.taheri ▴ 50

Hello everybody.

I need to obtain the coordinates of side chain atoms from a PDB file. Are side chain atom coordinates stored in PDB files? How can I retrieve them?

protein sidechain atom pdb • 4.2k views
6.4 years ago
alolex ▴ 940

If the structure was deposited with side chain atoms, they are listed in the ATOM records of the PDB file. All atoms of a residue are listed together, with the same residue name and residue number. Review the PDB file format here (ftp://ftp.wwpdb.org/pub/pdb/doc/format_descriptions/Format_v33_Letter.pdf). To get the coordinates, download the PDB file and either open it as a text file for manual browsing or parse it with a script.
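
As a rough illustration of the script route, here is a minimal Python sketch that reads ATOM records by their fixed column positions, as laid out in the PDB format description. The names `Atom`, `parse_atom_line` and `read_atoms` are made up for this example, and the sketch assumes well-formed, full-width ATOM lines (including the element column).

```python
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    serial: int      # atom serial number (columns 7-11)
    name: str        # atom name, e.g. CA or CB (columns 13-16)
    res_name: str    # residue name, e.g. ALA (columns 18-20)
    chain: str       # chain identifier (column 22)
    res_seq: int     # residue sequence number (columns 23-26)
    x: float         # orthogonal coordinates in Angstroms
    y: float
    z: float
    element: str     # element symbol (columns 77-78)


def parse_atom_line(line: str) -> Atom:
    """Slice one ATOM record by column position (PDB files are fixed-width)."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        element=line[76:78].strip(),
    )


def read_atoms(lines: Iterable[str]) -> List[Atom]:
    """Keep only ATOM records; HETATM, REMARK, etc. are skipped."""
    return [parse_atom_line(line) for line in lines if line.startswith("ATOM  ")]


SAMPLE = "ATOM     55  N   ALA A   9       5.606   4.546  11.941  1.00  3.73           N"
print(parse_atom_line(SAMPLE))
```

For a real file you would pass the open file object to `read_atoms`, for example `with open("structure.pdb") as handle: atoms = read_atoms(handle)`, where the file name is a placeholder.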


Thank you. Are all PDB files in the Protein Data Bank submitted with side chain atoms? I want to write a script to extract side chain atoms from a standard PDB file. For example, here is an alanine residue with the following atoms:

ATOM     55  N   ALA A   9       5.606   4.546  11.941  1.00  3.73           N
ATOM     56  CA  ALA A   9       5.598   5.767  11.082  1.00  3.56           C
ATOM     57  C   ALA A   9       6.441   5.527   9.850  1.00  4.13           C
ATOM     58  O   ALA A   9       6.052   5.933   8.744  1.00  4.36           O
ATOM     59  CB  ALA A   9       6.022   6.977  11.891  1.00  4.80           C


I know that an alanine side chain consists of one carbon atom and three hydrogen atoms. So which of the atoms listed above should be selected? I think atom 57 is a carbon. But where are the hydrogen atoms?


Yes, atoms 56, 57 and 59 are all carbon atoms, but only atom 59 (CB) belongs to the side chain; N, CA, C and O make up the backbone. See this website: (http://www.umass.edu/molvis/decatur/pe2.727/protexpl/help_hyd.htm). PDB files from X-ray crystallography usually do not contain hydrogens because the resolution is rarely high enough to locate them; PDB files from NMR, on the other hand, generally do contain hydrogen atoms. Thus, if it is the hydrogen atoms you are interested in, you may want to filter the PDB files by experimental method before running your script. Additionally, be aware that some PDB files contain only backbone coordinates, such as some of those produced by computational modeling and prediction algorithms. So filtering on the method of acquisition should give you a list of PDB files with side chains and hydrogen atoms that you can parse; however, I can't guarantee that will be the case 100% of the time.
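
To make the selection concrete, here is a small Python sketch, not a definitive implementation. It assumes that the backbone heavy atoms are named N, CA, C and O (plus OXT at the C-terminus), that hydrogens can be recognised from the element column (columns 77-78, which may be blank in older files), and that the experimental method is read from the EXPDTA record. All function names are hypothetical.

```python
# Backbone heavy-atom names; everything else in an ATOM record of a standard
# residue is treated as side chain (an assumption of this sketch).
BACKBONE_NAMES = {"N", "CA", "C", "O", "OXT"}
HYDROGEN_ELEMENTS = {"H", "D"}  # D = deuterium, seen in some neutron structures


def atom_fields(line):
    """Return (serial, atom name, residue name, element) from one ATOM line."""
    return (
        int(line[6:11]),
        line[12:16].strip(),
        line[17:20].strip(),
        line[76:78].strip(),
    )


def side_chain_heavy_atoms(lines):
    """Collect non-backbone, non-hydrogen atoms from ATOM records."""
    selected = []
    for line in lines:
        if not line.startswith("ATOM  "):
            continue
        serial, name, res_name, element = atom_fields(line)
        if name in BACKBONE_NAMES or element in HYDROGEN_ELEMENTS:
            continue
        selected.append((serial, name, res_name))
    return selected


def has_hydrogens(lines):
    """True if any ATOM record carries a hydrogen element symbol."""
    return any(
        line.startswith("ATOM  ") and line[76:78].strip() in HYDROGEN_ELEMENTS
        for line in lines
    )


def experimental_methods(lines):
    """Read the technique(s) from EXPDTA records (semicolon-separated)."""
    text = " ".join(l[10:79].strip() for l in lines if l.startswith("EXPDTA"))
    return [method.strip() for method in text.split(";") if method.strip()]


ALANINE = [
    "ATOM     55  N   ALA A   9       5.606   4.546  11.941  1.00  3.73           N",
    "ATOM     56  CA  ALA A   9       5.598   5.767  11.082  1.00  3.56           C",
    "ATOM     57  C   ALA A   9       6.441   5.527   9.850  1.00  4.13           C",
    "ATOM     58  O   ALA A   9       6.052   5.933   8.744  1.00  4.36           O",
    "ATOM     59  CB  ALA A   9       6.022   6.977  11.891  1.00  4.80           C",
]
print(side_chain_heavy_atoms(ALANINE))
print(has_hydrogens(ALANINE))
```

For the alanine above, only the CB atom passes the filter, and `has_hydrogens` reports that this entry carries no hydrogen records.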


Thanks a lot. Could you please give me a reference for understanding the abbreviated atom names? For example N, H, O, HA, HB1, HB2, and so on.


The last column of the lines in your first comment above holds the actual element symbols as defined in the periodic table. The third column, where you see CA, CB, etc., holds the PDB atom names: the first letter is the element, and the letter after it is a remoteness indicator that advances through the Greek alphabet (A, B, G, D, E, Z, H for alpha, beta, gamma, delta, epsilon, zeta, eta) as you get further away from the backbone. CA is the alpha carbon, which is part of the protein backbone, and CB is the first carbon atom of the side chain. These names uniquely identify each atom within a residue, and they are standard across the PDB. Check this page out for a few more details (http://haldane.bu.edu/needle-doc/new/atom-format.html), or this one (http://www.biostat.jhsph.edu/~iruczins/teaching/260.655/links/pdbformat.pdf). You can also search for "PDB Atom nomenclature" in Google and see if anything helpful turns up.
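
If it helps, the naming scheme can be sketched in a few lines of Python. This is only an illustration: it assumes atom names from the 20 standard amino acids, where the element is a single letter (C, N, O, S or H), and it does not handle older hydrogen names that start with a digit. The function name `describe_atom_name` is made up for this example.

```python
# Remoteness letters used in PDB atom names, mapped to the Greek letter
# they stand for (position relative to the backbone).
REMOTENESS = {
    "A": "alpha",
    "B": "beta",
    "G": "gamma",
    "D": "delta",
    "E": "epsilon",
    "Z": "zeta",
    "H": "eta",
}


def describe_atom_name(name):
    """Split an atom name into (element, position, branch suffix).

    Position is None for names without a remoteness letter, such as the
    backbone N, C and O.
    """
    name = name.strip()
    if name == "OXT":  # extra oxygen at the C-terminus, a special case
        return ("O", "terminal", "")
    element, rest = name[0], name[1:]
    if rest and rest[0] in REMOTENESS:
        return (element, REMOTENESS[rest[0]], rest[1:])
    return (element, None, rest)


for atom_name in ["N", "CA", "CB", "HA", "HB1", "HB2"]:
    print(atom_name, describe_atom_name(atom_name))
```

So HB1 and HB2 read as the first and second hydrogen attached to the beta carbon, and HA as the hydrogen on the alpha carbon.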


Thank you so much.