ZDOCK Benchmark PDB files have unusual format
4.0 years ago
james ▴ 20

I am new to working with PDB format files, and I am having difficulty working with the ZDOCK Benchmark files.

Their input PDBs for generating decoys seem to have 2 extra columns, and the filenames end with *.pdb.ms

Does anyone know what type of files these really are?

The output decoy PDBs generated by their software maintain these extra columns. For example:

ATOM      1  N   GLU A   6      72.093  26.103  78.886  8     1 1.63         -0.15
ATOM      2  CA  GLU A   6      71.909  24.863  78.143  8     1 2.03          0.10
ATOM      3  C   GLU A   6      70.753  24.029  78.676  8     1 1.67          0.60
ATOM      4  O   GLU A   6      70.717  23.551  79.806  8     1 1.38         -0.55

Column 10 (the integer 8) and the last column do not look like standard PDB fields. Can I just ignore these columns and create a standard PDB file?


Sorry for the late reply. I thought I would be notified by email if anyone responded.

The problem is that in the example I gave, the fields do not match the ATOM specification. They only approximately correspond. Here is what a pdb file downloaded with pdb-tools looks like. It matches the ATOM specification exactly:

ATOM      1  N   LYS A   4      28.189   5.020  62.680  1.00 68.66           N  
ATOM      2  CA  LYS A   4      27.705   5.368  64.017  1.00 67.66           C  
ATOM      3  C   LYS A   4      26.198   5.204  64.109  1.00 64.00           C  
ATOM      4  O   LYS A   4      25.398   5.669  63.303  1.00 63.53           O

You can see the difference between the ZDOCK files I was posting about, and the "legit" pdb format.
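To make the mismatch concrete, here is a minimal Python sketch that slices an ATOM line by the fixed character columns of the PDB specification and checks whether the occupancy (columns 55-60) and temperature factor (columns 61-66) parse as numbers. The function names `parse_atom_fields` and `tail_is_standard` are made up for illustration, and the check assumes the column spacing of the lines is preserved exactly as in the files.

```python
def parse_atom_fields(line):
    """Slice an ATOM/HETATM record by fixed character columns.

    Comments give the 1-based columns from the PDB specification.
    """
    return {
        "serial": line[6:11].strip(),        # cols 7-11
        "name": line[12:16].strip(),         # cols 13-16
        "res_name": line[17:20].strip(),     # cols 18-20
        "chain": line[21:22].strip(),        # col 22
        "res_seq": line[22:26].strip(),      # cols 23-26
        "x": float(line[30:38]),             # cols 31-38
        "y": float(line[38:46]),             # cols 39-46
        "z": float(line[46:54]),             # cols 47-54
        "occupancy": line[54:60].strip(),    # cols 55-60 (kept as text)
        "temp_factor": line[60:66].strip(),  # cols 61-66 (kept as text)
        "element": line[76:78].strip(),      # cols 77-78
    }


def tail_is_standard(line):
    """Return True if occupancy and temperature factor both parse as floats."""
    try:
        float(line[54:60])
        float(line[60:66])
    except ValueError:
        return False
    return True
```

With the spacing shown in this thread, the coordinates of both kinds of lines parse the same way; only the fields after column 54 differ.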


I already said what I thought was pertinent to your problem, but I will repeat:

nothing that is in occupancy field and beyond (starting with column #55) should break them.

After the XYZ atom coordinates (that is, starting with character column #55), many programs put information into their PDB output that may or may not follow the specification. This shouldn't matter, as most programs either completely ignore everything that comes after the XYZ coordinates, or at least do not rely on that information in any serious way. Unless you tried loading ZDOCK's PDB files and that failed somehow, I think you should not worry about anything beyond character column #55 (or anything beyond the 9th whitespace-separated column overall).
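If a downstream tool does choke on what follows the coordinates, one possible workaround is to keep each ATOM/HETATM line up to column 54 and write placeholder values after it. The following is only a sketch under assumptions: the function names are hypothetical, the placeholders (occupancy 1.00, B-factor 0.00) are arbitrary choices, the element field is left out, and it discards whatever ZDOCK stores in those trailing columns.

```python
def normalize_atom_line(line, occupancy=1.00, b_factor=0.00):
    """Keep columns 1-54 of an ATOM/HETATM line and append placeholder fields.

    Other record types are returned unchanged (minus the trailing newline).
    The placeholder occupancy and B-factor are assumptions, not ZDOCK data.
    """
    line = line.rstrip("\n")
    if not line.startswith(("ATOM", "HETATM")):
        return line
    core = line[:54].ljust(54)  # record name through the z coordinate
    return f"{core}{occupancy:6.2f}{b_factor:6.2f}"


def normalize_pdb(in_path, out_path):
    """Rewrite a whole file line by line using normalize_atom_line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(normalize_atom_line(line) + "\n")
```

For example, `normalize_pdb("input.pdb.ms", "output.pdb")` (both file names are placeholders) would write a copy with the trailing fields replaced. Whether a particular program accepts the result, or needs the element column filled in as well, would have to be checked against that program.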


Ah, thank you. I didn't read the original answer carefully enough.

Unfortunately, the program I am trying to use (https://github.com/kiharalab/DOVE) exits with an error depending on what comes after the z-coordinate. I will look into the code to try to determine whether it actually relies on that information...


Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

4.0 years ago
Mensur Dlakic ★ 27k

These are not "extra" columns, though they may not contain the information that is normally there. The explanation of all fields in ATOM lines is here. Some programs read the information beyond the 9th column (z-coordinate) and others do not, but nothing in the occupancy field and beyond (starting with character column #55) should break them. Unless you have some reason that is not obvious to me, you are probably OK leaving the file as it is.

PS An older guide to PDB format is here.
