How to fix white space in PDB file so Pymol can read it properly (i.e. Pymol cannot read TSV)
15 months ago
Sam • 0

Hello,

I have come to realize that PyMOL, for some reason, cares a lot about white space. Modifying the elements of a single "ATOM" line and changing its white space completely breaks PyMOL's ability to read it.

E.g., PyMOL has no issue reading this:

ATOM   6219  OXT LEU B 380.0    53.326  48.908  20.774  1.00 66.40       O

But it cannot read this:

ATOM    6219    OXT LEU B   380.0   53.326  48.908  20.774  1.00    66.40   O

The bottom line is TSV; the top line has, from what I can tell, a bunch of random white space. Is there any tool or way to convert the white space from TSV to whatever format PyMOL likes?

Pymol python • 1.6k views

Looking at a PDB file, I see the following non-printing characters. They are not random: they pad the columns so that the columns are always aligned.

[Screenshot: PDB file with non-printing characters displayed]

Further down in the file we see

[Screenshot: a later part of the same PDB file]

So you will need to make sure that the spaces you substitute for the tabs pad every line to the same number of characters, so that the columns stay aligned.

Edit: I just tested a modified file and it worked in PyMOL. BTW, there are no tabs in the ATOM section of the PDB file I got from the PDB.


Check the non-printing characters in your file (something like cat -vet should work) and then replace them with tabs. There may be a PDB format validation program out there that you could use as well.
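
If `cat` is not at hand, the same check can be sketched in Python. The helper name `show_whitespace` is made up for this example, and `tsv_line` is a reconstruction of the question's second line that assumes its separators really are tabs. Tabs are shown as `^I`, a carriage return as `^M`, and the end of the line as `$`, roughly as `cat -vet` would display them:

```python
def show_whitespace(line: str) -> str:
    """Make tabs and line endings visible, roughly like `cat -vet`."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    visible = body.replace("\t", "^I")  # tab -> ^I
    if "\r" in ending:
        visible += "^M"                 # Windows-style carriage return
    return visible + "$"                # end-of-line marker


# Assumption: the unreadable line in the question is tab-separated like this.
tsv_line = "ATOM\t6219\tOXT\tLEU\tB\t380.0\t53.326\t48.908\t20.774\t1.00\t66.40\tO\n"
# The readable line from the question, which uses spaces only.
pdb_line = "ATOM   6219  OXT LEU B 380.0    53.326  48.908  20.774  1.00 66.40       O\n"

print(show_whitespace(tsv_line))
print(show_whitespace(pdb_line))
```

Any line that prints with `^I` in it contains tabs and will not line up with the fixed PDB columns.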


I'm not trying to replace the white space with tabs. I'm trying to convert the tabs into the white space PyMOL likes. The issue is that, from what I can tell, the white space is random and different for each element in the line. I'm sure there is some pattern the program likes, but I don't know what that pattern is. I'm hoping someone either has a tool that will convert TSV into whatever pattern the program likes, or can tell me the pattern so I can quickly convert it myself. Or maybe there is an option in PyMOL that enables it to read TSV PDB files.

15 months ago
Mensur Dlakic ★ 27k

PDB files have a strict format which is described here. Programs that read PDB files expect this format and will be inflexible even to small deviations from it.
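
To illustrate what "strict" means here: in an ATOM record every field lives at a fixed character position, so a reader slices the line by column instead of splitting on white space. The sketch below uses the column ranges for ATOM/HETATM records as I understand them from the PDB format description; `ATOM_COLUMNS` and `parse_atom_line` are names made up for this example, and the sample line is hand-built from the values in the question (with the residue number written as 380).

```python
# (field, first column, last column): 1-based and inclusive.
ATOM_COLUMNS = [
    ("record", 1, 6),
    ("serial", 7, 11),
    ("name", 13, 16),
    ("altLoc", 17, 17),
    ("resName", 18, 20),
    ("chainID", 22, 22),
    ("resSeq", 23, 26),
    ("iCode", 27, 27),
    ("x", 31, 38),
    ("y", 39, 46),
    ("z", 47, 54),
    ("occupancy", 55, 60),
    ("tempFactor", 61, 66),
    ("element", 77, 78),
    ("charge", 79, 80),
]


def parse_atom_line(line: str) -> dict:
    """Slice one ATOM/HETATM line by column position, not by white space."""
    padded = line.rstrip("\r\n").ljust(80)
    return {name: padded[first - 1:last].strip() for name, first, last in ATOM_COLUMNS}


# Hand-built 80-column example; each piece fills exactly its own columns.
example = (
    "ATOM  " + " 6219" + " " + " OXT" + " " + "LEU" + " " + "B" + " 380" + " "
    + "   " + "  53.326" + "  48.908" + "  20.774" + "  1.00" + " 66.40"
    + " " * 10 + " O" + "  "
)
fields = parse_atom_line(example)
print(fields["serial"], fields["name"], fields["resSeq"], fields["x"], fields["element"])
```

A tab-separated line fails this kind of slicing because a tab is a single character: every later field ends up at the wrong position.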


Thank you, this is exactly what I was looking for! Do you know of anything that can convert a TSV PDB file into one with the proper spacing? (I can write my own; I'm just curious whether it already exists, so I don't need to reinvent the wheel.)

Edit: The same question applies to any delimiter (CSV, TSV, space, etc.).

15 months ago
Michael 54k

The easiest way to get there is to grab any language that has support for the sprintf C function (that is about any language after Turbo Pascal and Commodore BASIC), parse the tab-separated file into an internal representation, and write out the columns line by line using the format definition given here: https://www.giorginolab.it/blog/pdbatomformatforsprintf

fmt="%-6s%5d %-4s%1s%-3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f      %-4s%2s%-2s";

In Python, this could look something like this (I'm no Python programmer; no checks, no boilerplate):

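fmt = "%-6s%5d %-4s%1s%-3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f      %-4s%2s%-2s"
for line in file:  # assumes 16 tab-separated fields per line, in PDB column order; ofc you might do some sanity checking here
    f = line.rstrip("\n").split("\t")
    print(fmt % (f[0], int(f[1]), *f[2:6], int(f[6]), f[7], *map(float, f[8:13]), *f[13:16]))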
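
For the file in the question, a slightly fuller sketch may be useful, since its rows have only 12 fields rather than the 16 the format string expects. The column order below is an assumption read off the question's example line, `tsv_to_pdb_line` is a name made up here, and the fields missing from the TSV (altLoc, iCode, segment ID, charge) are simply written as blanks:

```python
# The format string quoted in the answer above: 16 fields, 80 columns in total.
FMT = "%-6s%5d %-4s%1s%-3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f      %-4s%2s%-2s"


def tsv_to_pdb_line(tsv_line: str) -> str:
    """Convert one 12-column tab-separated ATOM row to a fixed-width PDB line.

    Assumed column order (taken from the example in the question):
    record, serial, atom name, residue name, chain, residue number,
    x, y, z, occupancy, B-factor, element.
    """
    fields = tsv_line.rstrip("\r\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 tab-separated fields, got {len(fields)}")
    record, serial, name, res_name, chain, res_seq, x, y, z, occ, bfac, element = fields

    # By convention, atom names of one-letter elements start in column 14,
    # so names shorter than four characters get a leading space.
    if len(name) < 4 and len(element) == 1:
        name = " " + name

    return FMT % (
        record,
        int(serial),
        name,
        "",                   # altLoc: not present in the TSV
        res_name,
        chain,
        int(float(res_seq)),  # the question's file stores 380 as "380.0"
        "",                   # iCode: not present in the TSV
        float(x), float(y), float(z),
        float(occ), float(bfac),
        "",                   # segment ID: not present in the TSV
        element,
        "",                   # charge: not present in the TSV
    )


row = "ATOM\t6219\tOXT\tLEU\tB\t380.0\t53.326\t48.908\t20.774\t1.00\t66.40\tO\n"
pdb_line = tsv_to_pdb_line(row)
print(pdb_line)
```

The widths in the format string are minimums, so a value that is too wide for its field (a six-digit serial number, say) would push the rest of the line out of alignment. This sketch does not check for that.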

Yeah, this is basically what I did, using the format that Mensur had linked.
