Question: How to edit .PDB files fast?
3.2 years ago by
ReWeeda50
University of Bologna
ReWeeda50 wrote:

Good morning,

I'm a bioinformatics student at the University of Bologna.

I'm working on a personal project and using MUSTANG to perform multiple structural alignment. The algorithm requires .pdb files containing just one chain, so I'm looking for tools or suggestions for editing all the files quickly and reliably.

So far I have written a few lines of Python, but I don't think they are specific enough to edit all the files correctly; in fact, I often have to check all the structures again manually.

Could someone help me?

Thanks in advance! Dade.

editing pdb • 2.5k views
3.2 years ago by
gearoid200
gearoid200 wrote:
"""
Extract a single chain from a PDB file
"""

from __future__ import print_function
import argparse
import sys

import Bio.PDB


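class ChainSelect(Bio.PDB.Select):
    """Select filter that keeps only the chain with the given ID."""

    def __init__(self, target_chain):
        self.target_chain = target_chain

    def accept_chain(self, chain):
        # PDBIO writes a chain only if this returns a true value.
        return chain.get_id() == self.target_chain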

def main():
    argparser = argparse.ArgumentParser(description="Extract chain from a PDB file")
    argparser.add_argument('infile', help="Path to input file (PDB)")
    argparser.add_argument('chain', help="Chain to extract")
    argparser.add_argument('outfile', help="Path to output file (PDB)")
    args = argparser.parse_args()

    pdbparser = Bio.PDB.PDBParser()
    io = Bio.PDB.PDBIO()
    with open(args.infile, 'r') as infile:
        struct = pdbparser.get_structure(args.infile, infile)
        io.set_structure(struct)
    with open(args.outfile, 'w') as outfile:
        io.save(outfile, ChainSelect(args.chain))
    return 0

if __name__=="__main__":
    sys.exit(main())

Run it with something like:

python extract_chain.py 1XXX.pdb A 1XXXA.pdb

to extract just chain A from 1XXX.pdb.


Using UNIX commands

grep '^ATOM' 1XXX.pdb | awk '$5=="A"' > 1XXXA.pdb
written 3.2 years ago by venu6.2k

As far as I know, the PDB file format defines fields by fixed column positions rather than by whitespace, so adjacent fields in an ATOM record aren't guaranteed to be separated by whitespace. This command will therefore appear to work while silently failing to extract some atoms from the chain you want.

You can see in the example here that some of the atoms have the chain in $4 and some in $5: http://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM

written 3.2 years ago by gearoid200
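
For anyone who wants to stay close to the one-liner without Biopython, here is a minimal sketch of a column-based filter. It assumes the standard fixed-width layout, where the record name sits in columns 1-6 and the chain ID in column 22. The function names and the keep_hetatm option are made up for illustration, and this is not a full PDB parser (it ignores TER, MODEL and other records).

```python
def extract_chain_lines(lines, chain_id, keep_hetatm=False):
    """Yield coordinate records whose chain ID (column 22) equals chain_id."""
    records = ("ATOM", "HETATM") if keep_hetatm else ("ATOM",)
    for line in lines:
        # Record name is columns 1-6; chain ID is column 22 (index 21).
        # Slicing by position works even when neighbouring fields touch.
        if line[:6].strip() in records and len(line) > 21 and line[21] == chain_id:
            yield line


def extract_chain_file(in_path, chain_id, out_path):
    """Copy the matching records from in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in extract_chain_lines(src, chain_id):
            dst.write(line)
```

Something like extract_chain_file("1XXX.pdb", "A", "1XXXA.pdb") would then write only the ATOM records of chain A. Because it slices by position, a record where the atom name, alternate location and residue name run together is still matched on the right column.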

You're right.

Nevertheless, I've adopted another solution that merges different ideas I found on the web. Now I just have to write a shell script to run my .py over all the structures in one go.

written 3.2 years ago by ReWeeda50
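
The batch step can also be done from Python instead of a shell script. This is only a sketch under some assumptions: the per-file script is the extract_chain.py shown in the answer above, every structure needs the same chain, and the inputs all end in .pdb. The function names build_commands and run_all are invented for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdb_dir, chain, out_dir, script="extract_chain.py"):
    """Return one command (argv list) per .pdb file found in pdb_dir."""
    commands = []
    for pdb_path in sorted(Path(pdb_dir).glob("*.pdb")):
        # Name the output like the example in the answer: 1XXX.pdb -> 1XXXA.pdb
        out_path = Path(out_dir) / f"{pdb_path.stem}{chain}.pdb"
        commands.append(
            [sys.executable, script, str(pdb_path), chain, str(out_path)]
        )
    return commands


def run_all(pdb_dir, chain, out_dir):
    """Run the extraction script once per input file."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(pdb_dir, chain, out_dir):
        # check=True raises on the first failing file rather than
        # carrying on silently.
        subprocess.run(cmd, check=True)
```

Calling run_all("structures", "A", "single_chain") would process every .pdb file in the structures directory. Keeping the output directory separate from the input one avoids picking up already-extracted files on a second run. If different structures need different chains, the chain argument would have to come from a per-file table instead.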
3.2 years ago by
ReWeeda50
University of Bologna
ReWeeda50 wrote:

It works perfectly! Thanks!
