Question: Syntax For Neighborsearch Module In Biopython
gravatar for satsurae
7.5 years ago by
satsurae120 wrote:


I'm having trouble with using this module, more specifically the 'search' function. How do i get the atom list from a pdb file (is this using PDBParser)? How do I use it to find the nearest residues from the co-crystallized ligand?


pdb biopython parsing • 3.5k views
ADD COMMENTlink modified 18 days ago by jma.bullock0 • written 7.5 years ago by satsurae120

Can you show us the code you already have got?

ADD REPLYlink written 7.5 years ago by Michael Schubert6.7k
gravatar for Michael Schubert
7.5 years ago by
Cambridge, UK
Michael Schubert6.7k wrote:

The basic syntax of NeighborSearch is the following. If you tell me what you want to do exactly I can give you a more precise hint. Also, see the Bio.PDB FAQ (pdf).

The script basically loads the local pdb structure file molecule.pdb, creates a list of all atoms, gets the coordinates of the first atom, and prints a list of all resides within 5 angstrom of it (source).

from Bio.PDB import *
import sys
structure = PDBParser().get_structure('X', 'molecule.pdb') # load your molecule
atom_list = Selection.unfold_entities(structure, 'A') # A for atoms
ns = NeighborSearch(atom_list)
center = atom_list[0].get_coord()
neighbors =, 5.0) # 5.0 for distance in angstrom
residue_list = Selection.unfold_entities(neighbor_list, 'R') # R for residues
print residue_list

If you are interested in the neighbors of just a few atoms/residues, you might want to consider PyMOL since it has a nice set of GUI features that are more easily accessible.

ADD COMMENTlink modified 7.5 years ago • written 7.5 years ago by Michael Schubert6.7k

Thanks very much for the reply. Basically, what I want to do is find the nearest neighbors to the ligand bound in a protein. I need to do this for quite a few (over 200 pdbs) hence i wanted to use a script. So in this test case, the ligand has a resseq=600 and the chain starts at resseq=307. Does this mean in center = atom_list[0].get_coord() the 0 would be 293? Thanks again

ADD REPLYlink written 7.5 years ago by satsurae120

No, because in one case you have a list of atoms and in the other residue IDs. In the above example, you could get all residues (instead of atoms) first, search for your ligand there, and then get the corresponding ligand atoms and perform NeighborSearch.

ADD REPLYlink written 7.5 years ago by Michael Schubert6.7k

Hi I have large number of hetero-dimeric proteins. I need to check for all atoms of chain A for which atoms of chain B are within 10Å and need to obtain the list of residue numbers of those atoms as output. I am new to programming and I do not know how to write the code for this using the NeighborSearch module. Would you be able to help me out?

ADD REPLYlink modified 22 months ago • written 22 months ago by ramachandranakash19940
gravatar for jma.bullock
18 days ago by
jma.bullock0 wrote:

here is a script that calculates the fnat using modeller - and it is a bit more straightforward to tease apart the ligand and receptor (it should all work automatically, no messy renumbering stages). it is however quite slow, but maybe you will find it useful ... !

from modeller import *
from modeller.scripts import complete_pdb
import sys,os
import math
import numpy as np

def calculate_fnat(model,cut_off):
    fnat = {}
    for a in model.chains:
        for b in model.chains:
            if a != b:
                sel_a = selection(a)
                sel_b = selection(b)
                for ca_a in sel_a:
                    ax = ca_a.x
                    ay = ca_a.y
                    az = ca_a.z
                    for ca_b in sel_b:
                        bx = ca_b.x
                        by = ca_b.y
                        bz = ca_b.z

                        dist = math.sqrt(((ax-bx)**2)+((ay-by)**2)+((az-bz)**2))

                        if dist <= cut_off:
                            a_res  =  ca_a.residue
                            b_res  =  ca_b.residue
                            if (int(b_res._num)+1,,int(a_res._num)+1, not in fnat:
                                if (int(a_res._num)+1,,int(b_res._num)+1, not in fnat:
                                    fnat[int(a_res._num)+1,,int(b_res._num)+1,] = dist 
    return fnat

# initialise the environment .
env = environ()'${LIB}/top_heav.lib')'${LIB}/par.lib')

#calculate native contacts
m1 = complete_pdb(env, 'native.pdb')
rmsd_sel = selection(m1).only_atom_types('CA')

fnat = calculate_fnat(m1,5)

filenames = []

for pdb_file in [l for l in os.listdir("./") if l.endswith(".pdb")]:
    print pdb_file

results = []    
# calculate model contacts                      
for fname in filenames:
    m2 = complete_pdb(env, fname)

    fnat_mod = calculate_fnat(m2,5)   

    count = float(0)

    #compare native contacts to model contacts
    for f in fnat:
        if f in fnat_mod:
            count += 1

    score = count/len(fnat)


#output results
f1 = open('data_4.txt','w')

for line in results:
    f1.write('%s\t%s\n' % (line[0],line[1]))

ADD COMMENTlink written 18 days ago by jma.bullock0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 616 users visited in the last hour