Question: Mapping amino acid indices between a PDB file and a sequence
5 months ago, JJP wrote:

Hi All,

I am a beginner in Biopython. What I am trying to do is the following:

I have a sequence of amino acids (including gaps) and a corresponding PDB file. The numbering of amino acids in the PDB file does not match the numbering of the amino acids in the sequence list. For each amino acid entry in the PDB file, I want to find the index of the corresponding residue in the sequence. For example, if the first entry in the PDB file is alanine, I want to find the index of that alanine in the sequence list. For gaps (-), I want to set the index to zero.
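
The gap convention described here can be sketched as follows. This is only an illustration under stated assumptions: the index is taken to be a 1-based running count over non-gap characters, and `gapped_indices` is a hypothetical helper name, not a Biopython function.

```python
def gapped_indices(gapped_seq, gap='-'):
    """Return one index per column: 1-based for residues, 0 for gaps.

    Assumption: residues are numbered by their position in the
    ungapped sequence, starting at 1.
    """
    indices = []
    count = 0
    for char in gapped_seq:
        if char == gap:
            # Gap columns get index 0, as requested in the question.
            indices.append(0)
        else:
            count += 1
            indices.append(count)
    return indices
```

Calling `gapped_indices('A-CD--E')` gives one entry per column of the gapped sequence, so the result can be zipped directly against the sequence string.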

Here is the sequence list I have:


Here is what I have tried so far:

import argparse
import re
import sys

import numpy as np


def parseArgs():
    """Parse command line arguments"""
    try:
        parser = argparse.ArgumentParser(
            description='Read and extract items from input PDB file')
        parser.add_argument('-i', '--input', required=True,
                            help='input PDB file in standard format')
    except Exception:
        print("An exception occurred with argument parsing. Check your provided options.")
        sys.exit(1)

    return parser.parse_args()


# Reads a PDB file and returns the residue name and coordinates for
# each C-alpha atom
# (the input argument for this routine is the PDB file name.)
def get_coordinates_PDB(File_In):
    try:
        fl = open(File_In, 'r')
    except IOError:
        print('Could not open input file {0}'.format(File_In))
        sys.exit(1)

    Res = []
    Points = []

    # Getting from a PDB file
    with fl:
        for line in fl:
            # Skip everything that is not a C-alpha ATOM record
            if not line.startswith('ATOM'):
                continue
            elif line[13:15] != 'CA':
                continue
            resname = line[17:20]
            # The first three decimal numbers on an ATOM line are x, y, z
            xyz = re.findall(r'[-+]?\d+\.\d+', line)
            tmp = np.zeros(3)
            tmp[0] = float(xyz[0])
            tmp[1] = float(xyz[1])
            tmp[2] = float(xyz[2])
            Res.append(resname)
            Points.append(tmp)
    return Points, Res


def main():
    """Read and parse a provided PDB file."""
    # Parse arguments
    args = parseArgs()
    File_In = args.input

    # Print the residue name and coordinates of each C-alpha atom
    Points, Res = get_coordinates_PDB(File_In)
    for resname, xyz in zip(Res, Points):
        print(resname, xyz)


if __name__ == '__main__':
    main()

This outputs the x,y,z coordinates and the amino acids in the PDB file. However, I am stalled at this point.

I would much appreciate it if someone could help me with implementing the rest. Thank you in advance for your time and help!

Tags: sequence, python, pdb

There was a post several weeks ago. It may be useful to you.

Using STDIN with BioPython's PDB methods
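
For the mapping step itself, here is a minimal sketch using only the standard library. It rests on several assumptions: the PDB residues are identical to the sequence residues except that some may be missing, chains are not distinguished, and `ca_residues` and `map_pdb_to_sequence` are hypothetical helper names. `difflib.SequenceMatcher` is a generic string matcher, not a biological alignment, so a proper pairwise alignment would be more robust for real data.

```python
from difflib import SequenceMatcher

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}


def ca_residues(pdb_lines):
    """Return (residue number, one-letter code) per C-alpha ATOM record."""
    residues = []
    for line in pdb_lines:
        # Atom name sits in columns 13-16, residue number in columns 23-26.
        if line.startswith('ATOM') and line[12:16].strip() == 'CA':
            resname = line[17:20]
            resseq = int(line[22:26])
            # Non-standard residues are recorded as 'X'.
            residues.append((resseq, THREE_TO_ONE.get(resname, 'X')))
    return residues


def map_pdb_to_sequence(residues, gapped_seq):
    """Give the PDB residue number for each column of gapped_seq.

    Gap columns and residues with no match in the PDB file get 0.
    """
    # Columns of the gapped sequence that hold a residue.
    columns = [i for i, c in enumerate(gapped_seq) if c != '-']
    ungapped = ''.join(gapped_seq[i] for i in columns)
    pdb_seq = ''.join(code for _, code in residues)

    mapping = [0] * len(gapped_seq)
    matcher = SequenceMatcher(None, pdb_seq, ungapped, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[columns[block.b + k]] = residues[block.a + k][0]
    return mapping
```

`ca_residues` accepts any iterable of lines, so an open file handle works. The result of `map_pdb_to_sequence` has one entry per column of the gapped sequence, which makes it easy to check by eye against the alignment.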

written 5 months ago by natasha.sernova