I would like to add the consensus numbering, as determined by Andrew Smith's Abnum utility, to a Biopython Seq object. After browsing the Biopython Tutorial and Cookbook and Bio module source for a while, and several hours of Google searching over a few days, I haven't been able to find a clear answer on whether this is possible.
The end goal is to create a Django web app to compare various antibody sequences with their germline counterparts and, eventually, structural information. For this, it will be helpful to have not only the index of the residue in the sequence, but the Kabat (or Chothia, etc.) position as well.
For example, the Abnum output for an antibody light chain with the amino acid sequence SYVLTQPPSVS... looks like this:
L1 S
L2 Y
L3 V
L4 L
L5 T
L6 Q
L7 P
L8 P
L9 S
L10 -
L11 V
L12 S
...
The trouble lies in the gaps (e.g. L10 -) and insertions (e.g. H99, H100, H100A ... H100G, H101). To properly refer to a residue (particularly when using the structural info), I need to be able to use its consensus number.
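The gap handling described above can be sketched in plain Python. This is only an illustration, not part of Abnum or Biopython: the helper names parse_abnum and index_by_position are hypothetical, and it assumes each Abnum line is a position label and a one-letter residue separated by whitespace, with "-" marking a gap.

```python
def parse_abnum(lines):
    """Split Abnum-style lines such as "L1 S" into (position, residue) pairs."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or truncated lines
        pairs.append((fields[0], fields[1]))
    return pairs


def index_by_position(pairs):
    """Map each occupied consensus position to its 0-based sequence index."""
    lookup = {}
    i = 0
    for position, residue in pairs:
        if residue == "-":
            continue  # a gap has a consensus number but no residue
        lookup[position] = i
        i += 1
    return lookup


# The light-chain example from this post
abnum_output = [
    "L1 S", "L2 Y", "L3 V", "L4 L", "L5 T", "L6 Q",
    "L7 P", "L8 P", "L9 S", "L10 -", "L11 V", "L12 S",
]

pairs = parse_abnum(abnum_output)
sequence = "".join(residue for _, residue in pairs if residue != "-")
lookup = index_by_position(pairs)

print(sequence)                  # SYVLTQPPSVS
print(lookup["L11"])             # 9
print(sequence[lookup["L11"]])   # V
```

Because the position labels are kept as strings, insertion codes such as H100A need no special treatment here; they are just additional dictionary keys.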
Ideally, this would be an attribute added to the individual residue of the sequence:
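from Bio.Seq import Seq
# Note: Bio.Alphabet was removed in Biopython 1.78; on newer
# versions, drop this import and the alphabet argument below.
from Bio.Alphabet.IUPAC import IUPACProtein

# create the sequence object
s = Seq('SYVLTQPPSVS...', IUPACProtein)

# after reading the Abnum output into an array,
# loop through both the Abnum array and the sequence,
# perhaps something like this, for starters
i = 0
for line in abnum_output:
    arr = line.split()
    if arr[1] != '-':
        # desired behaviour only: s[i] is a plain str,
        # so this assignment does not actually work
        s[i].kabat_number = arr[0]
        # advance the sequence index only for real residues, not gaps
        i += 1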
What I haven't been able to find is a way to add such an attribute to a single residue, rather than to the entire chain. I'm hoping someone out there might have a solution. Any help will be greatly appreciated! Thanks.
Thanks for the tip, @Peter -- I think that's just what I was after. Much appreciated!
One quick edit: it seems your con_numbers list should have "L11" and "L12" as the last numbers for this particular example... the letter_annotations are assigned to the gap positions as well.
I was reproducing your example - I take it you wanted something a little different? Well anyway, the same basic idea should work.