Biopython DSSP "unexpected sidechain number" Warning
3 months ago
sajid ▴ 20

I am trying to retrieve residue-level secondary structure and solvent accessibility information from a set of PDB files using the DSSP wrapper in Biopython. However, I am getting an odd warning that some residues have more sidechain atoms than expected. A minimal working example that reproduces the warning for PDB ID 5PTI is as follows:

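# Explicit imports for the names used in the snippets in this thread
from Bio.PDB import DSSP, PDBIO, PDBParser, Selection

parser = PDBParser(QUIET=True)

structure = parser.get_structure(id='struct', file="5pti.pdb")
model = structure[0]
# Run DSSP on the first model; the warning appears at this call
dssp = DSSP(model, "5pti.pdb")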


For this PDB ID, the first warning is: "Residue ARG 1 A has 11 instead of expected 7 sidechain atoms". I checked the number of atoms in every arginine residue in this file using the following code:

structure = parser.get_structure(id='struct', file="5pti.pdb")
model = structure[0]
residues = Selection.unfold_entities(model, 'R')
count = 0

for residue in residues:
    if residue.get_resname() == "ARG":
        atoms = Selection.unfold_entities(residue, 'A')
        # Atom count, last atom, and residue sequence number
        print(len(atoms), atoms[-1], residue.get_id()[1])
        count = count + 1

# Total number of arginine residues
print(count)
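
To see what the raw atom count is made of, it may help to split it into backbone, heavy sidechain, and hydrogen atoms. Arginine has 7 heavy sidechain atoms (CB, CG, CD, NE, CZ, NH1, NH2), which matches the "expected 7" in the warning. The sketch below is only an illustration: `classify_atoms` is a hypothetical helper, it takes plain (atom name, element) pairs rather than Biopython objects, and treating "D" (deuterium) as hydrogen is an assumption.

```python
from collections import Counter

# Standard protein backbone atom names (OXT is the C-terminal oxygen)
BACKBONE = {"N", "CA", "C", "O", "OXT"}
# Assumption: deuterium ("D") is counted together with hydrogen
HYDROGENS = {"H", "D"}


def classify_atoms(atoms):
    """Count backbone, heavy sidechain, and hydrogen atoms.

    `atoms` is an iterable of (atom_name, element) pairs.
    """
    counts = Counter()
    for name, element in atoms:
        if element.strip().upper() in HYDROGENS:
            counts["hydrogen"] += 1
        elif name.strip().upper() in BACKBONE:
            counts["backbone"] += 1
        else:
            counts["sidechain_heavy"] += 1
    return counts
```

With a Biopython residue, the pairs could be built as `classify_atoms((atom.get_name(), atom.element) for atom in residue)`.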


Can someone please point me to the reason for this warning? Can I safely ignore it for my purpose? Thanks.

I don't get that warning with Biopython 1.76.

Thanks for the pointer. My Biopython version is 1.79. Since you are not getting this warning, I think I can proceed and ignore it. Does that sound sensible to you?

I don't think this will affect secondary structure assignments, but it could affect the calculation of solvent accessibility. I really don't know, because I have always used DSSP directly rather than via Biopython.

It may be a good idea to remove the hydrogens because they are not needed by DSSP. That could also take care of the warning. There is a program called reduce which you can find here. It is meant for adding hydrogens, but it can do the opposite.

reduce -Trim 5pti.pdb > 5pti_noH.pdb
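
If reduce is not available, a plain-text filter can do something similar. This is a rough sketch, not what reduce does internally: it assumes the element symbol is filled in at columns 77-78 of each ATOM/HETATM/ANISOU record, it treats "D" (deuterium) as hydrogen, and it does not renumber atom serials or touch CONECT records. The function names are made up for illustration.

```python
# Assumption: deuterium ("D") should be removed along with hydrogen
HYDROGEN_ELEMENTS = {"H", "D"}
# Coordinate-bearing record types that carry an element column
ATOM_RECORDS = ("ATOM", "HETATM", "ANISOU")


def is_hydrogen_record(line):
    """Return True for an atom record whose element column is H or D."""
    if not line.startswith(ATOM_RECORDS):
        return False
    # Columns 77-78 (1-based) hold the right-justified element symbol
    element = line[76:78].strip().upper()
    return element in HYDROGEN_ELEMENTS


def strip_hydrogens(lines):
    """Return the lines with hydrogen atom records dropped."""
    return [line for line in lines if not is_hydrogen_record(line)]


def strip_hydrogens_file(src, dst):
    """Copy a PDB file from src to dst without hydrogen atom records."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(strip_hydrogens(fin))
```

For example, `strip_hydrogens_file("5pti.pdb", "5pti_noH.pdb")` would write a hydrogen-free copy, provided the input has its element columns populated.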

Thanks to your suggestion of removing the hydrogens, the warning is now gone. I used Biopython to remove the hydrogens, and I am posting the script here in case it helps someone in the future.

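# Remove hydrogen atoms from every residue of the parsed structure
for model in structure:
    for chain in model:
        for residue in chain:
            # Collect ids first to avoid modifying the residue while iterating over it
            atoms_to_remove = []
            for atom in residue:
                if atom.element == "H":
                    atoms_to_remove.append(atom.get_id())
            for atom_id in atoms_to_remove:
                residue.detach_child(atom_id)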

io = PDBIO()
io.set_structure(structure)
io.save("X_no_hydrogen.pdb")

Thanks a lot. I will try to remove hydrogens and will let you know what happens after that. I am extremely grateful for your help.