I'm trying to calculate solvent accessible surface area (SASA) using Biopython.
However, the values from Biopython do not make sense when I inspect the crystal model visually in PyMOL. Furthermore, the SASA values I calculate with PyMOL agree with what I see in the model, but they do not agree with the Biopython values.
Why is Biopython producing seemingly wrong SASA values?
Here are the values given by Biopython (~180 is max):
[('GLY', 3.4), ('ILE', 0.0), ('VAL', 0.0), ('GLU', 4.25), ('GLN', 0.85), ('CYS', 0.0), ('CYS', 0.0), ('THR', 45.57), ('SER', 55.2), ('ILE', 42.79)]
You can see that Gly1 has a tiny value (3.4). However, looking at the model visually (see picture, surface in blue), Gly1 has a lot of exposed surface.
Also, the value calculated with PyMOL for Gly1 is 67.44, which indicates substantial exposure. Why is the Biopython result so different and seemingly wrong?
Biopython code:
from Bio.PDB.MMCIFParser import MMCIFParser
from Bio.PDB.SASA import ShrakeRupley

parser = MMCIFParser(QUIET=True)
structure = parser.get_structure("HELLO", "insulin.cif")

# Compute SASA for the first model, summed per residue (level="R")
sr = ShrakeRupley()
sr.compute(structure[0], level="R")

# Collect (residue name, SASA) pairs for every residue in every chain
my_list = []
for chain in structure[0]:
    for res in chain:
        my_list.append((res.get_resname(), round(res.sasa, 2)))
print(my_list[0:10])
Biopython can use DSSP to calculate SASA - see here. It outputs relative accessibility, so you would also have to read in the residue type and multiply by the numbers I posted above. The relative values (in percent) for the same 10 residues as above are in the ballpark.
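As a rough sketch of the DSSP route described above (not tested against this particular file): it assumes Biopython is installed and a DSSP executable is available on the system. The function names `relative_to_absolute` and `dssp_relative_accessibility` are hypothetical helpers made up for illustration. Biopython's DSSP wrapper reports relative accessibility as a fraction rather than a percentage, so multiply by 100 if you want percent. The maximum accessibility for each residue type has to come from a reference table such as the one mentioned above; the value used in the demo line is an assumption.

```python
def relative_to_absolute(rel_acc, max_acc):
    """Convert a relative accessibility (fraction, 0 to 1) to an absolute area.

    Returns None when DSSP reports no value ("NA") for the residue.
    """
    if rel_acc == "NA":
        return None
    return round(rel_acc * max_acc, 2)


def dssp_relative_accessibility(cif_path):
    """Return [(one_letter_code, relative_accessibility), ...] for model 0.

    Assumes Biopython is installed and a DSSP executable is available.
    """
    # Imported here so the conversion helper above works without Biopython.
    from Bio.PDB.DSSP import DSSP
    from Bio.PDB.MMCIFParser import MMCIFParser

    structure = MMCIFParser(QUIET=True).get_structure("HELLO", cif_path)
    model = structure[0]
    dssp = DSSP(model, cif_path)  # runs the external DSSP program on the file
    # Each DSSP record is a tuple: index 1 is the amino acid,
    # index 3 is the relative accessibility.
    return [(dssp[key][1], dssp[key][3]) for key in dssp.keys()]


if __name__ == "__main__":
    # Illustrative numbers only: 0.5 is a made-up relative value and 84.0 is
    # an assumed maximum accessibility for glycine, not a result from insulin.cif.
    print(relative_to_absolute(0.5, 84.0))
```

To get the first 10 residues as in the question, something like `dssp_relative_accessibility("insulin.cif")[:10]` should work, provided DSSP can read the file.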
Hi Menur Dlakic,
Wow, thank you so much for your detailed help. Given that DSSP via Biopython seems like a good bet, I'll proceed with that.
Thanks again