I hacked up a simple algorithm to estimate the volume of small molecules, and for such a simple approach it does quite well. I was wondering how well this method would work for proteins. Is protein volume used much in bioinformatics? If so, what existing methods are available? I'm hoping they also come with a data set, so that I can check whether this simple approach applies to proteins too.
In small-angle scattering this is commonly done by summing published amino acid volumes, which seems good enough for most purposes. I don't have references to hand, but could probably provide something early next week.
Another option is to use one of the several amino acid indices from AAINDEX to derive the volume, but as mentioned by Rajarshi, this may not be appropriate because of the secondary and tertiary conformations of protein structures.
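To make the per-residue idea concrete, here is a minimal sketch that sums a volume per amino acid over a sequence. The function name `sequence_volume` and the table `toy_volumes` are made up for illustration, and the numbers in the table are placeholders, not published values; in practice you would substitute a published volume table or an AAINDEX entry. As noted above, a plain sum ignores conformation and packing.

```python
def sequence_volume(sequence, residue_volumes):
    """Sum per-residue volumes over a one-letter amino acid sequence.

    residue_volumes maps one-letter codes to a volume (e.g. in cubic
    angstroms). The table is supplied by the caller; nothing here
    accounts for secondary or tertiary structure.
    """
    missing = sorted(set(sequence) - set(residue_volumes))
    if missing:
        raise KeyError("no volume for residue(s): " + ", ".join(missing))
    return sum(residue_volumes[aa] for aa in sequence)


# PLACEHOLDER numbers for illustration only, NOT published volumes.
toy_volumes = {"G": 60.0, "A": 90.0, "S": 90.0}

print(sequence_volume("GASGA", toy_volumes))  # 390.0 with the toy table
```

Passing the table in as an argument makes it easy to compare several published scales against the same validation set.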
I am not a true expert here, but I would caution that the term 'volume' for a protein molecule can be ambiguous. As you certainly know, protein structures may contain internal cavities and deep crevices, which might or might not be accessible to a reporter (and might thus be considered as 'inside' or 'outside' the volume). I am more familiar with the related issue of protein surface calculation, where most programs offer different algorithms for the different concepts of what a surface is (van der Waals, solvent accessible, etc). If the surface is ambiguous, so is the volume.
Which idea of a volume is the appropriate one depends on what you plan to do with the volume (density calculation? displacement of solvent? hydrodynamic properties? dense packing?)
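A rough sketch of how much the definition matters: the function below counts grid cells that fall inside a union of atomic spheres, optionally inflating every radius by a probe radius. With `probe=0.0` it approximates the van der Waals volume; with a water-sized probe (1.4 Å is the usual choice) it approximates the volume enclosed by the solvent-accessible surface. The function name `union_volume`, the two-atom example and its radii are assumptions made for illustration, and grid counting is a simple stand-in, not what any particular program does.

```python
import math


def union_volume(atoms, probe=0.0, spacing=0.1):
    """Estimate the volume of a union of spheres by counting grid cells.

    atoms   -- list of (x, y, z, radius) tuples, in angstroms
    probe   -- radius added to every atom (0.0 for van der Waals volume)
    spacing -- grid spacing; smaller is more accurate but slower
    """
    spheres = [(x, y, z, r + probe) for x, y, z, r in atoms]
    # Bounding box of all (inflated) spheres.
    lo = [min(s[i] - s[3] for s in spheres) for i in range(3)]
    hi = [max(s[i] + s[3] for s in spheres) for i in range(3)]
    n = [int(math.ceil((hi[i] - lo[i]) / spacing)) for i in range(3)]

    inside = 0
    for ix in range(n[0]):
        px = lo[0] + (ix + 0.5) * spacing  # cell centre
        for iy in range(n[1]):
            py = lo[1] + (iy + 0.5) * spacing
            for iz in range(n[2]):
                pz = lo[2] + (iz + 0.5) * spacing
                for x, y, z, r in spheres:
                    if (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 <= r * r:
                        inside += 1  # count each cell once
                        break
    return inside * spacing ** 3


# Hypothetical two-atom "molecule": radii and spacing chosen for illustration.
toy_atoms = [(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)]
print("van der Waals volume:     ", union_volume(toy_atoms))
print("solvent-accessible volume:", union_volume(toy_atoms, probe=1.4))
```

Even for two atoms the two numbers differ several-fold. Note that this count treats any empty space between atoms as outside the molecule, so a buried cavity is excluded from the van der Waals figure but can be swallowed entirely once the probe radius is added.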
For your method to scale up to big proteins, you need to take into account things like internal cavities, which most algorithms neglect. The best approach is Unionball (http://www.ncbi.nlm.nih.gov/pubmed/21823134)