Question

Calculate MW in kDa From Protein Sequence

0

11.9 years ago

kajendiran56 ▴ 120

Thank you for taking the time to look at my post. I was wondering if anyone could help me calculate the molecular weight of a protein from its sequence, ideally manually. I heard that for a rough estimate the following can be used:

No. of amino acids * 110 (i.e. the average mol. wt. of an amino acid) - No. of amino acids * 18 (i.e. the mol. wt. of water)

I am also aware that there are numerous web servers available to do this; however, I have many hundreds of thousands of sequences and I want to incorporate this into my program, which performs other calculations.

If you cannot help with this, could you suggest an accurate standalone program that I can run from the command line to do this?

Many thanks for your time.

protein sequence molecular • 16k views

Updated 11.9 years ago by Gareth Morgan ▴ 310 • written 11.9 years ago by kajendiran56 ▴ 120

1

This problem should be simple enough to solve yourself, given that you know some very basic programming. How about counting each residue type and multiplying each count by that residue's weight? This will be more accurate than the formula you mention and will correct for residue bias.

Reply • 11.9 years ago by Michael Schubert ★ 7.1k

0

Thank you for the reply. I had considered this, but it seems very simplistic. There appear to be some complicated algorithms out there that take all sorts of factors and data into account when calculating this, and I was having trouble understanding which aspects one should consider to get the most accurate result. I may be overcomplicating this, however. I will see what results I get once my program has given me the data based on the method you suggested. Thank you for your time.

Reply • 11.9 years ago by kajendiran56 ▴ 120

score 1 · Answer 1 · 2012-06-13

1

11.9 years ago

Khader Shameer 18k

You can use pepstats from the EMBOSS package for this task. To automate it, you can use a steering file that passes command-line options to EMBOSS. Please refer to this discussion for more details.

score 1 · Answer 2 · 2012-06-13

It depends on how accurate you want to be: whether you want to include post-translational modifications, pH-labile hydrogens or disulphide bonds. But for a high-throughput method you can just add up the residue weights (110 Da is the average weight of a residue, not of a free amino acid, iirc), then add an extra 18 Da for the free termini. There's a utility in BioPython that will do this easily:

http://biopython.org/DIST/docs/api/Bio.SeqUtils.ProtParam-module.html

There are equivalents in other languages too.
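
If you would rather not depend on a library, the "add up the residue weights, then add 18 Da" approach can be sketched in a few lines of plain Python. This is only a minimal sketch under some assumptions: the mass table holds the commonly tabulated average residue masses (they are not taken from this thread), the function names are made up for illustration, and it ignores post-translational modifications, disulphide bonds and non-standard residues.

```python
from collections import Counter

# Average masses (Da) of the 20 standard amino acid residues, i.e. each
# amino acid minus one water. Assumed values from commonly used tables.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water for the free N- and C-termini


def molecular_weight_da(sequence):
    """Return the average molecular weight of an unmodified peptide in Da."""
    counts = Counter(sequence.strip().upper())
    if not counts:
        raise ValueError("empty sequence")
    unknown = set(counts) - set(AVG_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    # Count each residue type, multiply by its mass, then add one water.
    residue_total = sum(AVG_RESIDUE_MASS[aa] * n for aa, n in counts.items())
    return residue_total + WATER_MASS


def molecular_weight_kda(sequence):
    """Same as molecular_weight_da, converted to kDa."""
    return molecular_weight_da(sequence) / 1000.0


if __name__ == "__main__":
    # Arbitrary example sequence, for illustration only.
    print(round(molecular_weight_kda("MKTAYIAKQRQISFVKSHFSRQ"), 3))
```

Raising an error on ambiguity codes such as X or B is a deliberate choice here; with hundreds of thousands of sequences you may prefer to log and skip those records instead.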