Calculate MW in kDa from protein sequence
9.4 years ago
kajendiran56 ▴ 120

Thank you for taking the time to look at my post. I was wondering whether anyone could help me calculate the molecular weight of a protein from its sequence, ideally manually. I have heard that the following can be used for a rough estimate:

(No. of amino acids * 110) - (No. of amino acids * 18), where 110 is the average mol. wt. of an amino acid and 18 is the mol. wt. of water

I am also aware that numerous web servers can do this; however, I have many hundreds of thousands of sequences and I want to incorporate the calculation into my program, which performs other calculations.

If you cannot help with this, could you suggest an accurate standalone program that I can call from the command line to do this?

Many thanks for your time.

protein sequence molecular • 14k views

This problem should be simple enough to solve yourself, given some very basic programming knowledge. How about counting the residues of each type and multiplying each count by that residue's weight? This will be more accurate than the formula you mention and will correct for residue bias.
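A minimal Python sketch of the counting approach suggested here. The function names are hypothetical, and the table holds the commonly tabulated average residue masses (each amino acid minus one water). One water is added back for the free N- and C-termini. No modifications or disulphide bonds are considered.

```python
from collections import Counter

# Average masses (Da) of amino acid residues, i.e. amino acid minus one water.
RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_DA = 18.0153  # added once for the free N- and C-termini


def molecular_weight_da(sequence: str) -> float:
    """Sum residue masses weighted by their counts, plus one water."""
    counts = Counter(sequence.upper())
    unknown = set(counts) - set(RESIDUE_MASS_DA)
    if unknown:
        # Ambiguity codes such as X or B have no single mass, so refuse them.
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS_DA[aa] * n for aa, n in counts.items()) + WATER_DA


def molecular_weight_kda(sequence: str) -> float:
    """Same value expressed in kDa."""
    return molecular_weight_da(sequence) / 1000.0


def rough_weight_da(sequence: str) -> float:
    """Rule-of-thumb estimate: about 110 Da per residue."""
    return 110.0 * len(sequence)
```

Comparing `molecular_weight_da` with `rough_weight_da` on a few of your own sequences shows how much residue bias shifts the result.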


Thank you for the reply. I had considered this, but it seems very simplistic. There appear to be some complicated algorithms out there that take all sorts of factors and data into account when calculating this, and I was having trouble understanding which aspects one should consider to get the most accurate result. I may be overcomplicating this, however. I will see what results I get once my program has given me the data based on the method you suggested. Thank you for your time.

9.4 years ago

You can use pepstats from the EMBOSS package for this task. To automate it, you can use a steering file that passes command-line options to EMBOSS. Please refer to this discussion for more details.
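As a hedged sketch of calling pepstats from your own program (this invokes it directly rather than through a steering file), the Python below builds the command, runs it, and pulls the weights out of the report. The helper names are hypothetical. It assumes EMBOSS is installed and on the PATH, and that the report contains lines of the form "Molecular weight = ...", so check this against the output of your EMBOSS version.

```python
import re
import subprocess


def pepstats_command(fasta_path: str, out_path: str) -> list:
    """Build the pepstats call; -auto turns off interactive prompts."""
    return ["pepstats", "-sequence", str(fasta_path),
            "-outfile", str(out_path), "-auto"]


def run_pepstats(fasta_path: str, out_path: str) -> None:
    """Run pepstats on a (multi-)FASTA file; raises if the call fails."""
    subprocess.run(pepstats_command(fasta_path, out_path), check=True)


# Assumed report format: one "Molecular weight = <number>" line per sequence.
MW_LINE = re.compile(r"Molecular weight\s*=\s*([0-9]+(?:\.[0-9]+)?)")


def parse_molecular_weights(report_text: str) -> list:
    """Return the molecular weights (Da) in the order they appear."""
    return [float(value) for value in MW_LINE.findall(report_text)]
```

Running pepstats once on a multi-sequence FASTA file and parsing the single report should be much cheaper than one call per sequence.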

9.3 years ago
Gareth Morgan ▴ 310

It depends on how accurate you want to be: whether you want to include post-translational modifications, pH-labile hydrogens or disulphide bonds. But for a high-throughput method you can just add up the residue weights (110 Da is the average weight of a residue, not of a free amino acid, iirc) and then add an extra 18 Da for the free termini. There's a utility in BioPython that will do this easily:

http://biopython.org/DIST/docs/api/Bio.SeqUtils.ProtParam-module.html
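For illustration, a short sketch using the `ProteinAnalysis` class from that module. The wrapper name `protein_mw_kda` is hypothetical, and it assumes Biopython is installed and that the sequence contains only the 20 standard one-letter codes.

```python
def protein_mw_kda(sequence: str) -> float:
    """Molecular weight in kDa of an unmodified protein sequence."""
    # Imported here so the rest of a program still loads without Biopython.
    from Bio.SeqUtils.ProtParam import ProteinAnalysis

    # molecular_weight() returns the weight in Da; divide to get kDa.
    return ProteinAnalysis(sequence.upper()).molecular_weight() / 1000.0
```

For hundreds of thousands of sequences, this can be called in a loop over records read with `Bio.SeqIO.parse`.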

There are equivalents in other languages too.
