Protein_Change in MAF
11 months ago
im • 0

I was given MAF files (perhaps not standard, I'm not sure), and I am wondering which protocol, format, or standard the "Protein_Change" column follows, in case anyone recognizes it, as I cannot find any documentation on it. I originally thought it corresponded to the amino acid change (which is what I need), but some rows in these files have the same Codon_Change (minus the location) yet different values in the Protein_Change column, so I am now confused. For example:

What I want to do is group the mutations by identical amino acid change, but I can't figure out a good way to do that.

MAF mutations amino acids • 451 views

The trinucleotide GCC can occur many times in a coding sequence, so comparing the "same Codon_Change (minus the location)" is meaningless: the location is the key component. The Protein_Change column seems to follow HGVS conventions (only to an extent, as synonymous variants should ideally be written with "=", like so: p.A22=), so the values should be easy to handle.
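If it helps, here is a minimal sketch of how the grouping could be done in plain Python. It assumes the values are one-letter, HGVS-like substitutions such as p.A22A, p.A22= or p.R132H, and that each row is a dict with "Hugo_Symbol" and "Protein_Change" keys; the gene column name and the helper names `normalize` and `group_by_amino_acid_change` are my own assumptions, not something taken from your files. Anything that is not a simple substitution (frameshifts, indels, empty cells) is skipped rather than guessed at.

```python
import re
from collections import defaultdict

# One-letter, HGVS-like substitution: p.<ref><position><alt>,
# where alt may be "=" (synonymous) or "*" (stop). Assumed format.
PROTEIN_CHANGE = re.compile(r"^p\.([A-Z*])(\d+)([A-Z*=])$")


def normalize(protein_change):
    """Return (ref_aa, position, alt_aa), or None if not a simple substitution."""
    match = PROTEIN_CHANGE.match(protein_change.strip())
    if match is None:
        return None
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if alt == "=":
        # p.A22= and p.A22A describe the same synonymous change.
        alt = ref
    return ref, pos, alt


def group_by_amino_acid_change(rows):
    """Group rows by (gene, ref_aa, position, alt_aa); unparsable rows are skipped."""
    groups = defaultdict(list)
    for row in rows:
        key = normalize(row["Protein_Change"])
        if key is None:
            continue
        groups[(row["Hugo_Symbol"],) + key].append(row)
    return dict(groups)
```

Rows could come from `csv.DictReader(handle, delimiter="\t")`, for example. If you want to group by residue change regardless of position, drop the position from the key, but keep in mind that the location is what actually identifies the mutation.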


For additional reference, the description of the HGVS format can be found on their website (https://varnomen.hgvs.org/) or in the paper (https://onlinelibrary.wiley.com/doi/full/10.1002/humu.22981). There are also Python packages for parsing the format (https://hgvs.readthedocs.io/en/stable/index.html).