Question: Mutation clustering in protein domains
9 weeks ago, Lio04 wrote:

Hi everyone,

Via targeted resequencing, I have identified missense mutations in my gene of interest. These missense mutations are enriched in patients versus control individuals. I made a lollipop plot to visualise where these mutations are located on the protein and to see whether they occur in protein domains. I now want to know whether certain protein domains are enriched for these mutations, or whether these mutations tend to cluster in any other region of the protein.

What is the correct way to achieve this? Are there any papers or tools that I can look into?

Thanks a lot for your suggestions.

9 weeks ago, Mensur Dlakic wrote:

I am not sure whether this is just the way you are describing it, or if your understanding of protein domains is different from mine. In proteins, there is no division between "domains" and "other regions of the protein." Sure, there may be short linkers or unfolded parts in some proteins, but for practical purposes all proteins have a domain organization.

If you submit your protein to Pfam, it will generate its domain organization. A domain organization of a random protein is shown here, and you can easily check whether your mutations fall within the boundaries of the same domain.
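Once the domain boundaries are known, one simple way to go beyond counting by eye is a one-sided binomial test. The sketch below is only an illustration under a strong assumption: every residue is equally likely to be hit, so the expected fraction of mutations in a domain is its length divided by the protein length. The function name and all numbers are hypothetical, not taken from the question.

```python
from math import comb


def binomial_enrichment_p(n_in_domain, n_total, domain_len, protein_len):
    """One-sided P(X >= n_in_domain) for X ~ Binomial(n_total, p).

    Assumption: mutations fall uniformly along the protein, so
    p = domain_len / protein_len.
    """
    p = domain_len / protein_len
    return sum(
        comb(n_total, k) * p**k * (1 - p) ** (n_total - k)
        for k in range(n_in_domain, n_total + 1)
    )


# Hypothetical example: 12 missense mutations in total, 8 of them inside a
# 100-residue domain of a 500-residue protein.
p_value = binomial_enrichment_p(8, 12, 100, 500)
print(f"one-sided binomial p = {p_value:.2e}")
```

If several domains are tested this way, the p-values would need a multiple-testing correction, and the uniform assumption is worth revisiting if mutability varies along the coding sequence.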

What may be more informative is to check whether your mutations cluster together spatially, even if they are some distance apart in the same domain or even in different domains. For that you would need a 3D structure (or 3D model) of your protein, and I don't know whether one is available for your protein.


Yes, I already used Pfam to retrieve the domain organization of my protein of interest and mapped the mutations onto it, so I know where they are located. My question is how to test for clustering bioinformatically or statistically, so that I can present the result more rigorously than by visual confirmation.

I could find a 3D model via the SWISS-MODEL Repository. Is there a tool or statistical test I can use instead of checking for clustering visually?

Thank you very much!

Reply written 9 weeks ago by Lio04

If you have a structure (or a model) of your protein, coloring mutated residues differently from the rest should help you visualize whether they are in close spatial proximity. PyMOL can do that easily: an extensive tutorial is here, and you will specifically need the selection commands. If you want to quantify this beyond visualization, PyMOL can also measure distances between residues. You can safely assume that residues closer than 8-10 angstroms in space are part of the same "patch" within a molecule, and it may be appropriate to use an even larger distance.

Reply written 9 weeks ago by Mensur Dlakic
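The distance idea above can also be turned into a rough permutation test outside PyMOL. The following is a minimal sketch, not an established tool: it reads C-alpha coordinates from standard fixed-column PDB ATOM records, computes the mean pairwise distance between the mutated residues, and asks how often a random set of residues of the same size is at least as compact. The function names, the toy coordinates, and the residue numbers are all assumptions made for illustration.

```python
import itertools
import math
import random


def read_ca_coords(pdb_lines, chain="A"):
    """Return {residue number: (x, y, z)} for C-alpha atoms of one chain."""
    coords = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA" and line[21] == chain:
            coords[int(line[22:26])] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def mean_pairwise_distance(coords, residues):
    """Mean C-alpha distance over all pairs of residues (needs >= 2 residues)."""
    pairs = list(itertools.combinations(residues, 2))
    return sum(math.dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)


def spatial_clustering_p(coords, mutated, n_perm=10000, seed=0):
    """Permutation p-value: fraction of random residue sets at least as compact."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, mutated)
    all_residues = sorted(coords)
    hits = sum(
        mean_pairwise_distance(coords, rng.sample(all_residues, len(mutated))) <= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data standing in for a real model: 60 residues on a straight line,
# 3.8 angstroms apart, with four hypothetical mutated residues next to each other.
toy_coords = {i: (3.8 * i, 0.0, 0.0) for i in range(1, 61)}
observed, p_value = spatial_clustering_p(toy_coords, [10, 11, 12, 13], n_perm=2000)
print(f"mean pairwise distance = {observed:.2f} A, permutation p = {p_value:.4f}")
```

With a real model, `read_ca_coords(open("model.pdb"))` would supply the coordinates (the file name is a placeholder). Drawing random sets uniformly from all residues ignores surface exposure and sequence context, so the result is a first pass rather than a definitive answer.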

I will look into it, thank you very much!

Reply written 9 weeks ago by Lio04

A word of caution about visual confirmation: mutations can occur preferentially in certain nucleotide sequence contexts (e.g. CpG), which are not necessarily evenly distributed across the CDS of a protein.

Reply written 9 weeks ago by Collin820
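One way to take this caution into account is to replace the uniform null with a mutability-weighted one. The sketch below assumes you already have a relative mutability weight for every residue (for example derived from the trinucleotide context of its codon); how to obtain such weights is outside the scope of this example. The function name, the weights, and the counts are hypothetical.

```python
import random


def weighted_domain_enrichment_p(weights, domain, n_mut, n_in_domain,
                                 n_sim=10000, seed=0):
    """Simulated one-sided p-value for domain enrichment under a weighted null.

    weights: relative mutability per residue (index 0 is residue 1), assumed given.
    domain:  (start, end) residue numbers, 1-based and inclusive.
    """
    rng = random.Random(seed)
    positions = range(1, len(weights) + 1)
    start, end = domain
    hits = 0
    for _ in range(n_sim):
        # Sampling with replacement, so recurrent hits at one residue are allowed.
        draws = rng.choices(positions, weights=weights, k=n_mut)
        if sum(start <= pos <= end for pos in draws) >= n_in_domain:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical 500-residue protein with a domain at residues 101-200,
# carrying 8 of 12 observed mutations.
uniform = [1.0] * 500
cpg_rich = [3.0 if 101 <= pos <= 200 else 1.0 for pos in range(1, 501)]

p_uniform = weighted_domain_enrichment_p(uniform, (101, 200), 12, 8)
p_weighted = weighted_domain_enrichment_p(cpg_rich, (101, 200), 12, 8)
print(f"uniform null: p = {p_uniform:.4f}; weighted null: p = {p_weighted:.4f}")
```

In this toy setup the same observation looks far less surprising once the domain is assumed to be three times more mutable than the rest of the protein, which is the effect the caution above describes.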
9 weeks ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

This paper titled "Clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase" may give you some ideas on how to test for clustering along a sequence.
