Question: Mutation clustering in protein domains
9 months ago, Lio04 wrote:

Hi everyone,

Via targeted resequencing, I have identified missense mutations in my gene of interest. These missense mutations are enriched in patients versus control individuals. I made a lollipop plot to visualise where these mutations are located on the protein and to see whether they occur in protein domains. I now want to know whether certain protein domains are enriched for these mutations, or whether these mutations tend to cluster in any other region of the protein.

What is the correct way to achieve this? Are there any papers or tools that I can look into?

Thanks a lot for your suggestions.

modified 9 months ago by Jean-Karim Heriche • written 9 months ago by Lio04
9 months ago, Mensur Dlakic wrote:

I am not sure whether this is just the way you are describing it, or whether your understanding of protein domains is different from mine. In proteins, there is no division between "domains" and "other regions of the protein." Sure, there may be short linkers or unfolded parts in some proteins, but for practical purposes all proteins have a domain organization.

If you submit your protein to Pfam, it will generate its domain organization. A domain organization of a random protein is shown here, and you can easily check whether your mutations fall into the same domain boundary.

What may be more informative is to check whether your mutations cluster together spatially, even if they are some distance apart in the same domain or even in different domains. For that you would need a 3D structure (or 3D model) of your protein, and I don't have enough information to know whether something like that is available.

written 9 months ago by Mensur Dlakic
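
Once the domain boundaries are known, one simple way to put a number on "do the mutations fall into this domain more often than expected" is a one-sided binomial test. The sketch below assumes that, under the null, every residue is equally likely to be hit, so the chance of landing in a domain is just its share of the protein length. That assumption ignores differences in mutability along the sequence, and the function name and the counts in the example are hypothetical, not taken from this thread.

```python
from math import comb


def domain_enrichment_pvalue(n_in_domain, n_total, domain_len, protein_len):
    """One-sided binomial P(X >= n_in_domain).

    Null model (an assumption): each mutation lands on a residue chosen
    uniformly at random, so it hits the domain with probability
    domain_len / protein_len.
    """
    p = domain_len / protein_len
    return sum(
        comb(n_total, k) * p**k * (1 - p) ** (n_total - k)
        for k in range(n_in_domain, n_total + 1)
    )


# Hypothetical example: 6 of 10 missense mutations fall in a
# 100-residue domain of a 500-residue protein.
p_value = domain_enrichment_pvalue(6, 10, 100, 500)
print(f"p = {p_value:.4g}")
```

If several domains are tested this way, the resulting p-values would need a multiple-testing correction.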

Yes, I already used Pfam to retrieve the domain organization of my protein of interest and mapped the mutations onto it, so I know where they are located. My question is exactly how to test for clustering bioinformatically/statistically, so that I can present the results more rigorously than by visual confirmation alone.

I was able to find a 3D model via the Swiss-Model Repository. Is there a tool or test I can use instead of checking for clustering visually?

Thank you very much!

written 9 months ago by Lio04

If you have a structure (or a model) of your protein, coloring the mutated residues differently from the rest should help you see whether they are in close spatial proximity. PyMOL can do that easily: an extensive tutorial is here, and you will specifically need the Selection commands. If you want to quantify this beyond visualization, PyMOL can also measure distances between residues. You can safely assume that residues closer than 8-10 angstroms in space are part of the same "patch" within a molecule, and it may be appropriate to use an even larger distance.

written 9 months ago by Mensur Dlakic
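
To turn measured distances into a test, one option is a permutation test on the mean pairwise distance between mutated residues: compare the observed value with the values obtained for randomly chosen residue sets of the same size. The sketch below is only an illustration under stated assumptions. It assumes one representative coordinate per residue (for example the C-alpha atom) has already been extracted from the structure or model into a dictionary, and the function names and the toy coordinates are hypothetical.

```python
import random
from itertools import combinations
from math import dist


def mean_pairwise_distance(coords, residues):
    """Average Euclidean distance over all pairs of the given residues."""
    pairs = list(combinations(residues, 2))
    return sum(dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)


def spatial_cluster_pvalue(coords, mutated, n_perm=999, seed=0):
    """Permutation p-value for 'mutated residues are closer than random sets'.

    coords  : dict mapping residue number -> (x, y, z) in angstroms
    mutated : list of at least two distinct mutated residue numbers
    """
    rng = random.Random(seed)
    all_residues = sorted(coords)
    observed = mean_pairwise_distance(coords, mutated)
    hits = sum(
        mean_pairwise_distance(coords, rng.sample(all_residues, len(mutated)))
        <= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (not a real structure): 50 residues on a straight line, 3.8 A apart.
toy_coords = {i: (3.8 * i, 0.0, 0.0) for i in range(1, 51)}
observed, p_value = spatial_cluster_pvalue(toy_coords, [10, 11, 12, 13])
print(f"mean pairwise distance = {observed:.2f} A, p = {p_value:.3f}")
```

Sampling residues uniformly is itself an assumption; it treats every position as equally likely to carry a missense mutation.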

I will look into it, thank you very much!

written 9 months ago by Lio04

A word of caution about visual confirmation: mutations can preferentially occur in certain nucleotide sequence contexts (e.g. CpG), which are not necessarily evenly distributed across the CDS of a protein.

written 9 months ago by Collin
9 months ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

This paper titled "Clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase" may give you some ideas on how to test for clustering along a sequence.

written 9 months ago by Jean-Karim Heriche
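
For clustering along the sequence, a generic option (not necessarily the approach used in the paper mentioned above) is a sliding-window scan statistic with a simulated null: take the largest number of mutations found in any window of a fixed width and compare it with what random placement produces. In the sketch below, the function names, the window width, and the example positions are all hypothetical. The optional `weights` argument is an assumed way to supply per-residue mutability (for example to reflect CpG content) instead of a uniform null.

```python
import random


def max_window_count(positions, window):
    """Largest number of mutations within any stretch of `window` residues."""
    pos = sorted(positions)
    best, start = 0, 0
    for end in range(len(pos)):
        # Shrink the window until it spans fewer than `window` residues.
        while pos[end] - pos[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


def sequence_cluster_pvalue(positions, protein_len, window=20,
                            n_perm=999, weights=None, seed=0):
    """Simulation p-value for the scan statistic.

    weights : optional per-residue relative mutability (length protein_len);
              None means every residue is equally likely (an assumption).
    """
    rng = random.Random(seed)
    residues = range(1, protein_len + 1)
    observed = max_window_count(positions, window)
    hits = 0
    for _ in range(n_perm):
        # Sampling with replacement allows recurrent hits on one residue.
        simulated = rng.choices(residues, weights=weights, k=len(positions))
        if max_window_count(simulated, window) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical mutation positions in a 500-residue protein.
example_positions = [101, 103, 104, 108, 110, 112, 300, 450]
observed, p_value = sequence_cluster_pvalue(example_positions, 500)
print(f"max mutations in a 20-residue window = {observed}, p = {p_value:.3f}")
```

The result depends on the chosen window width, so the width is best fixed before looking at the data.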