Question: Best Clustering Algorithms for Mutation Data?
blazer9131 wrote:

Hey y'all.

I have a project with about 50-60 samples with exome sequencing data. I have genotyped these samples, and ~150 genes carry mutations of different kinds: missense, nonsense, indels, amplifications, deletions, etc. I tiered them by biological significance: a 3 is significant impact, a 2 is some impact, and a 1 is little impact. A sample with no mutation in a given gene gets a 0.

I imported this into R as a data frame and tried classic hierarchical clustering with hclust, then made a few heatmaps/dendrograms. I used the ward.D2 method for my analysis, but I'm not very skilled in statistics, so I'm not sure whether there is a better algorithm for this dataset. Would anyone know a better method/algorithm? I'm trying to classify/group these samples using the exonic data I have.
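For reference, the workflow described above might look roughly like the sketch below. The matrix is simulated stand-in data (the real data was not posted), and the sample count, mutation frequencies, and `k = 4` are all assumptions chosen only for illustration.

```r
# Hypothetical stand-in for the real data: 55 samples x 150 genes,
# each cell an impact tier from 0 (no mutation) to 3 (significant impact).
set.seed(1)
n_samples <- 55
n_genes <- 150
tiers <- matrix(
  sample(0:3, n_samples * n_genes, replace = TRUE,
         prob = c(0.85, 0.05, 0.05, 0.05)),
  nrow = n_samples,
  dimnames = list(paste0("sample", seq_len(n_samples)),
                  paste0("gene", seq_len(n_genes)))
)

# Euclidean distance between samples (rows). ward.D2 takes unsquared
# distances and squares them internally.
d <- dist(tiers, method = "euclidean")
hc <- hclust(d, method = "ward.D2")

# Cut the tree into k groups; k = 4 is an arbitrary choice here.
groups <- cutree(hc, k = 4)
print(table(groups))

# plot(hc) would draw the dendrogram for visual inspection.
```

Note that Euclidean distance treats the tiers as evenly spaced numbers, so the gap between 0 and 1 counts the same as the gap between 2 and 3. Whether that is appropriate depends on what the tiers are meant to express.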

written 5 days ago by blazer9131

Please include sample data. What did you expect the result to be, and what result did you actually get when you ran your analysis? With that information we can help improve your analysis.

written 5 days ago by anicet.ebou

Clustering is about grouping items by similarity/proximity. You need to define what similarity/proximity is relevant in your case, i.e. what items in the same cluster should share that differentiates them from items in other clusters. That determines which similarity measure to use. The choice of clustering algorithm can then depend on what you know or assume about the cluster structure.
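To make the point about similarity measures concrete, here is a minimal sketch in base R comparing two possible definitions of similarity on the same matrix. The data is simulated (an assumption, since no sample data was posted), and both the Jaccard-versus-Manhattan comparison and the average linkage are just illustrative choices, not recommendations.

```r
# Hypothetical tier matrix: 55 samples x 150 genes, values 0 to 3.
set.seed(1)
tiers <- matrix(
  sample(0:3, 55 * 150, replace = TRUE, prob = c(0.85, 0.05, 0.05, 0.05)),
  nrow = 55,
  dimnames = list(paste0("sample", 1:55), paste0("gene", 1:150))
)

# Option A: only presence/absence of a mutation matters.
# dist(method = "binary") treats non-zero as "on" and gives the Jaccard
# distance, so genes unmutated in both samples do not count as similarity.
d_jaccard <- dist(tiers, method = "binary")

# Option B: the impact tier matters. Manhattan distance sums the absolute
# tier differences across genes.
d_manhattan <- dist(tiers, method = "manhattan")

# Average linkage is used here because it works with any dissimilarity,
# whereas Ward's criterion is usually justified for Euclidean distances.
g_jaccard <- cutree(hclust(d_jaccard, method = "average"), k = 4)
g_manhattan <- cutree(hclust(d_manhattan, method = "average"), k = 4)

# Cross-tabulate the two partitions to see how much they agree.
agreement <- table(jaccard = g_jaccard, manhattan = g_manhattan)
print(agreement)
```

If the two partitions disagree substantially on the real data, that is a sign the choice of similarity measure matters more than the choice of clustering algorithm.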

written 4 days ago by Jean-Karim Heriche