Visualizing model-based clustering with fviz_mclust
3.4 years ago
lessismore ★ 1.2k

Dear all,

I have a data matrix with samples in rows and genes in columns. I am using the fviz_mclust() function from the factoextra package, as explained here: http://www.sthda.com/english/articles/30-advanced-clustering/104-model-based-clustering-essentials/
When I plot the clusters with this function (see "Visualizing model-based clustering" on that page):

fviz_mclust(mc, "classification", geom = "point",
pointsize = 1.5, palette = "jco")


I would like to use repel = TRUE to show the sample names, but apparently that is not possible. Could someone tell me why, or help me with this?

factoextra clustering • 1.7k views

The repel parameter is available for the fviz_cluster() function. To use it with fviz_mclust(), you may have to implement the repel functionality yourself via geom_text_repel() or geom_label_repel() from the ggrepel package.

For example:

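# geom_text_repel() is a plot layer, not a theme, so add it with `+`; this assumes the plot data has a "name" column
fviz_mclust(mc, "classification", geom = "point", pointsize = 1.5, palette = "jco") + ggrepel::geom_text_repel(aes(label = name))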


That is untested, though.
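
For anyone who wants to build the labelled plot by hand, here is a minimal sketch with ggplot2 and ggrepel. It assumes the diabetes example data from mclust and a PCA projection computed with prcomp(); the data frame df and its column names are made up for illustration, and the axes will not necessarily match what fviz_mclust() draws.

```r
library(mclust)
library(ggplot2)
library(ggrepel)

data("diabetes")
X  <- diabetes[, -1]          # drop the class column
mc <- Mclust(X)               # model-based clustering

# Project the samples onto the first two principal components.
# Assumption: a scaled PCA is a reasonable 2-D view for this illustration.
pc <- prcomp(X, scale. = TRUE)

# Hypothetical plotting data frame: coordinates, cluster, and sample name
df <- data.frame(
  PC1     = pc$x[, 1],
  PC2     = pc$x[, 2],
  cluster = factor(mc$classification),
  name    = rownames(X)
)

# Points coloured by cluster, with non-overlapping sample labels
p <- ggplot(df, aes(PC1, PC2, colour = cluster)) +
  geom_point(size = 1.5) +
  geom_text_repel(aes(label = name), size = 3, show.legend = FALSE)

print(p)
```

Swapping geom_text_repel() for geom_label_repel() draws the same labels inside boxes.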


Thanks Kevin. Could you please tell me, or point me to a link that shows, how to implement it? I'll try to remember it for the rest of my life!

3.4 years ago

Any parameter that is used with fviz_cluster() can also be used with fviz_mclust(); moreover, you can plot both types of geoms, i.e., text and points:

library(mclust)
library(factoextra)           # provides fviz_mclust()
data("diabetes")
mc <- Mclust(diabetes[, -1])  # drop the class column before clustering

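# Text labels drawn at the point positions (labels may overlap)
fviz_mclust(mc, "classification", geom = c("text", "point"), pointsize = 1.5, palette = "jco", repel = FALSE)

# Same plot, with the labels pushed apart so they do not overlap
fviz_mclust(mc, "classification", geom = c("text", "point"), pointsize = 1.5, palette = "jco", repel = TRUE)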



Thanks Kevin.
Given your expertise in this, could you tell me why some objects of the same cluster are not inside the same circle? What does that mean? I've seen that they might have a lower probability of belonging to the cluster, and that's OK, but why doesn't the circle include them?


The ellipse functionality in fviz_mclust() is merely a wrapper around the stat_ellipse() function of ggplot2. It draws an ellipse around the objects to show which of them are more likely to be members of a particular cluster. Objects falling outside the ellipse are therefore less confident members of the cluster, at the level that you choose. This will occur more often in the following situations:

• the objects are more spread and there is high covariance
• the objects exhibit heteroskedasticity (unequal variance in different parts of the cluster)

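You can control the ellipse via the ellipse.type and ellipse.level parameters. The defaults appear to be 'norm' and 0.4, respectively, which means that your ellipses will be drawn assuming a normal distribution in the data and at the 0.4 level. Because ellipse.level is passed on to stat_ellipse() as its level argument, each ellipse is sized to enclose roughly 40% of a cluster's points under that assumption, so a majority of the members can legitimately fall outside it. Raising ellipse.level (for example, to 0.95) draws larger ellipses that enclose more of each cluster.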
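
To get a feel for what the level does, here is a small base-R sketch on simulated data (not the diabetes clusters). It assumes a 'norm' ellipse is the contour where the squared Mahalanobis distance equals 2 * qf(level, 2, n - 1), which is my reading of how stat_ellipse() sizes it; the variable names are made up for illustration.

```r
set.seed(1)
n <- 5000

# Simulate one correlated bivariate-normal "cluster"
sigma <- matrix(c(1, 0.6, 0.6, 2), nrow = 2)
x <- matrix(rnorm(2 * n), ncol = 2) %*% chol(sigma)

# Squared Mahalanobis distance of every point from the sample centre
d2 <- mahalanobis(x, colMeans(x), cov(x))

# Fraction of points inside the ellipse drawn at a given level
# (assumed radius formula: sqrt(2 * qf(level, 2, n - 1)))
frac_inside <- function(level) mean(d2 <= 2 * qf(level, 2, n - 1))

inside_40 <- frac_inside(0.4)
inside_95 <- frac_inside(0.95)
print(c(level_0.4 = inside_40, level_0.95 = inside_95))
```

On the real plot, the equivalent change would be something like fviz_mclust(mc, "classification", ellipse.type = "norm", ellipse.level = 0.95), which should draw ellipses that take in most of each cluster.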