Question

NMDS: how many axes to plot?

14 months ago

LaFra ▴ 10

Hi all,

I have a question about NMDS. I ran an NMDS on my data and it extracted 4 axes. I now have to plot my results, but I don't know whether I should plot all axis combinations or just the first 2. Is it correct to say that the first two axes best explain my data, so that I can plot just those two? Or do all the extracted axes have the same importance?

Thank you!

NMDS • 1.3k views

ADD COMMENT • link 14 months ago by LaFra ▴ 10

score 0 · Answer 1 · 2023-02-01

14 months ago

Mensur Dlakic ★ 27k

Unlike PCA, nMDS is not an eigenvalue-eigenvector technique (classical metric MDS, also known as PCoA, is eigen-based, but nMDS is not). This means that dimension 1 doesn't necessarily explain the most variance, dimension 2 doesn't necessarily explain the next greatest amount of variance, and so on. nMDS is also an iterative numerical optimization technique, and it isn't guaranteed to find a global minimum.
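One way to see why the axes carry no intrinsic order: stress depends only on the distances between points in the ordination, and those distances are unchanged when the whole configuration is rotated. Here is a minimal sketch in plain Python; the coordinates are made up for illustration, and `pairwise_distances` and `rotate` are hypothetical helper names, not part of any nMDS package.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def rotate(points, angle_deg):
    """Rotate 2-D points about the origin by angle_deg degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Hypothetical 2-D ordination scores (made-up values, illustration only).
config = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.5), (-1.2, 0.4)]
rotated = rotate(config, 37.0)

# Compare the inter-point distances before and after rotation.
same = all(
    math.isclose(d1, d2, abs_tol=1e-12)
    for d1, d2 in zip(pairwise_distances(config), pairwise_distances(rotated))
)
print(same)  # True when rotation left every pairwise distance unchanged
```

The individual axis scores change under rotation while the fit does not, so "axis 1" of an nMDS solution has no special status unless the software rotates the result afterwards.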

With these caveats, I wouldn't run nMDS with 4 dimensions and then try to figure out which 2 or 3 of them to plot. Instead, I would start with 2 components, plot and inspect, and then increase the number of components until the stress value stops decreasing appreciably. If 2-3 dimensions turn out to be enough, you don't have to worry about what to plot, as either would work. If stress keeps decreasing beyond the 3rd component, it may still be enough to plot the results of a 2- or 3-component analysis if data separation is satisfactory. While a low stress value is the main criterion for when to stop adding components, the visual representation can be used as well.
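The "increase components until stress stops improving" rule can be sketched roughly as follows. This assumes you have already run a separate nMDS for each number of dimensions and recorded the final stress. The function name `pick_dimensions`, the `min_gain` threshold, and the stress values are all made up for illustration and are not a recommended cutoff.

```python
def pick_dimensions(stress_by_k, min_gain=0.01):
    """Return the smallest k after which one more dimension lowers
    stress by less than min_gain.

    stress_by_k maps number of dimensions -> final stress of a
    separate nMDS run with that many dimensions.
    """
    ks = sorted(stress_by_k)
    for k, k_next in zip(ks, ks[1:]):
        if stress_by_k[k] - stress_by_k[k_next] < min_gain:
            return k
    return ks[-1]  # stress was still dropping at the largest k tried

# Made-up stress values, for illustration only.
example = {2: 0.24, 3: 0.17, 4: 0.165, 5: 0.163}
print(pick_dimensions(example))
```

Even when such a rule points to a higher number of dimensions, a 2- or 3-dimensional run may still be the one worth plotting if it separates the groups well.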

I talked here primarily about nMDS because that is what you asked about. However, it is very likely that t-SNE, UMAP and possibly PCA, with either 2 or 3 components, will give you a better visual representation than nMDS.

ADD COMMENT • link 14 months ago by Mensur Dlakic ★ 27k

Thank you very much for your answer. The thing is that I increased the number of components until I got an acceptable stress value: with 3 components it is above 0.2, with 4 components it is about 0.16. That's why I have to consider 4 components. So, if I understood well, do you suggest plotting just the first 2 components?

ADD REPLY • link 14 months ago by LaFra ▴ 10

So, if I understood well, do you suggest plotting just the first 2 components?

That is not what I am suggesting. As I said above, there is no order to components, and the first two may or may not be most informative.

What I suggest is to run nMDS only with two components, even if that doesn't produce the lowest stress value, and make a plot of those two components. Then do nMDS with 3 components, and plot them in 3D. If either or both plots are already informative and data points are separated well, you can stop there and use those plots, even with the knowledge that going for 4 components would have given you a lower stress score.

It is quite common that we plot only the first 2 or 3 PCA components, because that is how much we can plot at one time. Even when the first 2-3 components explain only 30-40% of total variance - that would be the nMDS equivalent of not having the lowest stress value - that is often enough to clearly see groups of data points.

The only difference is that with PCA we know that components are ordered by descending ability to explain variance, so the first 2-3 components will always explain at least as much variance as any other set of 2-3 components. That's why with PCA we can calculate 20 components but only plot the first 2-3, while with nMDS we can't calculate 20 components and rely on the order of components to decide which 2-3 of them to plot.
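To make the PCA side concrete, here is a small sketch showing that once components are sorted by eigenvalue, the leading pair explains at least as much variance as any other pair. The eigenvalues are made up for illustration (not from real data), and `explained_variance_ratio` is a hypothetical helper written for this example.

```python
from itertools import combinations

def explained_variance_ratio(eigenvalues):
    """Fraction of total variance per component, largest first (PCA order)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [v / total for v in ordered]

# Made-up covariance-matrix eigenvalues, for illustration only.
ratios = explained_variance_ratio([1.0, 4.0, 1.5, 2.5, 1.0])

first_two = ratios[0] + ratios[1]
best_pair = max(a + b for a, b in combinations(ratios, 2))
print(first_two, best_pair)  # the leading pair is also the best pair
```

nMDS has no comparable per-axis quantity to sort by, which is why the number of dimensions has to be fixed before the run rather than chosen afterwards.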