UMAP or TSNE for single cells?
5 months ago
synat.keam ▴ 100

Dear all,

I am slowly getting into single-cell analysis. I have been working with my data for 2.5 months now and have tried to learn the good practices for single-cell work.

I personally find UMAP tricky, especially when fine-tuning its parameters to get a good-looking embedding. I find t-SNE easier for producing a consistent 2D visualization. I would like to ask whether you think t-SNE is now popular for cluster visualization compared with UMAP. I am thinking of going forward with t-SNE and would appreciate your opinions.

Thanks,
Synat

single-cell • 1.1k views

They talk in that paper as if it were unclear whether t-SNE and UMAP preserve global distances. It is well known that they don't: only local distances are preserved. Global distances MAY be preserved, but I have seen simple embeddings (< 1000 data points) where they were not.
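
If you want to put numbers on the local-versus-global point for your own data, here is a minimal sketch in plain Python. It is not taken from the paper or from any library: the function names (`knn_preservation`, `distance_rank_correlation`) and the toy data are my own assumptions, and the brute-force distance computation is only practical for small subsamples. The first score is the fraction of each point's k nearest neighbours that survive in the embedding (local structure); the second is the Spearman correlation between all pairwise distances before and after embedding (global structure).

```python
import math
from itertools import combinations


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbours (self excluded)."""
    n = len(points)
    neighbours = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def knn_preservation(high, low, k):
    """Mean fraction of k nearest neighbours shared by both spaces (local structure)."""
    shared = sum(len(a & b) for a, b in zip(knn_sets(high, k), knn_sets(low, k)))
    return shared / (k * len(high))


def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for idx in order[start:end + 1]:
            ranks[idx] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def distance_rank_correlation(high, low):
    """Spearman correlation of all pairwise distances (global structure)."""
    pairs = list(combinations(range(len(high)), 2))
    d_high = [math.dist(high[i], high[j]) for i, j in pairs]
    d_low = [math.dist(low[i], low[j]) for i, j in pairs]
    return pearson(average_ranks(d_high), average_ranks(d_low))


# Toy data (made up): three tight groups A, B, C along one axis in 3-D.
# The hypothetical 2-D "embedding" keeps each group intact but swaps the
# positions of B and C, so neighbourhoods survive while layout does not.
offsets = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
centers_high = {"A": 0.0, "B": 10.0, "C": 20.0}
centers_low = {"A": 0.0, "B": 20.0, "C": 10.0}
high = [(centers_high[g] + dx, dy, 0.0) for g in "ABC" for dx, dy in offsets]
low = [(centers_low[g] + dx, dy) for g in "ABC" for dx, dy in offsets]

print("local  (k-NN preservation):", knn_preservation(high, low, k=3))
print("global (distance rank corr.):", distance_rank_correlation(high, low))
```

In this toy case the local score is perfect while the global score is not, which is the pattern described above. On real data you would pass the PCA coordinates (or whatever space the embedding was computed from) as `high` and the t-SNE or UMAP coordinates as `low`, ideally on a random subsample of cells.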

More food for thought:

https://www.nature.com/articles/s41587-020-00809-z


The Twitter thread associated with that paper says "they don't preserve local or global structure & are misleading".

Btw, there are some cool discussions about UMAPs today on Twitter by multiple computational biology professors (give them all a follow!).

5 months ago
jamesxli2007 ▴ 40

The t-SNE maps do enable some interesting analysis. The size, density, shape and relative positions of clusters all tell you something about the underlying data. For instance, a cluster shaped like a queue is most likely dominated by a single feature/gene. You just need to play with the method a little more.


Hard disagree. One should not go around making biological conclusions based on the fact that you see a "queue" shape. Different computational biologists diverge on whether such techniques are useful as visualization, but interpreting them in that way (i.e. a downstream analysis in the embedding space) is not the way to go. Many computational biologists, at the very least, caution against such interpretations.


Comparing t-SNE and UMAP, our experience is similar to yours: UMAP is far too unstable and produces too many fake clusters. Unfortunately, it has been widely promoted online by so-called computational biologists.
