Question: Difference between tSNE and PCA analysis
9
21 months ago by
Qingyang Xiao130
Stockholm
Qingyang Xiao130 wrote:

Hello! As clustering methods, what's the main difference between tSNE and PCA analysis?

rna-seq next-gen • 14k views
modified 21 months ago by Jeremy Leipzig (18k) • written 21 months ago by Qingyang Xiao (130)
1

Other than the math?

written 21 months ago by Devon Ryan (92k)
1

Yes, other than math, mainly for biological applications

written 21 months ago by Qingyang Xiao (130)
1

Both are used for dimensionality reduction; which one you apply depends on your variables and on the kind of inference you are interested in. PCA has long been a favorite tool for RNA-Seq, ChIP-Seq and also WES data, but with the arrival of scRNA-Seq and of large-scale SNP data for population genetic inference, t-SNE is coming in handy as well. The links I have already given in my answer below should suffice. I will post one more here with respect to human genetic data. Your question is too broad, so you probably need to do some background reading. The rest depends on the data you will be using, which determines which dimensionality reduction methods come into consideration.

written 21 months ago by ivivek_ngs (4.8k)
2

Nonsense - the question isn't too broad. If someone just asks for the "main difference" you should be able to explain it in a sentence or two instead of bombarding them with links.

written 21 months ago by Jeremy Leipzig (18k)

I guess you have to check what the OP wrote in the comments about biological applications.

written 21 months ago by ivivek_ngs (4.8k)

Many thanks for these fascinating answers!

written 21 months ago by Qingyang Xiao (130)
8
21 months ago by
Philadelphia, PA
Jeremy Leipzig18k wrote:

The main difference between t-SNE (or other manifold learning methods) and PCA is that t-SNE tries to deconvolute relationships between neighbors in high-dimensional data.

A classic example is the "swiss roll". To put the difference in layman's terms: t-SNE attempts to understand the underlying structure of the swiss roll. It does this by prioritizing neighboring points. PCA doesn't get what's going on - it doesn't see that the points are actually a line that's been rolled up.

Original data:

[image not preserved]

This PCA sucks (it thinks yellow is close to blue when in fact they are far away):

http://yinsenm.github.io/figure/STAT545/PCASwiss.png

In contrast, see how t-SNE seems to understand what's going on with this 'S'?

[image not preserved]
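To put a number on the "yellow is close to blue" problem, here is a toy sketch in plain Python. It runs neither PCA nor t-SNE; it only compares the two notions of distance involved, using a 2D spiral as an assumed cross-section of the swiss roll. All function names are made up for illustration. A linear projection such as PCA can only work with straight-line geometry, whereas a neighbor-based method cares about who is next to whom along the roll.

```python
import math

def swiss_roll_point(t):
    """Point on a 2D spiral (assumed cross-section of a swiss roll) at parameter t."""
    return (t * math.cos(t), t * math.sin(t))

def euclidean(p, q):
    """Straight-line distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def distance_along_roll(t0, t1, steps=10_000):
    """Approximate the path length along the spiral between t0 and t1
    by summing many short straight segments."""
    total = 0.0
    prev = swiss_roll_point(t0)
    for i in range(1, steps + 1):
        t = t0 + (t1 - t0) * i / steps
        cur = swiss_roll_point(t)
        total += euclidean(prev, cur)
        prev = cur
    return total

# Two points exactly one full turn apart, so they sit on adjacent layers.
t_a, t_b = 2 * math.pi, 4 * math.pi
a, b = swiss_roll_point(t_a), swiss_roll_point(t_b)

straight = euclidean(a, b)               # what a linear view of the data sees
along = distance_along_roll(t_a, t_b)    # how far apart they are on the roll

print(f"straight-line distance:  {straight:.2f}")
print(f"distance along the roll: {along:.2f}")
```

The two points look like near neighbors in straight-line terms, yet you have to travel several times further to get from one to the other along the roll itself. That gap is what the PCA plot above gets wrong.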

modified 21 months ago • written 21 months ago by Jeremy Leipzig (18k)
3
21 months ago by
ivivek_ngs4.8k
Seattle,WA, USA
ivivek_ngs4.8k wrote:

I can suggest some links that will give you a flavor of both methods, which are used for dimensionality reduction.

  1. Link1
  2. Link2
  3. For scRNA-Seq, check here
  4. For bulk RNA-Seq, check here
  5. If you are a fan of Kaggle, this link is also a fun way to see how the methods are used.
modified 21 months ago • written 21 months ago by ivivek_ngs (4.8k)
1

Thanks- I think in your point 5 you forgot to actually put the hyperlink.

written 21 months ago by dariober (10k)

Updated. Thanks for pointing it out.

written 21 months ago by ivivek_ngs (4.8k)
3
21 months ago by
dariober10k
WCIP | Glasgow | UK
dariober10k wrote:

Just a couple of comments... Neither tSNE nor PCA is a clustering method, even if in practice you can use them to see if/how your data form clusters. In many implementations, tSNE works downstream of PCA: it first computes the first n principal components and then maps these n dimensions to a 2D space. The original paper on tSNE is relatively accessible and, if I remember correctly, it has some discussion of PCA vs tSNE. Also, this post on tSNE is quite good, although not really about tSNE vs PCA.
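The PCA-then-tSNE pipeline described above can be sketched roughly as follows with scikit-learn. This is only an illustration under assumptions: the data are synthetic (three made-up groups standing in for an expression matrix), and n = 20 and perplexity = 30 are arbitrary choices, not recommendations. The PCA step is written out explicitly here instead of relying on any implementation's built-in preprocessing.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix: 120 samples x 200 features,
# drawn from three groups with different means (assumption for illustration).
centers = rng.normal(scale=5.0, size=(3, 200))
labels = np.repeat(np.arange(3), 40)
X = centers[labels] + rng.normal(size=(120, 200))

# Step 1: PCA keeps the first n principal components (n = 20 is arbitrary here).
n_components = 20
X_pca = PCA(n_components=n_components, random_state=0).fit_transform(X)

# Step 2: t-SNE maps those n dimensions down to a 2D space.
X_tsne = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(X_pca)

print(X_pca.shape)   # (120, 20)
print(X_tsne.shape)  # (120, 2)
```

Note that `X_tsne` is just a set of 2D coordinates for plotting. Assigning samples to clusters would still need a separate clustering step.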

written 21 months ago by dariober (10k)

Nice one, that is the reason I never used the term clustering ;)

written 21 months ago by ivivek_ngs (4.8k)