Question

Criteria for performing PCA or ICA

2.1 years ago

hamid sta ▴ 10

Hello, I'm currently analyzing scRNA-seq data. After quality control of my cells, I did the standard normalization, scaling, and variable-feature selection with sctransform, the tool developed by the Satija lab. For dimensionality reduction I started with PCA, but the clustering was not clear. Based on well-known marker genes, it appears that different cell types (in my case the "x zone" and "fasciculata") are not clustered separately, which is not what I expected. Increasing the clustering resolution just creates more "sub" clusters, which are too similar to annotate as different cell types.

However, after performing ICA instead of PCA, my two cell types, "x zone" and "fasciculata", are well separated into two different clusters, which is perfect, and all my marker genes match the clustering.

So it appears that ICA is better suited to my dataset. That's why I wonder whether there is any statistical criterion or evidence that can help me understand why ICA is better than PCA in my case?

Also, the proportion of variance explained is only 13% with 50 PCs and 11% with 20 PCs, which is really low. Maybe it's related to this?

Thank you

ICA scRNA PCA • 580 views

score 1 · Answer 1 · 2022-03-14

Dimensionality reduction methods do exactly what their name implies. The first principal component is the direction that explains the most variance in the data, and each subsequent PC is the direction, orthogonal to the earlier ones, that explains the most of what remains. PCA therefore ranks its components, and we know to include early PCs before late ones. When 50 PCs explain only 13% of the variance, either there is little structure in your data, or it is very noisy.
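To make the ranking concrete, here is a minimal sketch of how cumulative variance explained can be computed and compared. It runs on synthetic matrices, not on your data, and assumes NumPy is installed. The function name `cumulative_explained_variance` and the toy dimensions are hypothetical, chosen only for illustration.

```python
import numpy as np

def cumulative_explained_variance(X, n_components):
    """Fraction of total variance captured by the first n_components PCs."""
    Xc = X - X.mean(axis=0)                     # centre each feature (gene)
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values, descending
    ratio = s ** 2 / np.sum(s ** 2)             # per-PC share of total variance
    return np.cumsum(ratio)[:n_components]

rng = np.random.default_rng(0)
n_cells, n_genes = 300, 200

# Toy "structured" data: 3 latent factors plus moderate noise (an assumption).
latent = rng.normal(size=(n_cells, 3))
loadings = rng.normal(size=(3, n_genes))
structured = latent @ loadings + 0.5 * rng.normal(size=(n_cells, n_genes))

# Toy "noise only" data: no shared structure between genes.
noise_only = rng.normal(size=(n_cells, n_genes))

cev_structured = cumulative_explained_variance(structured, 10)
cev_noise = cumulative_explained_variance(noise_only, 10)

print("structured, first 3 PCs:", round(float(cev_structured[2]), 3))
print("noise only, first 3 PCs:", round(float(cev_noise[2]), 3))
```

In the structured case the first few PCs should capture most of the variance, while in the noise-only case the variance is spread thinly over many components, which is the pattern a low cumulative percentage hints at.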

ICA is more of a data separation technique than a data compression technique. Its components are not ranked, and they are needed collectively to separate the data. You may want to run PCA first and feed its first 50-100 PCs into ICA. Although ICA is adept at extracting signals from noisy data, it may help to let PCA take a first crack at removing the noise.
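The PCA-then-ICA idea can be sketched roughly as follows. This is only an illustration on synthetic data, assuming NumPy and scikit-learn are installed. The two Laplace-distributed "sources", the matrix sizes, and the choice of 50 PCs and 2 independent components are all assumptions made for the toy example, not recommendations for your dataset.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
n_cells, n_genes = 1000, 100

# Two hypothetical non-Gaussian sources, mixed into all genes plus noise.
sources = rng.laplace(size=(n_cells, 2))
mixing = rng.normal(size=(2, n_genes))
X = sources @ mixing + 0.3 * rng.normal(size=(n_cells, n_genes))

# Step 1: PCA as a denoising step, keeping the leading components.
n_pcs = 50
pcs = PCA(n_components=n_pcs, random_state=0).fit_transform(X)

# Step 2: ICA on the PC scores. Only 2 components are requested here
# because the toy data was built from exactly 2 sources.
ica = FastICA(n_components=2, random_state=0, max_iter=1000)
ics = ica.fit_transform(pcs)

# ICA components have no fixed order or sign, so compare by absolute correlation.
corr = np.abs(np.corrcoef(ics.T, sources.T)[:2, 2:])
best_match = corr.max(axis=0)   # best-matching component for each true source
print("best |correlation| per source:", np.round(best_match, 3))
```

Because independent components come out unordered and with arbitrary sign, the comparison uses the absolute correlation and takes the best match per source rather than assuming component 1 corresponds to source 1.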

This thread may not be a direct answer to your question, but it will hopefully give you some food for thought.