Hi, I have data from a specific cell type from mice fed a certain diet. I have 4 datasets in total, measured at four different times. My goal is to integrate the four datasets and do an integrated single-cell RNA-seq analysis. I have been using Seurat and following this vignette: https://satijalab.org/seurat/v3.1/immune_alignment.html
This is my first time doing an integrated analysis for scRNA-seq. Before integrating all 4 datasets, I first ran a clustering analysis on one of them (around 3,000 cells after pre-processing). For that single dataset, I included all 40 significant PCs for clustering and UMAP.
After that, I tried integrating 2 of the 4 datasets (around 4,700 cells after integration) just to see how Seurat would work. When I ran
JackStrawPlot(two.combined, dims = 1:50)
it showed that all 50 PCs (and possibly more than 50) are significant.
From this, I guess that once I integrate all 4 datasets, the number of significant PCs will very likely differ from the values observed so far (40 and 50).
In that case:
- Should I just ignore the previous PC counts, integrate all 4 datasets, and then use `JackStrawPlot` to determine how many PCs to include in the integrated analysis?
- Is there any relationship between the number of PCs and the number of CCs (the `dims` argument) used in `FindIntegrationAnchors`? For example, should the dimensions in `FindIntegrationAnchors` be greater than or equal to the number of PCs used in `RunUMAP`? If not, is there a specific way to determine an ideal number of CCs, comparable to the JackStraw plot or elbow plot that we use to choose the number of PCs?