Seurat dims and QC questions
13 days ago
reddyornah • 0

I am a beginner at this and am genuinely getting confused. I have been using elbow plots to guess what I should use for dims, and 30 looks like it makes sense, but just in case I also ran my data with 50, and the cell predictions are drastically different. Within each run I kept dims the same everywhere, including in FindTransferAnchors and TransferData. Why is my data coming out so different?

dims • 555 views
ADD COMMENT

A picture can be worth a thousand words. Consider posting plots (you can remove any sensitive info) to get useful suggestions/answers.

ADD REPLY

Well, I'm not generating any plots. I run the data to find predicted cells, then make a table of which cell types are predicted and their counts. The thing is that the counts are drastically different when I change the dims, and it makes me very unsure of what they should actually be, because I would assume there is an upper and lower limit.

ADD REPLY

That's a very thin explanation. Just show your plot, or generate one, so that people can see it and give their expert advice.

ADD REPLY

I'm not quite understanding what you've done so far, what you're trying to achieve, or how you are measuring success. But to address the question of how many dimensions from your PCA to use, a good place to start is by running RunUMAP() with a few different numbers of dimensions (as you've done). Pair this with clustering at a range of resolutions and see whether your UMAP and cluster assignments broadly agree with each other. Look for biological relevance in the clusters by running FindAllMarkers(). I would get a feel for the data this way before trying to transfer labels from another dataset; you'll then have some knowledge of the data with which to sanity check the cell type labels.
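
One concrete way to do that sanity check is to compare labels cell by cell rather than comparing count tables, since two runs can give similar totals while labelling different cells. The sketch below is a plain-Python illustration, not Seurat code; the function names and the six example labels are made up, and it assumes you have exported the per-cell labels from two runs in the same cell order.

```python
from collections import Counter


def crosstab(labels_a, labels_b):
    """Count cells for every (label in run A, label in run B) pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both runs must label the same cells in the same order")
    return Counter(zip(labels_a, labels_b))


def agreement(labels_a, labels_b):
    """Fraction of cells that received the same label in both runs."""
    if not labels_a:
        raise ValueError("no cells to compare")
    pairs = crosstab(labels_a, labels_b)
    same = sum(n for (a, b), n in pairs.items() if a == b)
    return same / len(labels_a)


# Hypothetical predicted labels for the same six cells from two runs.
dims30 = ["T cell", "T cell", "B cell", "NK", "B cell", "Monocyte"]
dims50 = ["T cell", "NK", "B cell", "NK", "B cell", "T cell"]

table = crosstab(dims30, dims50)
for (a, b), n in sorted(table.items()):
    print(f"{a:10s} -> {b:10s} {n}")
print(f"agreement: {agreement(dims30, dims50):.2f}")
```

In this toy example both runs report two T cells, yet only one cell is a T cell in both runs, which is exactly what a count table hides.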

ADD REPLY
13 days ago

The npcs in FindTransferAnchors correspond to the principal components summarizing the reference dataset. If you use fewer PCs than are needed to separate the cell types, you might lose some cell types. If you use more PCs than are needed, you add technical artifacts/noise to the model, which could end up wrongly annotating your cells.

Varying dims in TransferData has more or less the same explanation as above, but it applies to the anchors between your reference dataset and your query (in TransferData, dims sets the dimensions used to weight those anchors). If you use too many PCs, some anchors will reflect technical variation between your datasets, which could lead to misannotation.

I invite you to take a close look at the genes driving your anchors before transferring; they should characterize all of your cell types.
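
To put a rough number on where the elbow sits instead of reading it off the plot by eye, one heuristic is to find the last PC at which the percent of variance still drops by more than some threshold. The following is a plain-Python sketch of that idea, not a Seurat function; `suggest_n_pcs`, the threshold, and the standard deviations are all hypothetical, and the percentages are relative to the PCs you computed rather than to the total variance in the data.

```python
def suggest_n_pcs(stdevs, min_drop_pct):
    """Return the 1-based index of the PC just after the last large drop.

    stdevs: per-PC standard deviations in PC order (what an elbow plot shows).
    min_drop_pct: smallest drop in percent variance between consecutive PCs
        that still counts as "large" (an arbitrary, illustrative threshold).
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    pct = [100.0 * v / total for v in variances]

    n_pcs = 1  # fall back to a single PC if no drop exceeds the threshold
    for i in range(len(pct) - 1):
        if pct[i] - pct[i + 1] > min_drop_pct:
            n_pcs = i + 2  # PC after the drop, converted to 1-based indexing
    return n_pcs


# Hypothetical standard deviations for seven PCs: steep at first, then flat.
example_stdevs = [10.0, 8.0, 6.0, 3.0, 2.9, 2.85, 2.8]
suggested = suggest_n_pcs(example_stdevs, min_drop_pct=1.0)
print(f"suggested number of PCs: {suggested}")
```

Treat the result as a starting point only: try a few values around it and check whether the predicted labels stay stable.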
