Question

Comparison of UMAP single cell

7 months ago

synat.keam ▴ 100

Dear Fellow,

I'm currently learning to analyze single-cell RNA-seq data and am comparing my results with an analysis done by a bioinformatician. We analyzed the same 10X data from 9 individual patients. His UMAP looks nice, but mine looks a bit messy, with the cells clumped together. I could not access his code or have any contact with him. My code is below. I am not sure whether I did something wrong or whether I need to specify more parameters. I followed the Seurat QC steps and filtered out all low-quality cells, so I believe the QC is fine. Could you look over my code and suggest how to improve my UMAP? We both use the Seurat pipeline with Harmony for integration. I look forward to your suggestions.

Kind Regards,
Synat

#Merge Seurat Object!

All_arthritis <- merge(sampleA, y=c(sampleB, sampleC, sampleD, sampleE, sampleF, SampleG, SampleH))

#Normalize the data
All_arthritis_normal <- NormalizeData(All_arthritis) 
All_arthritis_normal <- FindVariableFeatures(All_arthritis_normal)

#Scale the data!
All_arthritis_normal_scale <- ScaleData(All_arthritis_normal) 

#Run PCA !
All_arthritis_normal_pca <- RunPCA(All_arthritis_normal_scale, verbose = FALSE)

#Plot PC components ! 
Elboplot<- ElbowPlot(All_arthritis_normal_pca, ndims = 50, reduction = "pca") #n= 30 is okay !
Elboplot

#Integration using harmony for umap
library(harmony)

## Harmony
All_arthritis_harm<- RunHarmony(All_arthritis_normal_pca, group.by.vars = "orig.ident")
All_arthritis_harm_umap <- RunUMAP(All_arthritis_harm, dims = 1:30)
All_arthritis_neigh <- FindNeighbors(All_arthritis_harm_umap, dims = 1:30)
All_arthritis_findclus <- FindClusters(All_arthritis_neigh, resolution = 0.1)

DimPlot(All_arthritis_findclus, reduction = "umap", group.by = "seurat_clusters")

single-cell • 559 views

ADD COMMENT • link 7 months ago by synat.keam ▴ 100

It's borderline impossible to reproduce single-cell results, in the sense of getting the exact same clusters and labels, without the exact same code, software versions, input data, and random seeds (if seeds were even set).

I could not access his code or have any contact with him

Following up on my comment above, and considering that your peer's legacy code is not available, I recommend starting over fresh and using your own output. Follow best practices such as the Seurat tutorials, https://bioconductor.org/books/release/OSCA/ or the Scanpy tutorials, and go with that.
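
As a minimal sketch of what would need to be recorded to make a run repeatable (base R only; the name `run_record` and the seed value are hypothetical, not from this thread): fix the seed before any stochastic step and save the session information alongside the results. Even with identical seeds, embeddings may still differ across package versions, which is why the versions are worth recording too.

```r
# Fix the global RNG seed before any stochastic step.
set.seed(1234)
draw_a <- runif(3)

# Resetting the same seed reproduces the same random draws.
set.seed(1234)
draw_b <- runif(3)
stopifnot(identical(draw_a, draw_b))

# Record the seed plus the R version and attached package versions.
# `run_record` is a hypothetical name chosen for illustration.
run_record <- list(
  seed    = 1234,
  session = sessionInfo()
)

# Persist the record next to the analysis output, e.g.:
# saveRDS(run_record, "run_record.rds")
print(run_record$session$R.version$version.string)
```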

ADD REPLY • link 7 months ago by ATpoint 82k

As ATpoint noted, it will be very hard to reproduce the same result without the exact same R code and the same software. I would also suggest that you do not need to define a separate variable at each step. For example, you can perform all the steps you have done here on the same All_arthritis Seurat object; that way you will save a lot of memory. An additional resource for learning single-cell RNA-seq data analysis is https://www.sc-best-practices.org/preamble.html
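
A sketch of that single-object pattern, reusing the function calls from the question. The wrapper name `preprocess_merged` is hypothetical, and the calls are namespace-qualified so that the definition itself does not depend on `library(Seurat)` having been attached:

```r
# Hypothetical helper: run the question's preprocessing steps while
# overwriting one Seurat object instead of creating a new variable per step.
preprocess_merged <- function(obj) {
  obj <- Seurat::NormalizeData(obj)
  obj <- Seurat::FindVariableFeatures(obj)
  obj <- Seurat::ScaleData(obj)
  obj <- Seurat::RunPCA(obj, verbose = FALSE)
  obj
}

# Usage (assumes All_arthritis is the merged object from the question):
# All_arthritis <- preprocess_merged(All_arthritis)
```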

ADD REPLY • link 7 months ago by bk11 ★ 2.4k

Thanks, seniors. I figured out the issue with my integration: I did not specify reduction = "harmony".
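
For readers hitting the same problem, here is a sketch of what that fix presumably looks like when applied to the code in the question. The assumption (not spelled out in the thread) is that `reduction = "harmony"` goes into both `RunUMAP()` and `FindNeighbors()`, which otherwise default to the uncorrected `"pca"` reduction. The wrapper name `cluster_on_harmony` is hypothetical:

```r
# Hypothetical helper: integrate with Harmony, then build the UMAP and the
# neighbor graph on the Harmony embedding rather than on the raw PCA.
cluster_on_harmony <- function(obj, dims = 1:30, resolution = 0.1) {
  obj <- harmony::RunHarmony(obj, group.by.vars = "orig.ident")
  obj <- Seurat::RunUMAP(obj, reduction = "harmony", dims = dims)
  obj <- Seurat::FindNeighbors(obj, reduction = "harmony", dims = dims)
  obj <- Seurat::FindClusters(obj, resolution = resolution)
  obj
}

# Usage (assumes the object already has a "pca" reduction):
# All_arthritis <- cluster_on_harmony(All_arthritis)
# Seurat::DimPlot(All_arthritis, reduction = "umap", group.by = "seurat_clusters")
```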

ADD REPLY • link 7 months ago by synat.keam ▴ 100