I think batch correction isn't working on my dataset (ft. RunHarmony)
14 months ago
kayah ▴ 20

Hi all, I'm analyzing scRNA-seq data and I don't think batch correction is working well. I want to see the aging difference between the young and old groups, so I want to batch-correct for sex, but the correction doesn't seem to work. (I don't have any replicates in my dataset.) Thank you so much!!

    WAT <- merge(WAT_M_Y, y = c(WAT_M_O, WAT_F_Y, WAT_F_O),
                 project = "WAT")
    WAT@meta.data$type <- c(rep("Young", ncol(WAT_M_Y)),
                            rep("Old", ncol(WAT_M_O)),
                            rep("Young", ncol(WAT_F_Y)),
                            rep("Old", ncol(WAT_F_O)))
    WAT@meta.data$sex <- c(rep("Male", ncol(WAT_M_Y)),
                           rep("Male", ncol(WAT_M_O)),
                           rep("Female", ncol(WAT_F_Y)),
                           rep("Female", ncol(WAT_F_O)))

    # Flag low-quality cells, then keep only those that pass
    WAT$lowQC <- ifelse(WAT$nFeature_RNA > 500 &
                          WAT$percent.mt < 10, "PASS", "FAIL")
    WAT <- subset(WAT, subset = lowQC == "PASS")
    WAT <- NormalizeData(WAT)
    WAT_variable <- FindVariableFeatures(WAT, selection.method = "vst", nfeatures = 2000)
    WAT_variable <- ScaleData(WAT_variable, vars.to.regress = c("percent.mt"))
    VariableFeaturePlot(object = WAT_variable)
    WAT_variable <- RunPCA(WAT_variable, features = VariableFeatures(WAT_variable), verbose = TRUE)
    # Batch correction on the PCA embedding, treating sex as the batch variable
    library(harmony)
    library(Rcpp)
    WAT_variable2 <- RunHarmony(WAT_variable, group.by.vars = "sex")

enter image description here

scRNAseq • 1.4k views
14 months ago
OmnibusX ▴ 100

The samples do look aligned. The remaining difference might come from differences in cell type composition between samples. One quick check I often use to assess batch correction is to color the cells by cell type label and see whether the same cell types cluster together. If they do, the batch correction has likely aligned the data appropriately across your batches, and the batch effect isn't dominating the biological signal, which is crucial for your analysis of age-related differences. If the same cell types still split into distinct clusters, the batch correction probably hasn't fully addressed the underlying batch effects.
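A minimal sketch of a numeric companion to that visual check, assuming the cell metadata has a cluster column, a `sex` column, and a cell type column. The helper `composition_by_cluster`, the `cell_type` column name, and the toy labels are hypothetical and not from the original post; the toy data frame only stands in for something like `WAT_variable2@meta.data`.

```r
# Fraction of each group (e.g. sex or cell type) within every cluster.
# `meta` is a data frame with one row per cell, such as a Seurat object's
# meta.data; the column names are arguments and are assumptions of this sketch.
composition_by_cluster <- function(meta, cluster_col, group_col) {
  counts <- table(meta[[cluster_col]], meta[[group_col]])
  prop.table(counts, margin = 1)  # each row (cluster) sums to 1
}

# Toy metadata with made-up values, standing in for real per-cell metadata.
toy_meta <- data.frame(
  seurat_clusters = c("0", "0", "0", "0", "1", "1", "1", "1"),
  sex             = c("Male", "Female", "Male", "Female",
                      "Male", "Male", "Male", "Female"),
  cell_type       = c("Adipocyte", "Adipocyte", "Adipocyte", "Adipocyte",
                      "Macrophage", "Macrophage", "Macrophage", "Macrophage")
)

# Clusters dominated by one sex hint at residual batch structure;
# clusters dominated by one cell type are what good integration looks like.
sex_mix  <- composition_by_cluster(toy_meta, "seurat_clusters", "sex")
type_mix <- composition_by_cluster(toy_meta, "seurat_clusters", "cell_type")
print(round(sex_mix, 2))
print(round(type_mix, 2))
```

For the plot itself, the UMAP would normally be computed on the Harmony embedding rather than the raw PCA, for example `RunUMAP(WAT_variable2, reduction = "harmony", dims = 1:30)` followed by `DimPlot(..., group.by = "sex")`; the choice of 30 dimensions here is an assumption, not something stated in the thread.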

enter image description here


enter image description here

Based on the plots generated using your guidance, it seems that batch correction either isn't necessary or has been applied effectively. Thank you for your kind and detailed response.

14 months ago
BioinfGuru ★ 2.1k

Using RLE (relative log expression) plots alongside PCA plots is a more helpful way to assess the presence/removal of technical variation.

Great YouTube explanation

Paper

Workflow
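A minimal base R sketch of what an RLE plot computes, assuming a genes-by-samples count matrix (for single-cell data this could be pseudobulk counts per sample, which is an assumption here). The helper `rle_matrix`, the pseudocount of 1, and the simulated data are hypothetical and only illustrate the idea; they are not taken from the linked resources.

```r
# Relative log expression: each gene's log expression in a sample minus
# that gene's median log expression across all samples.
rle_matrix <- function(counts, pseudocount = 1) {
  log_expr <- log2(counts + pseudocount)
  gene_medians <- apply(log_expr, 1, median)
  sweep(log_expr, 1, gene_medians, FUN = "-")
}

set.seed(1)
# Simulated genes x samples matrix (made-up data). Sample "S4" is scaled 3x
# to mimic an uncorrected technical shift such as a depth difference.
counts <- matrix(rpois(200 * 4, lambda = 50), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("S", 1:4)))
counts[, "S4"] <- counts[, "S4"] * 3

rle <- rle_matrix(counts)
sample_medians <- apply(rle, 2, median)
print(round(sample_medians, 2))

# Boxes centred on zero with similar spread suggest little unwanted variation;
# a box shifted away from zero flags a sample with a technical offset.
boxplot(rle, outline = FALSE, ylab = "Relative log expression")
abline(h = 0, lty = 2)
```

Running the same function on data before and after normalization or correction gives a side-by-side view of whether the per-sample boxes move toward zero.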
