Question: Why do tSNE and UMAP give ill-defined, unclear clusters?
7 months ago, wayj86 wrote:


I am using Seurat 3.1 to integrate my 11 samples (2 knock-out, 3 wild-type, 3 knock-in and 3 overexpression) via the standard workflow. The UMAP result seems ill-defined and unclear:

[image: UMAP plot]

In another attempt I also tried tSNE, but the result also looks weird:

[image: tSNE plot]

I tried different values of dims and the result didn't improve. Could you please tell me how to improve the tSNE and UMAP results when using Seurat 3.1? Thanks a million in advance.


rna-seq • 515 views

Not enough information to really answer. How are you integrating? Using Seurat's method or one of the wrappers? How many cell types would you expect based on your sample? Are you doing any differentiation? If so, the oddities in the UMAP structure would make more sense.

written 7 months ago by jared.andrews07

Thanks a lot for your answer. I expect 6-7 major cell types. The script I used to integrate my data is listed below:

    options(stringsAsFactors = FALSE)
    # For KIHO66
    KIHO66.data <- Read10X(data.dir = "./")
    KIHO66 <- CreateSeuratObject(counts = KIHO66.data, min.cells = 3, min.features = 200, project = "KIHO66_geneX") # 23117 features across 7691 samples within 1 assay
    KIHO66 <- RenameCells(KIHO66, add.cell.id = "KIHO66")
    KIHO66[["percent.mt"]] <- PercentageFeatureSet(KIHO66, pattern = "^mt-")
    VlnPlot(KIHO66, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"), ncol = 3)
    plot1 <- FeatureScatter(KIHO66, feature1 = "nCount_RNA", feature2 = "percent.mt")
    plot2 <- FeatureScatter(KIHO66, feature1 = "nCount_RNA", feature2 = "nFeature_RNA")
    CombinePlots(plots = list(plot1, plot2))
    KIHO66 <- subset(KIHO66, subset = nFeature_RNA > 200 & nFeature_RNA < 9000 & percent.mt < 5) # 23117 features across 6087 samples within 1 assay
    KIHO66 <- NormalizeData(KIHO66, normalization.method = "LogNormalize", scale.factor = 10000)
    KIHO66 <- FindVariableFeatures(KIHO66, selection.method = "vst", nfeatures = 2000)
    KIHO66$sample <- "KIHO66"
    KIHO66$treatment <- "KI"

..And another 10 samples...
reference.list <- list(KIHO66, KIHO82, KIHO96, KOHO37, KOHO41, WT23, WT26, WT84)
geneX.anchors <- FindIntegrationAnchors(object.list = reference.list, dims = 1:30)
geneX.integrated <- IntegrateData(anchorset = geneX.anchors, dims = 1:30)
DefaultAssay(geneX.integrated) <- "integrated"
geneX.integrated <- ScaleData(geneX.integrated, verbose = FALSE)
geneX.integrated <- RunPCA(geneX.integrated, npcs = 30)
geneX.integrated <- RunTSNE(object = geneX.integrated, dims.use = 1:30, do.fast = TRUE)
geneX.integrated <- FindNeighbors(geneX.integrated, reduction = "pca", dims = 1:30)
geneX.integrated <- FindClusters(geneX.integrated, resolution = 0.4)
p1 <- DimPlot(geneX.integrated, reduction = "tsne", group.by = "treatment")
p2 <- DimPlot(geneX.integrated, reduction = "tsne", label = TRUE)
plot_grid(p1, p2)
DimPlot(geneX.integrated, reduction = "tsne", group.by = "treatment", label = TRUE)
written 7 months ago by wayj86
7 months ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

Without access to the data, we can't do much to help. With these algorithms, setting the parameters correctly is important. For t-SNE, the main one is perplexity; for UMAP, the main parameters are the number of neighbours and, to a lesser degree, the minimum distance. I suggest you become familiar with the algorithms and their parameters. This page on how to use t-SNE effectively may help you. For UMAP, try this page on understanding UMAP, which also compares it with t-SNE. Also read the UMAP documentation.
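
As a rough illustration of trying out these parameters in Seurat 3, here is a minimal sketch. It assumes an object that already has a "pca" reduction, like `geneX.integrated` in the script above. The helper name `sweep_embeddings` and the parameter values are hypothetical, chosen only for the example. In Seurat 3 the argument that selects the PCs is `dims` (`dims.use` is the Seurat 2 name), so it may be worth checking that the `RunTSNE` call in the script really uses the intended dimensions.

```r
library(Seurat)

# Hypothetical helper (not part of Seurat): adds one t-SNE per perplexity value
# and one UMAP per n.neighbors value, each stored under its own reduction name
# so the embeddings can be compared side by side.
sweep_embeddings <- function(obj, dims, perplexities, n.neighbors, min.dist = 0.3) {
  for (p in perplexities) {
    # `perplexity` is passed through to the underlying Rtsne call
    obj <- RunTSNE(obj, reduction = "pca", dims = dims, perplexity = p,
                   reduction.name = paste0("tsne_p", p),
                   reduction.key = paste0("tSNEp", p, "_"))
  }
  for (k in n.neighbors) {
    obj <- RunUMAP(obj, reduction = "pca", dims = dims,
                   n.neighbors = k, min.dist = min.dist,
                   reduction.name = paste0("umap_n", k),
                   reduction.key = paste0("UMAPn", k, "_"),
                   verbose = FALSE)
  }
  obj
}

# Example usage (values are illustrative, not recommendations):
# geneX.integrated <- sweep_embeddings(geneX.integrated, dims = 1:30,
#                                      perplexities = c(10, 30, 50),
#                                      n.neighbors = c(15, 30, 50))
# DimPlot(geneX.integrated, reduction = "umap_n15", group.by = "treatment")
# DimPlot(geneX.integrated, reduction = "tsne_p50", group.by = "treatment")
```

Keeping each embedding under a separate reduction name means the clustering from `FindNeighbors` and `FindClusters`, which is computed on the PCA, stays the same while only the 2D layout changes.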


Thank you for your nice advice.

written 7 months ago by wayj86