Question: scRNA-seq, Seurat: correlation analysis of two replicates
8 months ago, Gregor Rot (Zurich, Switzerland) wrote:


I have two scRNA-seq datasets (replicate 1, replicate 2).

I would like to compare how many genes were detected and the average expression of genes. In brief, I want an estimate of how similar (reproducible) the replicates are, in terms of how well they correlate. Any hints, ideas, or pointers to code snippets showing the best approach to this? (I am learning Seurat but am happy to check out other software, like Scanpy.)

Currently I am trying to normalize the data and plot average gene expression for rep1 vs rep2.

Thanks for any help, Gregor

6 months ago, piyushjo wrote:

This doesn't exactly follow your approach to answering your question; however, you can check how similar the two replicates are by doing a merge analysis.

In this way, you are not performing any batch correction, just checking whether the two replicates are similar. In the example in the tutorial, the PBMC4k and PBMC8k samples are very close, even though they were (I am assuming) two different sequencing experiments.

Even if there is some difference in sequencing depth or in the number of cells, this will give you an idea of whether the two replicates are actually replicates.

If your two samples suffer from a batch effect, you might not see them overlapping. If they are cell lines, there is less chance of that unless something went wrong.


I second this method - it is by far the easiest way. You can see how to do so in a single line in the Seurat cheat sheet. After merging, run dimensionality reduction (PCA/t-SNE/UMAP), plot the result, and color by replicate to assess overlap.
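To make the idea concrete without depending on a particular Seurat version, here is a rough base R stand-in for "merge, reduce, color by replicate". It is only a sketch: the two count matrices are simulated toy data, the names `rep1`, `rep2` and `simulate_replicate()` are made up for illustration, and plain `prcomp()` on log-normalized counts is used instead of Seurat's own pipeline (in Seurat this would correspond roughly to `merge()` followed by the usual normalization, PCA and a `DimPlot()` grouped by replicate).

```r
set.seed(3)
n_genes <- 300

# Two hypothetical cell populations with different expression profiles
profile_a <- rgamma(n_genes, shape = 0.5, rate = 0.25)
profile_b <- rgamma(n_genes, shape = 0.5, rate = 0.25)

# Simulate one replicate: genes in rows, cells in columns (toy data only)
simulate_replicate <- function(n_a, n_b, prefix) {
  lambda <- cbind(matrix(profile_a, n_genes, n_a),
                  matrix(profile_b, n_genes, n_b))
  counts <- matrix(rpois(length(lambda), lambda), nrow = n_genes)
  dimnames(counts) <- list(paste0("gene", seq_len(n_genes)),
                           paste0(prefix, "_", seq_len(n_a + n_b)))
  counts
}
rep1 <- simulate_replicate(80, 60, "rep1")
rep2 <- simulate_replicate(70, 90, "rep2")

# "Merge": bind the cells of both replicates on their shared genes,
# without any batch correction
shared    <- intersect(rownames(rep1), rownames(rep2))
merged    <- cbind(rep1[shared, ], rep2[shared, ])
replicate <- factor(rep(c("rep1", "rep2"),
                        times = c(ncol(rep1), ncol(rep2))))

# Library-size normalization + log1p (scale factor 1e4 is an assumption)
lognorm <- log1p(t(t(merged) / colSums(merged)) * 1e4)

# PCA on cells; genes with zero variance are dropped so scaling works
keep <- apply(lognorm, 1, var) > 0
pca  <- prcomp(t(lognorm[keep, ]), center = TRUE, scale. = TRUE)

# Color cells by replicate: good replicates should intermingle
cols <- c("steelblue", "orange")
plot(pca$x[, 1:2], col = cols[replicate], pch = 16,
     main = "Merged replicates, no batch correction")
legend("topright", legend = levels(replicate), col = cols, pch = 16)
```

If the two colors separate into their own clouds rather than mixing within each cluster, that hints at a batch effect or a real difference between the replicates.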

Reply written 6 months ago by jared.andrews07
6 months ago, Kevin Blighe wrote:

Hi Gregor, apologies for the delay,

I do not believe there is anything built into Seurat for this. You could try an 'old friend' from the car package, though, i.e., scatterplotMatrix(). It performs a pairwise correlation / regression between all columns in your input data:

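    library(car)
    # 'x' is a placeholder: a numeric matrix or data frame with one column
    # per replicate/sample (e.g. per-gene average expression)
    scatterplotMatrix(x,
        regLine = list(method = lm, lty = 1, lwd = 2, col = "red2"),
        diagonal = "density",
        pch = '.',
        col = 'black',
        ellipse = TRUE, levels = c(0.5, 0.95), robust=TRUE)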

In this example, based on the parameters that I have chosen, a linear regression is fit to the data (red line), with the lower and upper 5% confidence intervals (dashed black lines).

Warning: don't run this on a data-matrix of many samples with 1000s of genes - it will crash your computer.
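One way to keep the input small is to collapse each replicate to a single average value per gene before plotting or correlating, which is also what the question describes. The following is a minimal base R sketch under stated assumptions: the count matrices are simulated toy data, `make_counts()` and `log_normalize()` are hypothetical helpers, and the scale factor of 10,000 is just a common convention, not something taken from this thread.

```r
set.seed(1)
n_genes    <- 200
gene_means <- rgamma(n_genes, shape = 0.5, rate = 0.5)

# Toy count matrix: genes in rows, cells in columns. In practice you
# would use the raw count matrices of replicate 1 and replicate 2.
make_counts <- function(n_cells, prefix) {
  matrix(rpois(n_genes * n_cells, lambda = rep(gene_means, times = n_cells)),
         nrow = n_genes,
         dimnames = list(paste0("gene", seq_len(n_genes)),
                         paste0(prefix, "_cell", seq_len(n_cells))))
}
rep1 <- make_counts(300, "rep1")
rep2 <- make_counts(250, "rep2")

# Library-size normalization followed by log1p. Cells with zero total
# counts should be filtered out beforehand to avoid division by zero.
log_normalize <- function(counts, scale_factor = 1e4) {
  log1p(t(t(counts) / colSums(counts)) * scale_factor)
}

# One average value per gene and replicate, restricted to shared genes
shared <- intersect(rownames(rep1), rownames(rep2))
avg <- data.frame(
  rep1 = rowMeans(log_normalize(rep1[shared, ])),
  rep2 = rowMeans(log_normalize(rep2[shared, ])),
  row.names = shared
)

pearson  <- cor(avg$rep1, avg$rep2, method = "pearson")
spearman <- cor(avg$rep1, avg$rep2, method = "spearman")
cat("Pearson:", round(pearson, 3), " Spearman:", round(spearman, 3), "\n")

# Scatter plot of average expression with the identity line for reference
plot(avg$rep1, avg$rep2, pch = 16, cex = 0.5,
     xlab = "rep1: mean log-normalized expression",
     ylab = "rep2: mean log-normalized expression")
abline(0, 1, col = "red2")
```

A data frame like `avg` has only one row per gene and one column per replicate, so it is a much lighter input for a pairwise plot than a full cell-level matrix. Keep in mind that averaging hides cell-to-cell variability, so a high correlation here does not rule out differences in cell-type composition.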



6 months ago, Friederike (United States) wrote:

I believe you're describing a couple of different features that you'd like to compare in addition to plain ol' correlation. The numbers of detected reads etc. can be assessed via in-built Seurat functions, such as shown in the basic processing vignette (also see the reply to "Where are QC metrics stored in Seurat?").

I personally find it relatively tedious to interact with Seurat objects, which is why I would highly recommend perusing the excellent guide to the scRNA-seq galaxy by the Bioconductor team. Their SingleCellExperiment object follows the well-established logic of the SummarizedExperiment object class, which makes it fairly straightforward to extract the QC metrics one is interested in to make customized plots.
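For the "how many genes were detected" part, the per-cell metrics are simple enough to compute by hand from the raw count matrices, whichever object class they come from. Below is a small base R sketch on simulated toy data; `toy_counts()` and `qc_per_cell()` are hypothetical helpers written for this example. As far as I know, Seurat keeps the equivalent per-cell values in the object's metadata as `nCount_RNA` and `nFeature_RNA` when the default assay is named "RNA", but check your own object.

```r
set.seed(2)

# Toy count matrix: genes in rows, cells in columns (simulated data)
toy_counts <- function(n_genes, n_cells, prefix) {
  matrix(rpois(n_genes * n_cells, lambda = 0.3), nrow = n_genes,
         dimnames = list(paste0("gene", seq_len(n_genes)),
                         paste0(prefix, "_", seq_len(n_cells))))
}
rep1 <- toy_counts(500, 100, "rep1")
rep2 <- toy_counts(500, 120, "rep2")

# Per-cell QC metrics for one replicate
qc_per_cell <- function(counts, label) {
  data.frame(
    replicate = label,
    n_umi     = colSums(counts),      # total counts per cell
    n_genes   = colSums(counts > 0)   # genes with at least one count
  )
}
qc <- rbind(qc_per_cell(rep1, "rep1"), qc_per_cell(rep2, "rep2"))

# Median depth and median number of detected genes per replicate
print(aggregate(cbind(n_umi, n_genes) ~ replicate, data = qc, FUN = median))

# Genes detected in at least one cell of each replicate, and their overlap
detected1 <- rownames(rep1)[rowSums(rep1 > 0) > 0]
detected2 <- rownames(rep2)[rowSums(rep2 > 0) > 0]
overlap   <- length(intersect(detected1, detected2))
jaccard   <- overlap / length(union(detected1, detected2))
cat("Detected in rep1:", length(detected1),
    " rep2:", length(detected2),
    " shared:", overlap, "\n")

# Side-by-side distributions of detected genes per cell
boxplot(n_genes ~ replicate, data = qc, ylab = "Genes detected per cell")
```

Comparing the distributions rather than only the totals helps separate a genuine difference between replicates from a plain difference in sequencing depth or cell number.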
