scRNA-seq, Seurat: correlation analysis of two replicates
4.2 years ago
Gregor Rot ▴ 540

Hello,

I have two scRNA-seq datasets (replicate 1, replicate 2).

I would like to compare how many genes were detected and the average expression of genes. In brief: an estimate of how similar (reproducible) the replicates are, in terms of how well they correlate. Any hints, ideas, or pointers to code snippets showing the best approach to this? (I am learning Seurat but happy to check out other software, like Scanpy.)

Currently I am trying to normalize the data and plot average gene expression, rep1 vs rep2.

Thanks for any help, Gregor

scRNA Seurat • 5.5k views
4.1 years ago

I believe you're describing a couple of different features that you'd like to compare in addition to plain ol' correlation. The numbers of detected genes, reads, etc. can be assessed via built-in Seurat functions, as shown in the basic processing vignette (also see the reply to "Where are QC metrics stored in Seurat?").

I personally find it relatively tedious to interact with Seurat objects, which is why I would highly recommend perusing the excellent guide to the scRNA-seq galaxy by the Bioconductor team. Their SingleCellExperiment object follows the well-established logic of the SummarizedExperiment object class, which makes it fairly straightforward to extract the QC metrics one is interested in and make customized plots.
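As a rough illustration of the QC comparison described above, here is a minimal base-R sketch that computes total counts and detected genes per cell for each replicate, plus the overlap of detected genes. The matrices `rep1` and `rep2` are simulated stand-ins (an assumption, not the asker's data), and the helper names `make_counts` and `qc_per_cell` are hypothetical. With real data you would start from the raw count matrices (genes in rows, cells in columns).

```r
# Simulated counts stand in for real data: genes in rows, cells in columns.
set.seed(1)
make_counts <- function(n_genes, n_cells, prefix) {
  matrix(rpois(n_genes * n_cells, lambda = 0.5), nrow = n_genes,
         dimnames = list(paste0("gene", seq_len(n_genes)),
                         paste0(prefix, "_cell", seq_len(n_cells))))
}
rep1 <- make_counts(200, 50, "rep1")
rep2 <- make_counts(200, 60, "rep2")

# Per-cell QC: total counts and number of detected genes (count > 0).
qc_per_cell <- function(counts, label) {
  data.frame(replicate = label,
             n_counts  = colSums(counts),
             n_genes   = colSums(counts > 0))
}
qc <- rbind(qc_per_cell(rep1, "rep1"), qc_per_cell(rep2, "rep2"))

# Median genes detected per cell, per replicate.
qc_summary <- aggregate(n_genes ~ replicate, data = qc, FUN = median)

# Genes detected in at least one cell of each replicate, and their overlap.
detected1 <- rownames(rep1)[rowSums(rep1 > 0) > 0]
detected2 <- rownames(rep2)[rowSums(rep2 > 0) > 0]
shared <- intersect(detected1, detected2)
```

From here, something like `boxplot(n_genes ~ replicate, data = qc)` gives a quick side-by-side view of the per-cell distributions.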

4.1 years ago
piyushjo ▴ 700

This doesn't exactly follow your approach to answering your question; however, you can check how similar the two replicates are by doing a merged analysis.

https://davetang.org/muse/2018/01/24/merging-two-10x-single-cell-datasets/

In this way, you are not performing any batch correction, just checking whether the two replicates are similar. In the tutorial's example, the PBMC4k and PBMC8k samples are very close, even though they were (I am assuming) two different sequencing experiments.

Even if there is some depth issue, or difference in number of cells, this will give you an idea if two replicates are actually replicates.

If your two samples suffer from a batch effect, you might not see them overlapping. If they are cell lines, there is less chance of that unless something went wrong.


I second this method - it is by far the easiest way. You can see how to do the merge in a single line in the Seurat cheat sheet. After merging, run dimensionality reduction (PCA/t-SNE/UMAP), plot the result, and color by replicate to assess overlap.
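The merge-and-color idea can be sketched without any single-cell package, which may help show what is going on underneath. This is a minimal base-R sketch on simulated counts (the data, the `sim` helper, and the 10,000-count scale factor are assumptions for illustration); it is not the Seurat workflow from the tutorial, just the same logic: merge without batch correction, normalize, run PCA, and compare replicates.

```r
set.seed(2)
n_genes <- 300
# Hypothetical per-gene expression rates shared by both replicates.
rates <- rgamma(n_genes, shape = 0.5, rate = 0.5)
sim <- function(n_cells, prefix) {
  m <- matrix(rpois(n_genes * n_cells, lambda = rates), nrow = n_genes)
  dimnames(m) <- list(paste0("gene", seq_len(n_genes)),
                      paste0(prefix, "_", seq_len(n_cells)))
  m
}
rep1 <- sim(80, "rep1")
rep2 <- sim(100, "rep2")

# Merge on shared genes; no batch correction is applied.
genes <- intersect(rownames(rep1), rownames(rep2))
merged <- cbind(rep1[genes, ], rep2[genes, ])
replicate <- rep(c("rep1", "rep2"), c(ncol(rep1), ncol(rep2)))

# Library-size normalization (10,000 counts per cell), then log1p.
lognorm <- log1p(t(t(merged) / colSums(merged)) * 1e4)

# PCA on cells, after dropping genes with zero variance.
keep <- apply(lognorm, 1, var) > 0
pca <- prcomp(t(lognorm[keep, ]), center = TRUE, scale. = TRUE)

# Mean position of each replicate on the first two PCs.
pc_by_rep <- aggregate(as.data.frame(pca$x[, 1:2]),
                       by = list(replicate = replicate), FUN = mean)
```

Plotting `pca$x[, 1:2]` with `col = factor(replicate)` then shows whether the two replicates intermingle; well-matched replicates should have similar centroids in `pc_by_rep` and no replicate-specific clusters.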

4.1 years ago

Hi Gregor, apologies for the delay,

I do not believe there is anything built into Seurat for this. You could try an 'old friend' from the car package, though, i.e., scatterplotMatrix(). It performs a pairwise correlation / regression between all columns in your input data:

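library(car)
# x: numeric matrix or data frame, one column per sample/replicate
# (argument forms below assume car >= 3.0)
scatterplotMatrix(x,
    regLine = list(method = lm, lty = 1, lwd = 2, col = "red2"),
    diagonal = list(method = "density"),
    pch = '.',
    col = 'black',
    ellipse = list(levels = c(0.5, 0.95), robust = TRUE))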

In this example, based on the parameters that I have chosen, a linear regression is fit to each pair of columns (red line), and data-concentration ellipses are drawn at the 50% and 95% levels.

Warning: don't run this on a data-matrix of many samples with 1000s of genes - it will crash your computer.
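One way to keep the input small is to collapse each replicate to a single per-gene average before correlating, which gives one column per replicate. Below is a minimal base-R sketch under stated assumptions: the counts are simulated, `sim` and `avg_expr` are hypothetical helpers, and the normalization (scale to 10,000 counts per cell, average, then log1p) is just one reasonable choice, not a Seurat function.

```r
set.seed(3)
n_genes <- 500
# Hypothetical per-gene expression rates shared by both replicates.
rates <- rgamma(n_genes, shape = 0.5, rate = 0.5)
sim <- function(n_cells, prefix) {
  m <- matrix(rpois(n_genes * n_cells, lambda = rates), nrow = n_genes)
  dimnames(m) <- list(paste0("gene", seq_len(n_genes)),
                      paste0(prefix, "_", seq_len(n_cells)))
  m
}
rep1 <- sim(100, "rep1")
rep2 <- sim(120, "rep2")

# Per-gene average: scale each cell to a fixed total, average over cells,
# then take log1p of the mean.
avg_expr <- function(counts, scale_factor = 1e4) {
  norm <- t(t(counts) / colSums(counts)) * scale_factor
  log1p(rowMeans(norm))
}

genes <- intersect(rownames(rep1), rownames(rep2))
avg <- data.frame(rep1 = avg_expr(rep1[genes, ]),
                  rep2 = avg_expr(rep2[genes, ]),
                  row.names = genes)

# Correlation of average expression between the two replicates.
pearson  <- cor(avg$rep1, avg$rep2, method = "pearson")
spearman <- cor(avg$rep1, avg$rep2, method = "spearman")
```

The two-column `avg` data frame is small enough to pass to `scatterplotMatrix()` or a plain `plot(avg$rep1, avg$rep2)`. Reporting both Pearson and Spearman is a hedge against a few highly expressed genes dominating the Pearson value.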

Kevin
