What does the dip in a density plot from a single-cell data analysis mean?
2.8 years ago

For basic QC, I generated this density plot of UMI count per cell. The colors represent two conditions. The black vertical line is the cutoff point at nUMI = 200 for low read counts.

My question is about the small dip at the lower left corner of the plot: it seems to indicate that around 15K-20K cells have a different density of counts from the majority of the cells. What could it mean? Does it mean doublets? Or something else? I would really appreciate your comments on this.

[Figure: density plot of UMI count per cell, colored by condition, with a black vertical cutoff line at nUMI = 200]

seurat scRNAseq • 1.4k views
2.8 years ago

To understand these plots, I suggest you think of the individual columns of your matrix not as cells but as barcodes. And those QC plots are precisely aimed at helping you identify barcodes that are most likely representative of droplets that managed to capture one (and only one, no more, no fewer) cell _and_ that also managed to capture that cell in a relatively healthy state so that the transcriptome is as fully represented as technically possible.

There are various scenarios that can play out for any given droplet:

| Number of cells in droplet | Consequence for the UMI detection |
|---|---|
| 0 cells were captured. This will, in fact, be the case for the majority (>90%) of the droplets in conventional 10X Chromium applications. | Those barcodes should have very low numbers of UMIs associated with them that primarily represent ambient RNA contamination (e.g. from burst cells). |
| Optimum: 1 intact cell was captured. | The number of UMIs and transcripts should be a function of the overall abundance of transcripts in the cell. These are the ones you want! |
| 1 dying cell was captured. | Apoptosis generally leads to membrane permeabilization and mRNA degradation, so cytoplasmic mRNA will be largely lost here. The majority of the remaining transcripts that survived the droplet-based lysis and cDNA synthesis will be those that are particularly resistant to stress, such as those found in the numerous mitochondria, where they are additionally protected by another layer of membranes. |
| Multiple cells are also sometimes captured, e.g. when the cells weren't dissociated sufficiently, or they are just "naturally sticky", or due to bad luck. Of course, this could also be a combination of a dying cell + one or more healthy cells. The sky's the limit. | If multiple cells and/or cell remnants were captured in the same droplet, the resulting UMIs for that barcode will represent a somewhat random sampling from all the cells within the droplet. Note that this is the case for about 5% of droplets in conventional 10X Genomics runs. |

Looking at those details, it should now be clear that barcodes associated with relatively low numbers of UMIs might be instances where no cell or a dying cell was captured, whereas barcodes with extraordinarily high numbers of UMIs are more likely to indicate that multiple cells were captured. However, this is biology, so nothing is truly black and white. Realistically, for barcodes associated with well above a dozen transcripts, it can become a bit tricky to distinguish technical failures/issues from biological factors such as cell size and general mRNA content.
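To make the "low-UMI mode vs. main mode" idea concrete, here is a minimal sketch in plain Python that bins log10(UMI count) per barcode and looks for the emptiest bin between the two modes. Everything in it is an assumption for illustration: the function names `log10_histogram` and `find_valley` are made up, the data are simulated (not taken from the plot in the question), and the valley search is a crude heuristic that assumes one mode lies in the lower half of the log range and one in the upper half. It is not a substitute for dedicated tools such as EmptyDrops.

```python
import math
import random


def log10_histogram(umi_counts, n_bins=30):
    """Bin log10(UMI count) per barcode; returns (bin_edges, bin_counts)."""
    logs = [math.log10(c) for c in umi_counts if c > 0]
    lo, hi = min(logs), max(logs)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in logs:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


def find_valley(edges, counts):
    """Return the UMI count at the emptiest bin between the two modes.

    Heuristic assumption: the low-UMI mode sits in the lower half of the
    bins and the main mode in the upper half.
    """
    mid = len(counts) // 2
    left_peak = max(range(mid), key=lambda i: counts[i])
    right_peak = max(range(mid, len(counts)), key=lambda i: counts[i])
    between = range(left_peak, right_peak + 1)
    valley = min(between, key=lambda i: counts[i])  # first minimum wins
    center = (edges[valley] + edges[valley + 1]) / 2
    return 10 ** center


# Simulated barcodes (purely illustrative): many low-UMI barcodes that mimic
# empty droplets with ambient RNA, plus fewer barcodes that mimic real cells.
random.seed(0)
low = [max(1, round(random.lognormvariate(math.log(50), 0.4))) for _ in range(3000)]
cells = [max(1, round(random.lognormvariate(math.log(5000), 0.5))) for _ in range(2000)]

edges, counts = log10_histogram(low + cells)
valley_umi = find_valley(edges, counts)
print(f"Dip between the two modes at roughly {valley_umi:.0f} UMIs")
```

If the dip in a real dataset sits close to the chosen cutoff, the barcodes to its left are candidates for empty droplets or damaged cells rather than a distinct cell population, but that interpretation needs to be checked against other metrics.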

The EmptyDrops paper has good technical explanations. The OSCA book has a pretty detailed run-down of typical quality controls and what they mean.

To address your question and to illustrate my point about ambiguity, here are two QC plots from different sample types from real-life single-cell data. As you can see, your density plot would probably not look too different for the two examples. The additional information from the mito content, however, suggests that one should perhaps employ different filtering strategies depending on the samples at hand. In short: check the mito content for your cells and see if those low-UMI barcodes also have exceedingly high mito content, which would support that these are low-quality transcriptomes.

[Figure: two QC plots from different sample types]
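The check described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a recommended pipeline: the helper names `mito_fraction` and `flag_barcodes`, the toy barcodes, and the `max_mito=0.2` threshold are all made up for the example. The `min_umi=200` default mirrors the cutoff mentioned in the question, and mitochondrial genes are assumed to be recognizable by an `MT-` prefix (matched case-insensitively so that mouse-style `mt-` symbols are also caught).

```python
def mito_fraction(gene_counts, mito_prefix="MT-"):
    """Fraction of a barcode's UMIs that come from mitochondrial genes."""
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    mito = sum(
        n for gene, n in gene_counts.items()
        if gene.upper().startswith(mito_prefix)
    )
    return mito / total


def flag_barcodes(counts_by_barcode, min_umi=200, max_mito=0.2):
    """Label each barcode by UMI count and mito fraction.

    Both thresholds are placeholders; sensible values depend on the sample.
    """
    flags = {}
    for barcode, gene_counts in counts_by_barcode.items():
        n_umi = sum(gene_counts.values())
        frac = mito_fraction(gene_counts)
        if n_umi < min_umi and frac > max_mito:
            label = "low UMI, high mito"
        elif n_umi < min_umi:
            label = "low UMI"
        elif frac > max_mito:
            label = "high mito"
        else:
            label = "pass"
        flags[barcode] = (n_umi, round(frac, 3), label)
    return flags


# Toy per-barcode gene counts, invented for illustration only.
toy = {
    "AAAC": {"MT-CO1": 60, "MT-ND1": 30, "ACTB": 10},
    "CCCG": {"MT-CO1": 20, "ACTB": 500, "GAPDH": 480},
    "GGGT": {"ACTB": 30, "MALAT1": 20},
    "TTTA": {"mt-Co1": 300, "Actb": 200},
}

flags = flag_barcodes(toy)
for barcode, (n_umi, frac, label) in flags.items():
    print(barcode, n_umi, frac, label)
```

In a Seurat workflow, the per-barcode mito percentage is typically obtained with `PercentageFeatureSet(object, pattern = "^MT-")`; the sketch above only shows the logic of combining it with the UMI cutoff.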

Thank you very much for the great explanation! Really appreciate it!
