Expression of gene of interest across conditions in single-cell data
2.6 years ago
kz • 0

Hi everyone,

I have a TPM gene expression matrix (single-cell RNA-seq data) from a publication, which I would like to use to explore expression of my gene of interest. The data consists of cells from 10 conditions, and I'd like to look at expression of the gene in each of them. Ultimately I'd like to create a visualisation showing the proportion of cells in each condition that express my gene of interest.

I've created a SingleCellExperiment object from the TPM values, but I'm not sure how to determine the proportion of cells in each condition that express my gene of interest, what cutoff to use for expression, etc. Besides the proportion of cells, is there a good way to determine the average expression of the gene for each condition? Something simple like a bar plot is all I wish to produce. Are there any recommended packages? Any advice is greatly appreciated!

Thank you!

Edit to add example data:

# SCE object 

class: SingleCellExperiment 
dim: 31211 2098 
assays(1): tpm
rownames(31211): ENSMUSG00000000001 ENSMUSG00000000003 ... ENSMUSG00000118639
rowData names(2): V1 V2
colnames(2098): SRR5993298 SRR5993299 ... SRR5995502 SRR5995505
colData names(2): V1 sample_type

# sample_type stored in colData is condition for each cell/run (SRRxxxx) in same order as colnames 

659 CK_2w_m2            
663 CK_2w_m2            
664 CK_2w_m2            
665 CK_2w_m2            
666 CK_2w_m2            
667 CK_2w_m2
singlecell scater tpm seurat Rna-seq • 1.8k views
2.6 years ago

Take a look at dittoSeq for viz options - it'll work out of the box for your SCE object (or a Seurat object or SummarizedExperiment). For comparing a gene between conditions, you will likely want to show violin and/or boxplots rather than just the average (see dittoPlot). You could also consider a heatmap (dittoHeatmap) or dotplot (dittoDotPlot) if you want to show many genes in the same figure.
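A minimal sketch of the kind of dittoPlot call described above, run on a toy SCE rather than the real data. The gene names, cell names, and condition labels here are made up for illustration; only the assay name `tpm` and the `sample_type` column mirror the object in the question.

```r
library(SingleCellExperiment)
library(dittoSeq)

# Hypothetical toy data: 3 genes x 40 cells of TPM-like values.
set.seed(1)
tpm <- matrix(rexp(120, rate = 0.1), nrow = 3,
              dimnames = list(paste0("Gene", 1:3), paste0("cell", 1:40)))
sce <- SingleCellExperiment(
  assays  = list(tpm = tpm),
  colData = DataFrame(sample_type = rep(c("ctrl", "treated"), each = 20))
)

# Violin + boxplot of one gene, grouped by condition.
# assay = "tpm" is given explicitly because this object has no other assay.
p <- dittoPlot(sce, "Gene1", group.by = "sample_type",
               plots = c("vlnplot", "boxplot"), assay = "tpm")
p
```

On the real object the same call would use an Ensembl ID from `rownames(sce)` in place of `"Gene1"`.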

As for the proportion of cells expressing a given gene, single-cell RNA-seq data tends to be very sparse, particularly 10X datasets. If you only have 50-100k reads per cell, many genes that are expressed will have no reads align, and of those that do, many will have only one or two reads align. As such, I'd consider a gene expressed if it has any reads aligning. This is what dot plots tend to do by default, so that may be a good way to visualize the change in proportions between your conditions.
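If you just want the numbers for a simple bar plot, the "any signal counts as expressed" rule can be sketched in base R. This is only an illustration on a toy matrix: the gene names, cell names, and condition labels are hypothetical, and the cutoff of 0 is the assumption from the paragraph above rather than a fixed standard.

```r
# Toy stand-in for the real data: a genes x cells TPM matrix and a
# per-cell condition vector. With the SCE from the question these would be
# assay(sce, "tpm") and sce$sample_type.
tpm <- matrix(
  c(0, 5, 0, 12, 3, 0,
    1, 0, 0,  0, 7, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB"), paste0("cell", 1:6))
)
condition <- c("ctrl", "ctrl", "ctrl", "treated", "treated", "treated")

gene <- "GeneA"
expr <- tpm[gene, ]

# A cell "expresses" the gene if its value is above the cutoff (0 = any signal).
cutoff <- 0
prop_expressing <- tapply(expr > cutoff, condition, mean)

# Mean expression per condition on a log2(TPM + 1) scale,
# which damps the influence of a few very high cells.
mean_log_expr <- tapply(log2(expr + 1), condition, mean)

# Simple bar plot of the proportion of expressing cells per condition.
barplot(prop_expressing, ylim = c(0, 1),
        xlab = "Condition",
        ylab = paste("Proportion of cells expressing", gene))
```

Raising `cutoff` (for example to 1 TPM) gives a stricter definition of "expressed"; it is worth checking how sensitive the proportions are to that choice.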


Thank you, it looks like dittoPlot is exactly what I need!

I'm running into an issue with it, however: when I run dittoPlot, R runs for a while and eventually crashes. I'm not sure if this has something to do with how I've stored "sample_type" in the SCE object.

dittoPlot(sce, "ENSMUSG00000000001", group.by = "sample_type")

Does it spit out an error message? That should work, but you can also open an issue on the GitHub repo.


Figured it out, thank you!!

