Question

Can I infer the fraction of replicating cells from bulk RNA-seq data?

7 months ago

txema.heredia ▴ 110

Hi,

This is a bit of a shot in the dark, but here goes nothing:

I have RNA-seq data from a cancer cell line: control vs. two treatments, at two timepoints (12 h and 48 h). I have already run the standard DESeq2 pipeline and GO term enrichment.

# DEGs       down 12h  up 12h  down 48h  up 48h
treatment A  110       113     575       104
treatment B  36        108     1233      603


The analysis shows significant enrichment of Cell Death among upregulated DEGs and of Cell Division among downregulated DEGs (in both treatments and at both timepoints).

When comparing the two timepoints within a treatment, Cell Division is enriched among downregulated DEGs, and there is no enrichment for Cell Death. I also detect some small differences when comparing treatment A vs treatment B.

I already have an idea of the dynamics of each Biological Process separately. However, I don't know of any way to compare the "magnitude" of these two almost mutually exclusive processes within each sample.

I basically want to measure "how many cells are dying" vs "how many cells evaded death/treatment and are dividing again". Is that possible?

Is there any way, using only this bulk RNA-seq data, to infer what fraction of the cells within each sample are dying / repairing / dividing?

replication apoptosis • 566 views

updated 7 months ago by Ram 43k • written 7 months ago by txema.heredia ▴ 110

I basically want to measure "how many cells are dying" vs "how many cells evaded death/treatment and are dividing again". Is that possible?

That's a classic example where you have to go back to the lab, if you ask me. I have no in silico suggestion, sorry. Maybe some sort of deconvolution approach with a reference of genes related to cell cycle and proliferation.

7 months ago by ATpoint 82k

I was thinking precisely of the Seurat methods for calculating cell-cycle-phase scores. However, I don't even know where to begin with the deconvolution, as there is no reference for my cell line (and it should all be a single cell type transformed into a cancerous monstrosity).
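To make concrete what a deconvolution would need, here is a minimal sketch on simulated data. It assumes two reference profiles ("pure dividing" and "pure dying" cells, on a linear expression scale) exist, which is exactly what is missing for this cell line. All names (`ref_dividing`, `ref_dying`, `estimate_fraction`) are hypothetical, and the two-state model is a simplification, not an established method for this data.

```r
# Minimal sketch on simulated data; the reference profiles are hypothetical.
set.seed(7)
n_genes <- 300

# Hypothetical linear-scale reference profiles for the two pure states
ref_dividing <- rexp(n_genes, rate = 1 / 100)
ref_dying    <- rexp(n_genes, rate = 1 / 100)

# Simulated bulk sample: 30% dividing, 70% dying, plus measurement noise
true_fraction <- 0.3
bulk <- true_fraction * ref_dividing + (1 - true_fraction) * ref_dying +
  rnorm(n_genes, sd = 5)

estimate_fraction <- function(bulk, ref_a, ref_b) {
  # Model: bulk = f * ref_a + (1 - f) * ref_b
  #   =>   bulk - ref_b = f * (ref_a - ref_b)
  # Least-squares solution for the single unknown f
  d <- ref_a - ref_b
  f <- sum((bulk - ref_b) * d) / sum(d^2)
  # Clamp to the valid range for a fraction
  min(max(f, 0), 1)
}

est <- estimate_fraction(bulk, ref_dividing, ref_dying)
print(est)
```

Without measured reference profiles, the estimate has nothing to anchor it, which is why the lack of a reference is the real obstacle here.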

From what I've been reading about gene set scoring methods, they are not really different from the fgsea methods that I have already used to measure enrichment in these samples.

I have just tried a very quick, dirty, and probably not adequate approach:

1. Normalize counts by library size
2. Sum the log10(x+1) values of the S- and G2/M-phase genes within each sample
3. Calculate the Z-score of each sum across all samples (control, both treatments, both timepoints)
4. Plot

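library(DESeq2)   # counts()
library(Seurat)   # cc.genes marker lists
library(dplyr)    # left_join()
library(ggplot2)

# S- and G2/M-phase marker genes shipped with Seurat
s.genes <- cc.genes$s.genes
g2m.genes <- cc.genes$g2m.genes

# Raw counts from the DESeq2 object (ddsm), rescaled so that every
# sample has the mean library size
tb <- counts(ddsm, normalized = FALSE)
ntb <- tb %*% diag(mean(colSums(tb)) / colSums(tb))
lntb <- log10(ntb + 1)

# Per-sample score: sum of log10 expression over the marker genes
# that are present in the count matrix
d_score <- cbind.data.frame(
  sample = colnames(tb),
  s = colSums(lntb[s.genes[s.genes %in% rownames(lntb)], ]),
  g2m = colSums(lntb[g2m.genes[g2m.genes %in% rownames(lntb)], ])
)

# mmd: sample metadata table (not shown), presumably holding the
# sample, timepoint and treatment columns used below
d_score <- left_join(d_score, mmd, by = "sample")

# Z-score of each summed score across all samples
d_score$z_s <- (d_score$s - mean(d_score$s)) / sd(d_score$s)
d_score$z_g2m <- (d_score$g2m - mean(d_score$g2m)) / sd(d_score$g2m)

ggplot(d_score, aes(x = z_s, y = z_g2m, shape = timepoint, color = treatment)) +
  geom_point() +
  theme_minimal() +
  theme(aspect.ratio = 1) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Z-score S-phase", y = "Z-score G2/M-phase")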
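One caveat with summing log counts is that highly expressed marker genes weigh more in the sum. A possible variant, sketched here on simulated data, is to Z-score each gene across samples first and then average within the gene set. The names `norm_counts`, `gene_set` and `score_gene_set` are hypothetical and this is not the Seurat scoring method.

```r
# Minimal sketch on simulated data; all names here are hypothetical.
set.seed(1)
n_genes <- 200
n_samples <- 6

# Simulated normalized counts: genes in rows, samples in columns
norm_counts <- matrix(
  rpois(n_genes * n_samples, lambda = 50),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# Hypothetical gene set; "not_in_data" mimics a marker absent from the matrix
gene_set <- c(paste0("gene", 1:20), "not_in_data")

score_gene_set <- function(norm_counts, genes) {
  # Keep only the set genes that are present in the matrix
  genes <- intersect(genes, rownames(norm_counts))
  log_expr <- log10(norm_counts[genes, , drop = FALSE] + 1)
  # scale() standardizes columns, so transpose to Z-score each gene across samples
  z <- t(scale(t(log_expr)))
  # Genes with zero variance give NaN; drop them before averaging
  z <- z[stats::complete.cases(z), , drop = FALSE]
  # Average Z-score of the set genes within each sample
  colMeans(z)
}

scores <- score_gene_set(norm_counts, gene_set)
print(round(scores, 2))
```

Because every gene is centered across samples, these scores are only relative: they rank samples against each other and do not give a fraction of cycling cells.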

By doing so, I can see a clear separation between treatments after 48h.

[Figure: scatter plot of Z-score S-phase (x) vs. Z-score G2/M-phase (y) per sample, colored by treatment, shaped by timepoint]

Is this method worth anything or is it completely flawed?

updated 7 months ago by Ram 43k • written 7 months ago by txema.heredia ▴ 110

Side note: These are wonderful results. You can make a really strong case for scRNA-seq exploration here.

7 months ago by Ram 43k