Can I infer the fraction of replicating cells from bulk RNA-seq data?
7 months ago
txema.heredia ▴ 110

Hi,

This is a bit of a shot in the dark, but here goes nothing:

I have RNAseq from a cancer cell line. Control vs 2 treatments. Two timepoints at 12 and 48h. I have already run the standard pipeline of DESeq2 and enrichment of GO terms.

#DEG         down 12h  up 12h  down 48h  up 48h
treatment A       110     113       575     104
treatment B        36     108      1233     603

The analysis shows a significant enrichment of Cell Death in upregulated DEG and of Cell Division among downregulated DEG (in both treatments, and at both timepoints).

When comparing the two timepoints within a treatment, Cell Division is enriched among downregulated DEG, and there is no enrichment for Cell death. I also detect some small differences when comparing treatment A vs treatment B.

I have already been able to get an idea of the dynamics of each Biological Process separately. However, I don't know of any way to compare the "magnitude" of the two almost-mutually-exclusive processes going on within each sample.

I basically want to measure "how many cells are dying" vs "how many cells evaded death/treatment and are dividing again". Is that possible?

Is there any way, using only this bulk RNA-seq data, to infer what fraction of the cells within each sample are dying / repairing / dividing?

replication apoptosis • 566 views

> I basically want to measure "how many cells are dying" vs "how many cells evaded death/treatment and are dividing again". Is that possible?

That's a classic case where you have to go back to the lab, if you ask me. I have no in silico suggestion, sorry. Maybe some sort of deconvolution using a reference of genes related to cell cycle and proliferation.


I was thinking precisely of the Seurat methods for calculating the cell-cycle-phase scores. However, I don't even know where to begin with the deconvolution, as there is no reference for my cell line (and it should all be a single cell type transformed into a cancerous monstrosity).
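To make concrete why the missing reference is the blocker, here is a toy sketch of what reference-based deconvolution needs. Everything in it is an assumption for illustration: the three cell states, the reference profiles, and the mixing fractions are simulated, and non-negative least squares through base R's `optim` stands in for whatever a dedicated deconvolution tool would do.

```r
# Toy deconvolution: all numbers below are simulated, not real data.
set.seed(1)
n_genes <- 200

# Hypothetical reference profiles (genes x cell states). In practice these
# would have to be measured, which is exactly what is missing here.
ref <- cbind(
  dividing = rexp(n_genes, rate = 1 / 100),
  arrested = rexp(n_genes, rate = 1 / 100),
  dying    = rexp(n_genes, rate = 1 / 100)
)

# Simulate one bulk sample as a known mixture of the states plus noise.
true_frac <- c(dividing = 0.2, arrested = 0.5, dying = 0.3)
bulk <- as.vector(ref %*% true_frac) + rnorm(n_genes, sd = 2)

# Least squares with non-negative weights (box-constrained L-BFGS-B).
sse  <- function(w) sum((bulk - ref %*% w)^2)
grad <- function(w) as.vector(-2 * crossprod(ref, bulk - ref %*% w))
fit  <- optim(par = rep(1 / 3, 3), fn = sse, gr = grad,
              method = "L-BFGS-B", lower = 0)

# Rescale the weights so they can be read as fractions.
est_frac <- fit$par / sum(fit$par)
names(est_frac) <- colnames(ref)
round(est_frac, 2)
```

The estimate is only as good as the columns of `ref`. Without measured profiles for each state in this cell line, there is nothing to regress the bulk sample against.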

From what I've been reading about gene set scoring methods, they do not really differ from the fgsea methods that I have already used to measure enrichment in these samples.
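For reference, a minimal sketch of what a per-sample gene set score can look like, in base R. This is neither fgsea nor Seurat's scoring; it is one simple variant (the mean of per-gene Z-scores). The expression matrix and the gene set are simulated, and names such as `expr` and `gene_set` are made up for the example.

```r
# Simulated expression matrix (genes x samples); values are made up.
set.seed(42)
expr <- matrix(rpois(50 * 6, lambda = 100), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("sample", 1:6)))

# Hypothetical gene set, simulated as lower in samples 4 to 6.
gene_set <- paste0("gene", 1:10)
expr[gene_set, 4:6] <- rpois(10 * 3, lambda = 50)

# Library-size normalisation followed by a log transform.
size_factor <- colSums(expr) / mean(colSums(expr))
logn <- log2(sweep(expr, 2, size_factor, "/") + 1)

# Standardise each gene across samples, then average over the set,
# so that no single highly expressed gene dominates the score.
z <- t(scale(t(logn)))
set_score <- colMeans(z[intersect(gene_set, rownames(z)), , drop = FALSE])
round(set_score, 2)
```

The result is one number per sample, which is the main practical difference from an enrichment test run on a ranked DEG list.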

I have just tried a very quick, dirty, and probably not adequate approach:

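  1. Normalize counts by library size
  2. Sum log10(x+1) over the S-phase and G2/M-phase genes in each sample
  3. Calculate the Z-score across all samples (control, both treatments, both timepoints)
  4. Plot

library(DESeq2)   # counts()
library(Seurat)   # cc.genes: S and G2/M marker lists (human gene symbols)
library(dplyr)    # left_join()
library(ggplot2)

s.genes <- cc.genes$s.genes
g2m.genes <- cc.genes$g2m.genes

# Raw counts from the DESeqDataSet, rescaled so that every sample has the
# mean library size. Rownames must be gene symbols to match cc.genes.
tb <- counts(ddsm, normalized = FALSE)
ntb <- sweep(tb, 2, colSums(tb) / mean(colSums(tb)), "/")
lntb <- log10(ntb + 1)

# Per-sample sum of log10(x+1) over the marker genes present in the matrix
d_score <- data.frame(
  sample = colnames(tb),
  s = colSums(lntb[intersect(s.genes, rownames(lntb)), , drop = FALSE]),
  g2m = colSums(lntb[intersect(g2m.genes, rownames(lntb)), , drop = FALSE])
)

# mmd is not defined in this post; presumably the sample metadata table
# with columns sample, timepoint, and treatment
d_score <- left_join(d_score, mmd, by = "sample")

# Z-score of each summed score across all samples
d_score$z_s <- (d_score$s - mean(d_score$s)) / sd(d_score$s)
d_score$z_g2m <- (d_score$g2m - mean(d_score$g2m)) / sd(d_score$g2m)

ggplot(d_score, aes(x = z_s, y = z_g2m, shape = timepoint, color = treatment)) +
  geom_point() +
  theme_minimal() +
  theme(aspect.ratio = 1) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Z-score S-phase", y = "Z-score G2/M-phase")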

By doing so, I can see a clear separation between treatments after 48h.

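[Figure: scatter plot of Z-score S-phase (x-axis) against Z-score G2/M-phase (y-axis); point color = treatment, point shape = timepoint]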

Is this method worth anything or is it completely flawed?
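One possible sanity check, sketched under assumptions: compare the score difference for the marker set against the same statistic for random gene sets of equal size, similar in spirit to the control features that Seurat's module scoring uses. The matrix, the `treated` columns, and the gene set below are simulated, and `score_diff` is a hypothetical helper, not part of any package.

```r
# Simulated log-expression matrix (genes x samples); values are made up.
set.seed(7)
expr <- matrix(rnorm(500 * 8, mean = 5), nrow = 500,
               dimnames = list(paste0("gene", 1:500), paste0("sample", 1:8)))

# Hypothetical marker set, simulated as lower in the treated samples.
gene_set <- paste0("gene", 1:20)
treated <- 5:8
expr[gene_set, treated] <- expr[gene_set, treated] - 1.5

# Difference in mean set score between treated and untreated samples.
score_diff <- function(genes) {
  s <- colMeans(expr[genes, , drop = FALSE])
  mean(s[treated]) - mean(s[-treated])
}

observed <- score_diff(gene_set)

# Null distribution: the same statistic for random sets of the same size.
null_diff <- replicate(
  1000, score_diff(sample(rownames(expr), length(gene_set)))
)

# Empirical two-sided p-value with the usual +1 correction.
emp_p <- (sum(abs(null_diff) >= abs(observed)) + 1) / (length(null_diff) + 1)
c(observed = observed, emp_p = emp_p)
```

If random sets shift as much as the S or G2/M markers do, the separation in the plot would more likely reflect a global expression change than a cell-cycle signal.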


Side note: These are wonderful results. You can make a really strong case for scRNA-seq exploration here.
