Producing Bulk samples from 10X data
11 weeks ago

Hi Everyone

I am aware of an approach called pseudobulking, in which bulk-like samples are generated from scRNA-seq data (in the absence of bulk data) to find which genes might be important at the population level. But my boss asked for something different, and I am not sure it is a correct way to generate bulk samples.

I was asked to sample 60% of the total reads from the FASTQs of 10X data (UMI data, 3' chemistry) to generate three replicates per sample, align them to the Plasmodium reference, run DESeq2 for DE analysis, and check the overlap of the resulting DEGs with the DEGs obtained from scRNA-seq (all clusters combined). I did what was asked, and I get seemingly ideal biological replicates. But the dispersion estimates look weird (I understand there will be almost no dispersion, given that the "biological replicates" are nearly identical). Nearly 66% of the detected genes come out as differentially expressed. Also, 60% of the scRNA-seq DEGs overlap with these artificial bulk-derived DEGs. Is this a good result?

I am confused about whether what I have been asked to do is even legitimate.


rnaseq scrnaseq deseq2 bulk seurat • 293 views
11 weeks ago
ATpoint 78k

The dispersion plot, as you say, is expected because you are creating pseudoreplication. Pseudobulks are typically created from the count matrix: you sum raw counts per cluster, cell type, group, or whatever makes sense. The pseudoreplication you are creating makes no sense to me. If you don't have replication, you cannot make it up.
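
To make the "sum raw counts per group" idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not the output of any particular tool: the function name `pseudobulk`, the toy barcodes, the gene names, and the sample/cluster labels are all made up. It assumes you already have raw UMI counts per cell plus a sample and a cluster label for each cell.

```python
from collections import defaultdict


def pseudobulk(counts, cell_sample, cell_group):
    """Sum raw counts over all cells that share a (sample, group) label.

    counts      : dict, cell barcode -> {gene: raw UMI count}
    cell_sample : dict, cell barcode -> biological sample ID
    cell_group  : dict, cell barcode -> cluster / cell type label
    Returns a dict mapping (sample, group) -> {gene: summed count}.
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in counts.items():
        key = (cell_sample[cell], cell_group[cell])
        for gene, n in gene_counts.items():
            bulk[key][gene] += n
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in bulk.items()}


# Hypothetical toy data: four cells, two genes, two samples, two clusters.
counts = {
    "cell1": {"geneA": 3, "geneB": 0},
    "cell2": {"geneA": 1, "geneB": 2},
    "cell3": {"geneA": 0, "geneB": 5},
    "cell4": {"geneA": 4, "geneB": 1},
}
cell_sample = {"cell1": "s1", "cell2": "s1", "cell3": "s2", "cell4": "s2"}
cell_group = {"cell1": "c0", "cell2": "c0", "cell3": "c0", "cell4": "c1"}

bulk = pseudobulk(counts, cell_sample, cell_group)
for key in sorted(bulk):
    print(key, bulk[key])
```

Each resulting column corresponds to one real biological sample within one group, so the replication in the downstream DE test comes from independent samples and not from resampling the same library.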

