DESeq2 analysis with high number of samples
2.2 years ago
Kumar ▴ 140

Hi all,

I am analysing RNA-Seq data to identify differentially expressed genes across a large number of samples (~200). As a first step, I ran STAR on 40 samples (20 affected and 20 unaffected) and generated a count matrix with featureCounts. I am now running DESeq2. When I run it on any 6 samples (3 affected and 3 unaffected), it reports some up- and down-regulated genes, but when I increase the number of samples to 8 or 10, it no longer reports any differentially expressed genes.

Please see the PCA plots. In PCA plot 2, which shows only 6 samples in total (3 affected, 3 unaffected), the groups separate better (though still not cleanly), and with this small number of samples I do find up- and down-regulated genes. With all 40 samples, however, no genes come out as differentially expressed. Please see the DESeq2 summaries and PCA plots below.

How can we do DGE analysis when we have more samples?

DESeq2 analysis with 6 samples:

> summary (res)
out of 20707 with nonzero total read count
LFC > 0 (up)       : 27, 0.13%
LFC < 0 (down)     : 59, 0.28%
outliers [1]       : 0, 0%
low counts [2]     : 17677, 85%


DESeq2 analysis with 40 samples:

> summary (res)
out of 42196 with nonzero total read count
LFC > 0 (up)       : 0, 0%
LFC < 0 (down)     : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%

RNA-Seq DESEQ2 next-gen • 857 views

You will never know until you add all 200 samples, so I would suggest doing that instead of dipping your toe in with small sample sizes. Judging by your first PCA plot, it is not surprising that there are no DE genes; however, the other 160 samples may separate better on the PCA plot :)

Have a read of the thread "No differentially expressed genes using DESeq2"; the answers there suggest some troubleshooting.
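For anyone following along, here is a minimal sketch of fitting all samples in a single DESeq2 run rather than in small subsets. It is not the poster's actual pipeline: the simulated data from `makeExampleDESeqDataSet` stands in for the real featureCounts matrix, and the column name `condition` with levels `A` and `B` are placeholders for the real sample sheet (for example, unaffected and affected).

```r
library(DESeq2)

# Simulated stand-in for the real data: 1000 genes x 40 samples, with a
# two-level factor `condition` (levels "A" and "B") in the sample table.
set.seed(1)
dds_sim   <- makeExampleDESeqDataSet(n = 1000, m = 40)
count_mat <- counts(dds_sim)                  # integer matrix, genes x samples
coldata   <- as.data.frame(colData(dds_sim))  # one row per sample

# The sample table must list samples in the same order as the count columns;
# a mismatch would silently assign samples to the wrong group.
stopifnot(identical(rownames(coldata), colnames(count_mat)))

dds <- DESeqDataSetFromMatrix(countData = count_mat,
                              colData   = coldata,
                              design    = ~ condition)
dds <- DESeq(dds)

# Compare level "B" against level "A" at a 10% FDR (DESeq2's default alpha).
res <- results(dds, contrast = c("condition", "B", "A"), alpha = 0.1)
summary(res)
```

With real data, `count_mat` would be the featureCounts matrix and `coldata` the sample sheet. The explicit `contrast` is there so the direction of the log fold change does not depend on the factor level order.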


I think it is understandable that no differential expression shows up with a larger number of samples, since these were picked at random (a mix of samples) from the patients and the normal controls. Yes! I would run all 200 samples and see.


You are posting this question prematurely, it seems. Please proceed with your experiment and then return if there are still 'issues'. By the way, just looking at your first PCA bi-plot, I would regard the control sample on the right as an outlier, and remove it. However, it may not appear as an outlier once you have profiled more samples.
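A rough sketch of how the PCA check and the removal of a suspected outlier could look in DESeq2. The data is again simulated with `makeExampleDESeqDataSet`, and the sample name `sample1` is purely hypothetical; which sample (if any) to drop has to come from the real PCA plot.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 40)  # simulated stand-in data

# Variance-stabilise the counts, then compute the first two principal
# components. Dropping returnData = TRUE draws the plot instead.
vsd    <- varianceStabilizingTransformation(dds, blind = TRUE)
pca_df <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)
head(pca_df)

# Hypothetical outlier: remove one sample by name and refit the model.
suspect   <- "sample1"
dds_clean <- dds[, colnames(dds) != suspect]
dds_clean$condition <- droplevels(dds_clean$condition)
dds_clean <- DESeq(dds_clean)
summary(results(dds_clean))
```

As noted above, a sample that looks like an outlier among 40 may not stand out once all 200 are profiled, so it is probably worth repeating the PCA on the full set before dropping anything.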
