Question: Subsetting before edgeR differential gene expression
3.1 years ago by
createanotherone0 wrote:

Hi all,

I am analysing an RNA-seq experiment with 8 treatment vs 8 control samples collected from primary tissue.

Having performed differential gene expression (DGE) analysis between these samples using an edgeR exact test, we noticed that genes specifically expressed in our cell type of interest (deduced from a separate study) appear to be systematically reduced in expression in the treatment group.

We believe this is likely due to a difference in cell composition between the two groups, despite due care taken in the collection procedure. More reads are consumed by genes specific to cells we aren't interested in, which reduces the number of reads over the genes specific to the cells we are interested in. That reduction in data for these genes is then falsely called as differential expression.
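To make the suspected composition effect concrete, here is a toy calculation. It is only a sketch with made-up numbers: the gene names, the counts and the `cpm` helper are hypothetical and not taken from our data. The target-cell gene gets the same number of reads in both samples, but the treatment sample carries more reads from other cell types, so its counts-per-million value halves.

```python
def cpm(counts):
    """Counts per million, scaled by the sample's total library size."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}

# Hypothetical read counts (illustration only). The target-cell gene is
# unchanged, but the treatment sample has more reads from other cells.
control = {"target_gene": 1000, "other_cells": 9000}
treatment = {"target_gene": 1000, "other_cells": 19000}

cpm_control = cpm(control)["target_gene"]
cpm_treatment = cpm(treatment)["target_gene"]

# Apparent fold change of the target gene after library-size scaling.
fold_change = cpm_treatment / cpm_control
print(cpm_control, cpm_treatment, fold_change)
```

In this toy case the apparent two-fold reduction comes entirely from the larger share of reads taken by the other cells, not from any change in the target gene itself.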

I was wondering, is it sound to subset the data to just the genes we are confident are specific to the cell type we want to investigate, normalise for the coverage across this gene set (like a pseudo-library size adjustment), and perform DGE just on this gene set? Perhaps with a conservative false discovery rate adjustment using the total number of expressed genes (not the number in the subset)?
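To spell out the arithmetic I have in mind, here is a minimal sketch in plain Python rather than edgeR. All names, counts and p-values are hypothetical, and the exact test itself is not implemented. It shows (1) a pseudo-library size computed from the subset genes only, and (2) a Benjamini-Hochberg adjustment where the number of tests is set to the total number of expressed genes instead of the subset size. As far as I know, `p.adjust` in R exposes the same idea through its `n` argument. This illustrates the proposal; it is not a claim that the approach is statistically sound.

```python
def pseudo_library_sizes(counts, subset):
    """Per-sample sum of reads over the subset genes only."""
    n_samples = len(next(iter(counts.values())))
    return [sum(counts[g][j] for g in subset) for j in range(n_samples)]

def subset_cpm(counts, subset):
    """CPM for subset genes, scaled by the pseudo-library sizes."""
    lib = pseudo_library_sizes(counts, subset)
    return {g: [1e6 * c / size for c, size in zip(counts[g], lib)]
            for g in subset}

def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjustment; n_tests may exceed len(pvalues)."""
    m = len(pvalues)
    n = m if n_tests is None else n_tests
    if n < m:
        raise ValueError("n_tests must be at least the number of p-values")
    # Walk from the largest p-value down, keeping a running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = m - offset  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: two control samples, then two treatment samples.
counts = {
    "target_a": [100, 120, 50, 60],
    "target_b": [300, 280, 150, 140],
    "other_x": [600, 600, 1800, 1800],
}
cell_type_genes = ["target_a", "target_b"]

lib_sizes = pseudo_library_sizes(counts, cell_type_genes)
cpm_subset = subset_cpm(counts, cell_type_genes)

# Hypothetical p-values for the subset genes (not from a real test).
subset_pvalues = [0.001, 0.01]
default_fdr = bh_adjust(subset_pvalues)
conservative_fdr = bh_adjust(subset_pvalues, n_tests=len(counts))
print(lib_sizes, cpm_subset, default_fdr, conservative_fdr)
```

With the larger `n_tests`, every adjusted p-value is at least as large as under the default, which is the conservative behaviour I was describing.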

Any advice would be greatly appreciated!



rna-seq R • 1.3k views

I was wondering the same (although using DESeq instead). Have you found an answer elsewhere?

written 2.3 years ago by lm68750