Subsetting before edgeR differential gene expression
8.2 years ago

Hi all,

I am analysing an RNA-seq experiment with 8 treatment vs 8 control samples collected from primary tissue.

After performing differential gene expression (DGE) analysis between these samples using an edgeR exact test, we noticed that genes specifically expressed in our cell type of interest (identified in a separate study) appear to be systematically reduced in expression in the treatment group.

We believe this is likely due to a difference in cell composition between the two groups, despite the care taken during collection. If the treatment samples contain a larger share of cells we aren't interested in, more reads are consumed by genes specific to those cells, leaving fewer reads for genes specific to the cells we are interested in. That reduction in coverage is then falsely called as differential expression.
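To make the suspected effect concrete, here is a minimal sketch in plain Python (not edgeR). All numbers, gene names (`T1`, `T2`, `O1`, `HK`) and the two expression profiles are made up for illustration, and it assumes every cell contributes the same amount of RNA. Per-cell expression is identical in both groups; only the fraction of target cells differs.

```python
# Illustrative, made-up profiles: the share of each cell type's reads that
# goes to each gene. T1/T2 are specific to the hypothetical target cell type,
# O1 is specific to the other cells, HK is shared by both.
TARGET = {"T1": 0.3, "T2": 0.1, "HK": 0.6}
OTHER = {"O1": 0.4, "HK": 0.6}


def mixture_counts(target_fraction, depth=1_000_000):
    """Expected read counts for a tissue that is a mix of the two cell types.

    Assumes equal RNA content per cell, so reads mix linearly.
    """
    genes = sorted(set(TARGET) | set(OTHER))
    return {
        g: depth * (target_fraction * TARGET.get(g, 0.0)
                    + (1.0 - target_fraction) * OTHER.get(g, 0.0))
        for g in genes
    }


def cpm(counts):
    """Counts per million, scaled by the sample's total library size."""
    total = sum(counts.values())
    return {g: 1e6 * n / total for g, n in counts.items()}


# Same per-cell expression, different composition: 50% vs 25% target cells.
control = cpm(mixture_counts(0.50))
treatment = cpm(mixture_counts(0.25))

fold_changes = {g: treatment[g] / control[g] for g in sorted(control)}
for gene, fc in fold_changes.items():
    print(gene, round(fc, 3))
```

Under these assumptions the target-specific genes appear halved in the treatment group and the other-cell gene appears increased, although nothing changed within any cell. The shared gene is unaffected.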

I was wondering, is it sound to subset the data to just the genes we are confident are specific to the cell type we want to investigate, normalise for the coverage across this gene set (like a pseudo-library size adjustment), and perform DGE just on this gene set? Perhaps with a conservative false discovery rate adjustment using the total number of expressed genes (not the number in the subset)?
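As a rough sketch of what I mean, again in plain Python rather than edgeR: `subset_cpm` rescales counts by a pseudo-library size taken over the subset only, and `bh_adjust` applies a Benjamini-Hochberg correction in which the number of tests can be set larger than the number of p-values supplied. The counts, p-values and the total of 30 expressed genes are hypothetical placeholders, not results from our data.

```python
def subset_cpm(counts, subset):
    """Counts per million relative to the summed counts of `subset` only."""
    pseudo_lib_size = sum(counts[g] for g in subset)
    return {g: 1e6 * counts[g] / pseudo_lib_size for g in subset}


def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted p-values.

    `n_tests` is the number of hypotheses to correct for. It defaults to
    len(pvalues); a larger value gives a more conservative adjustment.
    """
    m = len(pvalues)
    n = m if n_tests is None else n_tests
    if n < m:
        raise ValueError("n_tests must be at least the number of p-values")
    # Walk from the largest p-value to the smallest, keeping a running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: same per-cell expression, fewer target cells in treatment.
control = {"T1": 150_000, "T2": 50_000, "O1": 200_000, "HK": 600_000}
treatment = {"T1": 75_000, "T2": 25_000, "O1": 300_000, "HK": 600_000}
subset = ["T1", "T2"]  # genes assumed specific to the cell type of interest

ctrl = subset_cpm(control, subset)
trt = subset_cpm(treatment, subset)
subset_fold = {g: trt[g] / ctrl[g] for g in subset}

# Placeholder p-values for three subset genes, corrected for 30 expressed genes.
pvals = [0.01, 0.04, 0.03]
fdr_subset_only = bh_adjust(pvals)
fdr_conservative = bh_adjust(pvals, n_tests=30)
print(subset_fold, fdr_subset_only, fdr_conservative)
```

With the pseudo-library size, the composition-driven drop in the subset genes disappears in this toy case. The same rescaling would also remove any genuine shift that affects the whole subset in the same direction, which is part of why I am unsure whether the approach is sound.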

Any advice would be greatly appreciated!

Thanks

Scott

RNA-Seq R • 2.6k views

I was wondering the same (although using DESeq instead). Have you found an answer elsewhere?
