Question

Comparing lists of differentially expressed genes

1

7.8 years ago

ra381 ▴ 10

I've been calculating differential expression for two separate groups, both of which have baseline and treatment expression measurements. I have tested baseline against treatment within each group and now have 2 lists of differentially expressed genes. Differentially expressed genes were identified with edgeR, with appropriate correction for FDR.

It's an interesting question to compare the lists of differentially expressed genes and I can identify genes significantly up- and down-regulated in both groups. I can also show that the overlap in these lists is significant using Fisher's exact test.
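For reference, the overlap test mentioned above can be sketched as a one-sided Fisher's exact test, which is equivalent to a hypergeometric tail probability. This is only a minimal Python sketch and not part of the edgeR workflow: the function name `overlap_p_value` and the counts in the example are made up for illustration, and it assumes the gene universe is the set of genes that were actually tested in both groups.

```python
from math import comb


def overlap_p_value(n_universe, n_list1, n_list2, n_overlap):
    """One-sided Fisher's exact test for the overlap of two gene lists.

    Returns P(X >= n_overlap), where X is the number of shared genes when
    n_list2 genes are drawn at random from a universe of n_universe genes,
    n_list1 of which belong to the first list (hypergeometric distribution).
    """
    total = comb(n_universe, n_list2)
    largest_possible = min(n_list1, n_list2)
    tail = sum(
        comb(n_list1, k) * comb(n_universe - n_list1, n_list2 - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes tested, 5 DE in each group, 3 DE in both.
p = overlap_p_value(n_universe=20, n_list1=5, n_list2=5, n_overlap=3)
print(f"P(overlap >= 3) = {p:.4f}")
```

The choice of universe matters here: counting all annotated genes instead of only the genes that passed filtering and were tested would likely make the overlap look more significant than it is.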

However, when I look at genes that are significantly differentially expressed in only one group, do I need to do any further analysis or test to ensure the difference is "real"? For example, one gene could be significantly DE in one group, close to the significance cutoff (e.g. P = 0.049), but just miss the cutoff in the second group (e.g. P = 0.051). I've been unable to find anything about this online so far.

RNA-Seq R edgeR differential expression • 5.3k views

ADD COMMENT • link updated 2.7 years ago by antass ▴ 30 • written 7.8 years ago by ra381 ▴ 10

1

7.8 years ago

EagleEye 7.5k

Gene ontology comparison between up- and down-regulated genes

ADD COMMENT • link 7.8 years ago by EagleEye 7.5k

score 3 · Accepted Answer · 2016-07-08

3

7.8 years ago

Devon Ryan 104k

I'd encourage you to use GSEA or a similar rank-based method for comparisons rather than choosing some p-value cut-off and comparing lists.

ADD COMMENT • link 7.8 years ago by Devon Ryan 104k

0

Thanks for the quick and interesting answer. I've had a look at a few methods including GSEA and RRHO (http://nar.oxfordjournals.org/content/38/17/e169) and I think I'll give this a go. Do you think it would be appropriate to just rank my 2 lists of genes (group1 and group2) by logFC and run them through RRHO?

ADD REPLY • link 7.8 years ago by ra381 ▴ 10

0

I forget exactly how it works, but somewhere out there is a method describing how to combine fold-change and p-value, since they're not perfectly correlated.

ADD REPLY • link 7.8 years ago by Devon Ryan 104k

0

This is an old thread, but a good rule of thumb for ranking genes for use in GSEA is to use this formula:

sign(log fold change) * -log10(unadjusted p-value)
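As a rough illustration of that formula, here is a minimal Python sketch. The function names `gsea_rank_score` and `rank_genes`, the gene names, and the numbers are all made up for the example; rejecting p-values of exactly 0 is also an assumption of this sketch, since -log10(0) is undefined and such values would need separate handling.

```python
import math


def gsea_rank_score(log_fc, p_value):
    """Return sign(log fold change) * -log10(unadjusted p-value)."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    # sign is +1 for up-regulated, -1 for down-regulated, 0 for no change
    sign = (log_fc > 0) - (log_fc < 0)
    return sign * -math.log10(p_value)


def rank_genes(results):
    """Sort genes from most up-regulated to most down-regulated.

    `results` maps gene name -> (log fold change, unadjusted p-value).
    Returns a list of (gene, score) pairs, highest score first.
    """
    scored = {
        gene: gsea_rank_score(log_fc, p_value)
        for gene, (log_fc, p_value) in results.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results table: gene -> (logFC, unadjusted p-value)
example = {
    "geneA": (2.0, 0.001),
    "geneB": (-1.5, 0.0001),
    "geneC": (0.3, 0.5),
}
for gene, score in rank_genes(example):
    print(f"{gene}\t{score:.3f}")
```

Strongly up-regulated genes end up at the top of the list and strongly down-regulated genes at the bottom, which is the ordering a pre-ranked GSEA run expects.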

ADD REPLY • link 4.5 years ago by antass ▴ 30

0

Hi, I am just taking the opportunity of this old thread. My question is based on @antass's reply. Is it just the multiplication of the "sign" of the log fold change and the -log10(unadjusted p-value)? Thanks in advance.

ADD REPLY • link 3.8 years ago by n.naharfancy ▴ 10

0

Sorry, another year had passed! Yes, the sign is just the "direction" of the fold change, so that part of the formula ultimately becomes either -1 or 1. The formula doesn't take into consideration the magnitude of the fold change.

ADD REPLY • link 2.7 years ago by antass ▴ 30