Question: best criteria to find DE genes in RNA-seq analysis
2.4 years ago, hougiotaejut wrote:

Hi

I know I know I know that my question may seem silly to you. And I know I'm not as professional as you are, but please note that I HAVE RESEARCHED TO FIND THE BEST ANSWER. I don't know if my keywords weren't appropriate, because the more I searched on Google, and specifically on Biostars, the more confused I felt.

I have a list of p-values, log2 fold changes, and standard deviations (SD) of the log2 fold changes for each gene in a time-course RNA-seq study. I'm going to find the DE genes based on the info I have been provided. According to my research, some say that we should use adjusted p-values to find the DEGs, which means I should use the command "p.adjust" in R to adjust my current p-values. But I have also seen that some people use FDR and FC. On top of that, everyone recommends a different criterion for finding DEGs. One says, "find DEG with adjusted p-values<0.1". Another recommends using adjusted p-values<0.05.

I'm confused about the following:

1- which one is the best solution to find DEG? P-values, adjusted p-values, FDR, FC? I'm not sure whether FDR and FC could be used to find DEG. I just repeat what I have read.

2- whatever you recommend for question 1, what cutoff do you recommend? 0.05? 0.1? etc.

3- if P-value (or its adjusted value) is enough, what is log2FoldChange and its SD good for? why is it provided beside P-values by the package I'm using?

I apologize if my question is so basic and may bother you. And thanks for your help.

modified 2.4 years ago by Devon Ryan • written 2.4 years ago by hougiotaejut

adj.P < 0.05 is typical; you shouldn't need to adjust the p-values if this is done by the package you are using to call differential expression, be it DESeq2 or edgeR or limma or whatever. log2 FC will give you the expected direction of change. Some people filter their DEGs based on |log2_FC| > 0.5 or something but this is usually unnecessary in my experiments.
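
As a minimal sketch of the point above, the following base-R snippet calls DE genes at adj.P < 0.05 and reads the direction of change from the sign of the log2 fold change. The table `res`, its column names, and all values are made up for illustration; the adjustment step only runs when no adjusted column is already present.

```r
# Toy results table; gene names and values are invented for illustration.
res <- data.frame(
  gene   = paste0("gene", 1:6),
  pvalue = c(0.0001, 0.004, 0.03, 0.2, 0.0005, 0.6),
  log2FC = c(1.8, -0.9, 0.3, -0.1, -2.1, 0.05),
  stringsAsFactors = FALSE
)

# Adjust only when the DE package has not already supplied adjusted p-values.
if (!"padj" %in% names(res)) {
  res$padj <- p.adjust(res$pvalue, method = "BH")
}

# Call DE genes at the typical cutoff and label the direction of change.
res$is_de <- res$padj < 0.05
res$direction <- ifelse(!res$is_de, "not DE",
                        ifelse(res$log2FC > 0, "up", "down"))
table(res$direction)
```

If the package already reports an adjusted column under another name, that column would be used directly instead of `padj`.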

written 2.4 years ago by russhh

Thank you so much for the info. The package gives me the p-values, not the adjusted ones, so I think I have to adjust them, since you say adjusted p-values are typical. Is Bonferroni adjustment appropriate?

modified 2.4 years ago • written 2.4 years ago by hougiotaejut

Hi, Just for your information, the FDR is an adjusted pvalue. Bonferroni is a bit more stringent than FDR but at this point it shouldn't really matter which method you use.

written 2.4 years ago by Carlo Yague

Oh, I get it now. FDR is short for FDR-adjusted p-value? I spotted an option in the "p.adjust" command in R that can use the method "fdr" to adjust p-values. Good point. Thank you so much.
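
A quick check of that option, as a small sketch with made-up p-values: in base R the method name is lowercase "fdr", and it is an alias for the Benjamini-Hochberg ("BH") adjustment.

```r
# Made-up p-values, for illustration only.
p <- c(0.001, 0.02, 0.03, 0.4)

padj_fdr <- p.adjust(p, method = "fdr")  # note the lowercase method name
padj_bh  <- p.adjust(p, method = "BH")

# Both calls give the same adjusted values.
same_result <- identical(padj_fdr, padj_bh)
same_result
```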

modified 2.4 years ago • written 2.4 years ago by hougiotaejut

You could think of the cutoff for an adjusted p-value (of 0.05 or 0.1) as a cost function. It depends on what you want to do with the data downstream, and how "expensive" it is to tolerate false positive findings.
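
One rough way to put numbers on that cost, sketched with invented adjusted p-values: under FDR control, the expected share of false positives among the called genes is at most about the cutoff, so cutoff times the number of calls gives an approximate upper bound on how many false findings to budget for.

```r
# Invented adjusted p-values, for illustration only.
padj <- c(0.001, 0.01, 0.02, 0.04, 0.06, 0.08, 0.09, 0.3, 0.7)

cutoffs  <- c(0.05, 0.1)
n_called <- sapply(cutoffs, function(a) sum(padj < a))

# Approximate upper bound on the expected number of false positives.
expected_false <- cutoffs * n_called

data.frame(cutoff = cutoffs, n_called = n_called, expected_false = expected_false)
```

With a real gene list the same arithmetic scales up: a looser cutoff buys more calls at the price of more expected false ones.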

written 2.4 years ago by WouterDeCoster

2.4 years ago, Devon Ryan (Freiburg, Germany) wrote:
  1. Adjusted p-value, further filtering by fold-change if needed (i.e., if you have way too many results to handle or there are far too many with small fold-changes).
  2. Either 0.1 or 0.05, depending on what you want to do next, how much noise you can tolerate, and the number of significant genes you're getting.
  3. A log2FC of 0.1 (as an example) is unlikely to be biologically relevant, so it can be useful to remove results that won't be useful.
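
The two-stage filtering in points 1 and 3 could look like the sketch below. The table, the column names, and the fold-change cutoff of 0.5 are assumptions chosen for illustration, not values recommended in this answer.

```r
# Toy table with already-adjusted p-values; all values are invented.
res <- data.frame(
  gene   = paste0("gene", 1:5),
  padj   = c(0.001, 0.03, 0.04, 0.08, 0.5),
  log2FC = c(2.0, 0.1, -1.2, 0.7, 3.0),
  stringsAsFactors = FALSE
)

padj_cutoff <- 0.05  # or 0.1, depending on how much noise is tolerable
lfc_cutoff  <- 0.5   # hypothetical minimum |log2FC| of interest

# Step 1: significance filter on the adjusted p-value.
sig <- res[res$padj < padj_cutoff, ]

# Step 2 (optional): drop significant genes with very small fold changes.
deg <- sig[abs(sig$log2FC) >= lfc_cutoff, ]
deg$gene
```

Here gene2 passes the significance filter but is dropped in step 2 because its log2FC is only 0.1.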
written 2.4 years ago by Devon Ryan

Another useful way of doing this (log2 FC filtering) is to change the null hypothesis from "log2 fold changes are equal to zero" to a threshold such as 0.1 or 0.2, i.e., to test whether the absolute log2 fold change exceeds that threshold. This takes care of point 3 in a statistical manner. I know this can be done in DESeq2, but I'm not sure about other packages.

DESeq2::results(object = dds, lfcThreshold = 0.2)
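
For a table that only has log2 fold changes and their standard errors, the same idea can be approximated in base R. This is a rough sketch, not DESeq2's implementation: the function name `lfc_threshold_pvalue` is hypothetical, it assumes the reported SD is the standard error of the log2 fold change, and it relies on a normal approximation.

```r
# Hypothetical helper: two-sided Wald-type test of |log2FC| > threshold.
# Assumes `se` is the standard error of the log2 fold change.
lfc_threshold_pvalue <- function(lfc, se, threshold = 0.2) {
  z <- (abs(lfc) - threshold) / se
  pmin(1, 2 * pnorm(z, lower.tail = FALSE))
}

# Invented values, for illustration only.
lfc <- c(0.1, 0.25, 1.5, -2.0)
se  <- c(0.05, 0.1, 0.3, 0.4)

p_thresh    <- lfc_threshold_pvalue(lfc, se, threshold = 0.2)
padj_thresh <- p.adjust(p_thresh, method = "BH")
```

With `threshold = 0` this reduces to the usual two-sided test of "log2 fold change equals zero".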

written 2.4 years ago by poisonAlien

Thank you so much. I need to compare two models to check which one is more capable of detecting DE genes. So I think adjusted p-values are enough, as you say, right? If I have understood you correctly, further filtering by fold change when there are too many results is for answering a biological question. In my case, comparing two DEG detection models, the adjusted p-value is enough. Am I right? And is Bonferroni adjustment appropriate?

modified 2.4 years ago • written 2.4 years ago by hougiotaejut

If you're comparing two models "to see which one is more capable of detecting DE genes", you have to be careful about false positives and overall accuracy. You could make a terrible model that just reports p=0.0001 for every second gene, and it would "be more capable of detecting DE genes" than any accurate system. Try to make a known-truth list of genes first.
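
A small sketch of such a comparison, assuming a known-truth list is available (for example from simulated data). The gene names, the truth set, and the helper `evaluate_calls` are all hypothetical.

```r
# Hypothetical helper: score a set of DE calls against a known-truth list.
evaluate_calls <- function(called, truth) {
  tp <- length(intersect(called, truth))
  fp <- length(setdiff(called, truth))
  c(n_called = length(called),
    sensitivity = tp / length(truth),
    false_discovery_prop = if (length(called) > 0) fp / length(called) else 0)
}

genes <- paste0("gene", 1:10)
truth <- c("gene2", "gene4", "gene5")  # invented known-truth set

careful_calls <- c("gene2", "gene5")                     # a cautious model
every_second  <- genes[seq(2, length(genes), by = 2)]    # the "terrible" model

scores <- rbind(careful      = evaluate_calls(careful_calls, truth),
                every_second = evaluate_calls(every_second, truth))
scores
```

In this toy case the model that flags every second gene reports more DE genes and matches the careful model's sensitivity, but most of its calls are false, which is why the number of detected genes alone is not a fair comparison.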

written 2.4 years ago by karl.stamm

What karl.stamm wrote. Regarding Bonferroni correction, it's overly conservative. The BH method in p.adjust() (method = "BH") is superior; it has to be requested explicitly, since the function's default method is "holm".
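
To see the difference on a small scale, here is a sketch with invented p-values. The method is passed explicitly in both calls, because p.adjust() falls back to "holm" when no method is given.

```r
# Invented p-values, for illustration only.
p <- c(0.001, 0.008, 0.012, 0.02, 0.03, 0.2, 0.5, 0.9)

padj_bonf <- p.adjust(p, method = "bonferroni")
padj_bh   <- p.adjust(p, method = "BH")

# Number of genes called at the 0.05 level under each adjustment.
n_bonf <- sum(padj_bonf < 0.05)
n_bh   <- sum(padj_bh < 0.05)
c(bonferroni = n_bonf, BH = n_bh)
```

On these values Bonferroni keeps a single gene, while BH keeps five.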

written 2.4 years ago by Devon Ryan