log2 fold change in RNA-seq analysis
2.5 years ago
wmsalsah ▴ 10

Hi everyone. In my RNA-seq analysis with DESeq2, I chose adjusted p-value < 0.05 and log2 fold change > 1 to find differentially expressed genes. My question is: choosing log2 fold change > 1 gives the up-regulated genes, so what about the down-regulated genes? Do I need to specify log2 fold change < -1 as well?

fold log2 change • 7.9k views

Good day everyone,

Sorry, I found this a bit confusing and would appreciate some clarification on the calculation. I am using DESeq. My desired fold change is >2 (for up-regulated DEGs) and <-2 (for down-regulated DEGs). On the x axis, log2(fold change) is plotted with cutoffs at -1 and 1. For up-regulated genes, log2(2) = 1. But for down-regulated genes, log2(0.5) = -1.

Does my fold change cutoff for down-regulated DEGs now switch to 0.5 instead of 2? If I enter log2(-2), I get an error...


Thank you for your response. I apologize if my understanding is incomplete. I would like to ask: if I have a fold change of -2, how can I calculate the log2 fold change correctly? I understand that simply taking the log2 of a negative number will result in an error.


There is no such thing as a fold change of -2. Fold change is

expression_in_condition_A/expression_in_condition_B

As expression is always positive, this ratio is always positive.

Perhaps what you mean by a fold change of -2 is a halving of expression between treatment and control (i.e. expression is twice as high in control as it is in treatment)? In this case the fold change is 0.5, not -2.
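To make the arithmetic concrete, here is a minimal Python sketch. The expression values and variable names are made up for illustration, not real data. It shows that the ratio itself is always positive and that only its log2 can be negative.

```python
import math

# Hypothetical normalized expression values for one gene (assumed for illustration).
control = 200.0
treated = 100.0  # expression halves under treatment

# Fold change is a ratio of two positive numbers, so it is always positive.
fold_change = treated / control

# The log2 fold change is negative whenever the ratio is below 1.
log2_fold_change = math.log2(fold_change)

print(fold_change)       # 0.5
print(log2_fold_change)  # -1.0
```

So a "two-fold down-regulation" is a fold change of 0.5 and a log2 fold change of -1.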


He means that the result of the differential expression analysis (DEA) will contain DEGs for A vs B. The log2 fold changes (LFCs) are in one column: a positive LFC means the gene is more highly expressed in A, and a negative LFC means it is more highly expressed in B.


Thank you for your prompt reply Sir! I think I finally understand the concept. Please correct me if I'm wrong:

Let's say we are comparing the expression of gene A in treated to control samples:

A fold change of 2 means that gene A expression is twice as high in the treated sample compared to the control. [ 2 fold up-regulated in treated ]

A fold change of 0.5 means that gene A expression is half as much in the treated sample compared to the control. [ 2 fold down-regulated in treated ]

I was confused by the fold change cutoff stated here, which is given as greater than 2 and less than -2.


Your understanding is correct. The highlighted sentence in the attachment is wrong.


Thank you so much for the confirmation Sir! Appreciate that.

I think the paragraph was trying to say that:

a fold change greater than 2 indicates up-regulated genes [log2(2) = 1],

while a fold change less than 1 (e.g. 0.5) indicates down-regulated genes [log2(0.5) = -1]

2.5 years ago

It depends on how you are specifying the log2FoldChange threshold. If you are passing it to the "lfcThreshold" argument of the DESeq2 results function, then no, you don't need to explicitly pass -1 as well, because results uses it to test |log2(fold change)| > 1.

However, if you are manually filtering the output table, then yes, you will need to look for both log2FoldChange > 1 and log2FoldChange < -1, or simply abs(log2FoldChange) > 1.
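As a rough illustration of the manual filter, here is a small Python sketch. The rows are made-up stand-ins for a results table (the gene names and numbers are assumptions, and `filter_degs` is a hypothetical helper, not part of DESeq2). The point is only that the absolute value keeps both directions, and the sign then tells you which direction each gene went.

```python
# Hypothetical results rows (gene names and numbers are made up for illustration).
results = [
    {"gene": "geneA", "log2FoldChange": 2.3, "padj": 0.001},
    {"gene": "geneB", "log2FoldChange": -1.7, "padj": 0.01},
    {"gene": "geneC", "log2FoldChange": 0.4, "padj": 0.0005},
    {"gene": "geneD", "log2FoldChange": -3.0, "padj": 0.2},
]


def filter_degs(rows, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Keep rows that pass both cutoffs, regardless of direction."""
    return [
        r for r in rows
        if abs(r["log2FoldChange"]) > lfc_cutoff and r["padj"] < padj_cutoff
    ]


degs = filter_degs(results)

# The sign of log2FoldChange separates up- from down-regulated genes.
up = [r["gene"] for r in degs if r["log2FoldChange"] > 0]
down = [r["gene"] for r in degs if r["log2FoldChange"] < 0]

print(up)    # ['geneA']
print(down)  # ['geneB']
```

Here geneC fails the fold change cutoff and geneD fails the adjusted p-value cutoff, so neither is kept.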


Thank you i.sudbery, it's very helpful. I used abs(log2FoldChange) on the DESeq2 output. I just wonder, why abs(log2FoldChange) > 1 and not 2, for example?


A log2FoldChange of 2 means a fold change of 4 because log2(4) = 2. That means you're testing if a gene's expression is 4 times greater in your treatment compared to your control.

A log2FoldChange of 1 means a fold change of 2 because log2(2) = 1. That means you're testing if a gene's expression is 2 times greater in your treatment compared to your control.

Same thing in the other direction: A log2FoldChange of -2 means a fold change of 1/4 because log2(1/4) = -2, therefore you're testing if a gene's expression in your treatment is 1/4th that of your control.

With that in mind, whether you want to test for a 4-fold change or a 2-fold change is up to you.
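If it helps, the conversion is just 2 raised to the log2 fold change. A quick Python sketch (the function name is my own, purely for illustration):

```python
def log2fc_to_fold_change(log2fc):
    """Convert a log2 fold change back to a plain fold change (a ratio)."""
    return 2.0 ** log2fc


for lfc in (2, 1, -1, -2):
    print(lfc, log2fc_to_fold_change(lfc))
# 2 4.0
# 1 2.0
# -1 0.5
# -2 0.25
```

A cutoff of abs(log2FoldChange) > 1 therefore keeps genes that at least double or at least halve, while a cutoff of 2 keeps only those that change at least 4-fold in either direction.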


Yep, it really is just a judgement call, balancing what you know about the biological system, with what you want to do with the resulting gene list and how many genes you get at each threshold.

2.5 years ago

As an addendum to Ian's excellent answer, you should really utilize the lfcThreshold argument if you want to do this rather than post hoc filtering the results, as the latter is essentially rendering your p-values meaningless by changing the effect size cutoff. Utilizing the lfcThreshold argument gracefully takes this into account, altering what's actually being tested and adjusting p-values accordingly.

So in short, using it changes the question (null hypothesis) being tested from "Are these genes significantly differentially expressed, assuming a difference of 0 between groups?" to "Are these genes significantly differentially expressed, beyond a two-fold difference between groups?"
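To illustrate why the p-values change, here is a simplified Python sketch of a Wald-type test against a threshold. It assumes a normal approximation for the test statistic, the function names and example numbers are hypothetical, and it is meant to show the idea rather than reproduce DESeq2's actual code.

```python
import math


def upper_tail_normal(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def thresholded_pvalue(lfc, se, threshold=0.0):
    """Two-sided p-value for the null hypothesis |LFC| <= threshold.

    lfc is the estimated log2 fold change and se its standard error.
    """
    z = (abs(lfc) - threshold) / se
    return min(1.0, 2.0 * upper_tail_normal(z))


# Made-up estimate: log2 fold change of 1.5 with standard error 0.2.
p_zero = thresholded_pvalue(1.5, 0.2, threshold=0.0)
p_thresh = thresholded_pvalue(1.5, 0.2, threshold=1.0)

# An estimate below the threshold gives no evidence against the null.
p_below = thresholded_pvalue(0.8, 0.2, threshold=1.0)

print(p_zero < p_thresh)   # True
print(round(p_thresh, 4))  # 0.0124
print(p_below)             # 1.0
```

The same estimate is far less significant once the null is "no more than two-fold" rather than "no change at all", which is why filtering on fold change after testing against zero does not give p-values for the question you actually care about.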