Question: best value of lfc threshold

0

rthapa • **0** wrote:

What is the best value to assign for lfc threshold while using DESeq2 package? With 1 as lfc threshold, I got more than 3000 upregulated genes. Any suggestion please? Thanks

modified 9 months ago by Kevin Blighe ♦ **30k** • written 9 months ago by rthapa • **0**

3

Kevin Blighe ♦ **30k** wrote:

In DESeq2, the 'lfc' values are on the log (base 2) scale, i.e. they are log2 fold changes (log2fc).

This is an open-ended question. Ask 100 people and you'll get very different answers.

- Log2fc of 1 is equivalent to linear fold change of 2
- Log2fc of 2 is equivalent to linear fold change of 4
- Log2fc of 3 is equivalent to linear fold change of 8
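For anyone who wants to check the conversions listed above, here is a minimal sketch of the arithmetic. It is plain Python rather than R, and the function names are made up for illustration; nothing here comes from DESeq2 itself.

```python
import math


def log2fc_to_fold(log2fc: float) -> float:
    """Convert a log2 fold change to a linear fold change."""
    return 2.0 ** log2fc


def fold_to_log2fc(fold: float) -> float:
    """Convert a (positive) linear fold change to a log2 fold change."""
    return math.log2(fold)


# Negative log2fc values correspond to linear fold changes below 1.
for lfc in (1, 2, 3, -1):
    print(f"log2fc {lfc:+d} -> linear fold change {log2fc_to_fold(lfc)}")
```

Note the symmetry on the log scale: a log2fc of -1 is a halving, in the same way that +1 is a doubling.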

Each person appears to choose a cut-off value that relates to whatever the first trusted person in their career told them. The mistake that these people then make is in rigidly adhering to this cut-off and in thinking that it's the only answer. In some cases, people do not use any cut-off for fold-change at all: they just use adjusted P-values (Q values) and then rank the statistically significant genes by fold-change. As I recall, the first trusted voice in my own career told me: '*FDR Q<0.05 and absolute log2fc>2*', but that was during a time when RNA-seq was not even available.
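The two strategies described above (adjusted P-value plus a fold-change cut-off, or adjusted P-value only with ranking by fold-change) can be sketched as a simple post-hoc filter. This is only an illustration in plain Python: the `Gene` record, the `select_genes` helper and the toy values are all invented here, and the field names are assumptions rather than anything exported by DESeq2.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    log2fc: float  # log2 fold change
    padj: float    # adjusted P-value (FDR Q value)


def select_genes(genes, padj_cutoff=0.05, lfc_cutoff=0.0):
    """Keep genes with padj below the cut-off and |log2fc| above the cut-off,
    ranked by absolute log2fc (largest first).

    With lfc_cutoff=0.0 this is effectively the 'adjusted P-value only, then
    rank by fold-change' strategy.
    """
    kept = [
        g for g in genes
        if g.padj < padj_cutoff and abs(g.log2fc) > lfc_cutoff
    ]
    return sorted(kept, key=lambda g: abs(g.log2fc), reverse=True)


# Toy values, invented for illustration only.
toy = [
    Gene("geneA", 2.6, 0.001),
    Gene("geneB", -1.2, 0.01),
    Gene("geneC", 0.4, 0.03),
    Gene("geneD", 3.1, 0.20),
]

for cutoff in (0.0, 1.0, 2.0):
    names = [g.name for g in select_genes(toy, lfc_cutoff=cutoff)]
    print(f"|log2fc| > {cutoff}: {names}")
```

Keep in mind that filtering on the estimated log2fc after the fact is not the same thing as the `lfcThreshold` argument of DESeq2's `results()`, which tests against the threshold instead of filtering on the point estimate.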

There really is no answer, though, and it depends on many factors, including:

- the normalisation type (with FPKM/RPKM, unrealistically large log2fc values will be observed; with quantile normalisation, or geometric normalisation as used in DESeq2, log2fc values will be lower than with FPKM values and will be balanced between negative and positive fold-changes)
- how many genes you want to include for downstream analysis
- previous literature on how many transcripts to expect in the kind of comparison that you're conducting
- the adjusted P value cut-off that you are using. For example, even at FDR Q<0.05 and log2fc=2, many of the transcripts will not be that much different when you visualise the normalised counts between your comparisons (this comment only has validity in certain experimental setups, though)
- the variance of your data (high variance = unreliable log2fc values in any setting)
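To make the 'geometric normalisation' mentioned in the first point a little more concrete, here is a rough sketch of the median-of-ratios idea as I understand it: each sample's size factor is the median, over genes, of its count divided by that gene's geometric mean across samples. This is a pure-Python toy, not DESeq2's code, and the `size_factors` name and the count matrix are invented for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    Genes with a zero in any sample are skipped, because their geometric
    mean is zero. If every gene contains a zero, no ratios remain and
    statistics.median raises StatisticsError.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


# Toy matrix (rows = genes, columns = 2 samples); values invented.
# Sample 2 is sequenced roughly twice as deeply as sample 1.
toy_counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 7],  # contains a zero, so it does not contribute to the factors
]

factors = size_factors(toy_counts)
normalised = [[c / f for c, f in zip(row, factors)] for row in toy_counts]
print("size factors:", factors)
print("normalised counts:", normalised)
```

One design point worth noticing: only genes without zeros contribute, so a dataset with very many zeros leaves few genes to estimate the factors from.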

So, the message? There is absolutely no standard cut-off. Use what is most appropriate for your data and what works best.

Kevin

Sorry, why does the correlation between two samples become twice as high when I perform geometric normalisation on my raw counts? Is there any explanation, please? I calculated the Pearson correlation between two samples before and after normalisation, and the correlation was twice as high for the normalised samples.

1

The correlation value may have changed, but does the statistical significance of the correlation change? Use cor.test to check.

A short answer, too: there are different normalisation methods out there and they will produce data on different distributions. It is logical that statistical inferences from different normalisations will also be different. What you must ensure is that you choose the normalisation strategy that is most suitable for your data.
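As a rough illustration of the kind of check suggested above with cor.test, here is a stdlib-only Python sketch that computes Pearson's r, the usual t statistic, and an approximate confidence interval via the Fisher z-transformation. It does not compute an exact P-value (that needs the t distribution, which cor.test handles for you), and the helper names and toy counts are invented for illustration.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom.
    Requires |r| < 1."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transformation.
    Requires n > 3 and |r| < 1."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Toy counts for two samples (invented), compared raw and on a log2 scale.
sample_a = [5, 12, 30, 80, 200, 1500]
sample_b = [9, 10, 45, 60, 310, 1400]
log_a = [math.log2(c + 1) for c in sample_a]
log_b = [math.log2(c + 1) for c in sample_b]

for label, x, y in (("raw", sample_a, sample_b), ("log2(count+1)", log_a, log_b)):
    r = pearson_r(x, y)
    low, high = fisher_ci(r, len(x))
    print(f"{label}: r={r:.3f}, t={t_statistic(r, len(x)):.2f}, "
          f"95% CI=({low:.3f}, {high:.3f})")
```

One thing this makes easy to verify: Pearson's r does not change when a sample is multiplied by a positive constant, so dividing by a size factor cannot by itself change it, whereas a log or similar transform can.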

You are right. I am dealing with a dataset with too many zeros and many genes with low read counts; on the other hand, the dataset is heterogeneous, being a combination of two datasets with different distributions.
