Question: best value of lfc threshold
3 months ago by
rthapa0 wrote:

What is the best value to assign for the lfc threshold when using the DESeq2 package? With an lfc threshold of 1, I got more than 3000 upregulated genes. Any suggestions, please? Thanks

rna-seq • 204 views
3 months ago by
Kevin Blighe16k
University College London Cancer Institute
Kevin Blighe16k wrote:

In DESeq2, the 'lfc' values are on the log [base 2] scale (log2fc).

This is an open-ended question. Ask 100 people and you'll get very different answers.

  • Log2fc of 1 is equivalent to linear fold change of 2
  • Log2fc of 2 is equivalent to linear fold change of 4
  • Log2fc of 3 is equivalent to linear fold change of 8
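The conversion in the list above is simply 2 raised to the power of the log2 fold change. A minimal Python sketch of that arithmetic (the function name `log2fc_to_linear` is made up for illustration and is not part of DESeq2):

```python
def log2fc_to_linear(log2fc):
    """Convert a log2 fold change to a linear fold change (2 ** log2fc)."""
    return 2.0 ** log2fc


# The three values from the list above, plus one negative value.
for lfc in (1, 2, 3, -1):
    print(f"log2fc {lfc:+d} -> linear fold change {log2fc_to_linear(lfc):g}")
```

A negative log2fc maps to a linear value below 1, so a log2fc of -1 corresponds to 0.5, i.e. a two-fold decrease.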

Each person appears to choose a cut-off value that traces back to whatever the first trusted person in their career told them. The mistake these people then make is in rigidly adhering to this cut-off and thinking that it's the only answer. In some cases, people do not use any fold-change cut-off at all: they filter on adjusted P values (Q values) alone and then rank the statistically significant genes by fold-change. As I recall, the first trusted voice in my own career told me 'FDR Q<0.05 and absolute log2fc>2', but that was at a time when RNA-seq was not even available.
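To make the two strategies concrete, here is a rough Python sketch. It assumes the DESeq2 results table has been exported from R and loaded as a list of rows; the gene names, numbers, and function names are all invented for illustration. Note that this is post-hoc filtering, which is not the same thing as passing `lfcThreshold` to DESeq2's `results()`, since the latter changes the hypothesis being tested.

```python
# Hypothetical rows standing in for an exported DESeq2 results table.
# Gene names and values are invented for illustration only.
results = [
    {"gene": "geneA", "log2fc": 3.1, "padj": 0.001},
    {"gene": "geneB", "log2fc": 0.4, "padj": 0.0004},
    {"gene": "geneC", "log2fc": -2.6, "padj": 0.03},
    {"gene": "geneD", "log2fc": 2.2, "padj": 0.20},
    {"gene": "geneE", "log2fc": -1.2, "padj": None},  # padj may be missing (NA)
]


def significant(rows, alpha=0.05):
    """Keep rows whose adjusted P value is present and below alpha."""
    return [r for r in rows if r["padj"] is not None and r["padj"] < alpha]


def rank_by_fold_change(rows):
    """Sort rows by absolute log2 fold change, largest first."""
    return sorted(rows, key=lambda r: abs(r["log2fc"]), reverse=True)


def filter_both(rows, alpha=0.05, min_abs_log2fc=2.0):
    """Apply an adjusted P value cut-off and an absolute log2fc cut-off."""
    return [r for r in significant(rows, alpha)
            if abs(r["log2fc"]) > min_abs_log2fc]


# Strategy 1: adjusted P value only, then rank by fold change.
ranked = rank_by_fold_change(significant(results))
# Strategy 2: 'FDR Q<0.05 and absolute log2fc>2'.
strict = filter_both(results)

print([r["gene"] for r in ranked])
print([r["gene"] for r in strict])
```

With these made-up rows, the first strategy keeps geneB despite its small fold change, while the second drops it; neither choice is inherently right.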

There really is no single answer, though; it depends on many factors, including:

  • the normalisation type (with FPKM/RPKM, unrealistically large log2fc values will be observed; with quantile normalisation, or geometric normalisation as used in DESeq2 (median of ratios), log2fc values will generally be lower than with FPKM and more balanced between negative and positive fold-changes)
  • how many genes you want to include for downstream analysis
  • what previous literature suggests about how many transcripts to expect in the kind of comparison you're conducting
  • the adjusted P value that you are using as a cut-off. For example, even at FDR Q<0.05 and log2fc=2, many of the transcripts will not look very different when you visualise the normalised counts across your comparison groups (this only holds in certain experimental setups, though)
  • the variance of your data (high variance = unreliable log2fc values in any setting)
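One practical way to weigh the "how many genes" factor is to count how many genes survive at several candidate cut-offs before committing to one. A small Python sketch, using invented (log2fc, padj) pairs rather than real DESeq2 output, with a hypothetical helper `count_passing`:

```python
# Invented (log2fc, padj) pairs; not real DESeq2 output.
genes = [
    (3.4, 0.0001), (2.1, 0.004), (1.6, 0.01),
    (1.2, 0.03), (0.8, 0.002), (-1.1, 0.04),
    (-2.5, 0.0007), (0.3, 0.6), (2.8, 0.09),
]


def count_passing(rows, min_abs_log2fc, alpha=0.05):
    """Count genes with padj < alpha and |log2fc| >= min_abs_log2fc."""
    return sum(1 for lfc, padj in rows
               if padj < alpha and abs(lfc) >= min_abs_log2fc)


# Sweep a few candidate cut-offs and report how many genes remain.
for cutoff in (0.0, 1.0, 1.5, 2.0):
    print(f"|log2fc| >= {cutoff}: {count_passing(genes, cutoff)} genes")
```

Seeing how quickly the count falls as the cut-off rises can help pick a value that leaves a workable gene list for downstream analysis.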

So, the message? There is absolutely no standard cut-off. Use whatever is most appropriate for your data.

