Question

Limma finds no differentially expressed genes

8 months ago

yoser4 ▴ 10

Hi, I am using limma for differential expression analysis of RNA-seq data and have run into a problem: no gene has a significant adjusted p-value (padj < 0.05), even though some genes have extremely high logFC values. (I use TPM values as input.)

However, when I run DESeq on the same samples (using counts), I obtain a set of differentially expressed genes as expected.

I want to know what is causing this situation. All help will be greatly appreciated!

I will attach the limma code I used below.

Limma • 575 views

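library(limma)
library(ggpubr)

# Normalize between samples (applied here to the TPM matrix)
exprSet <- normalizeBetweenArrays(exprSet)
dat <- exprSet

# Two-group design; coefficient 2 is the difference between groups
design <- model.matrix(~ factor(group_list))
fit <- lmFit(dat, design)
fit <- eBayes(fit)

options(digits = 4)
topTable(fit, coef = 2, adjust.method = "BH")

# Boxplot of a single gene's expression by group
bp <- function(g) {
  df <- data.frame(gene = g, stage = group_list)
  p <- ggboxplot(df, x = "stage", y = "gene",
                 color = "stage", palette = "jco",
                 add = "jitter")
  p + stat_compare_means()
}

# Full results table for all genes
deg <- topTable(fit, coef = 2, adjust.method = "BH", number = Inf)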

ADD REPLY • link updated 8 months ago by GenoMax 141k • written 8 months ago by yoser4 ▴ 10

score 0 · Answer 1 · 2023-08-21

8 months ago

ATpoint 82k

There is a lot that is non-standard here. First, if you say TPM then this is most likely RNA-seq, so normalizeBetweenArrays() is an odd choice of normalization since it's meant for arrays, not counts. Second, TPM by itself is not a good choice for DE analysis (there are many threads on this, please search to learn more). Also, limma for RNA-seq with pre-normalized counts typically uses the limma-trend pipeline, see the limma vignette. Since you seem to have counts available, why not just follow either the DESeq2 manual or the limma (limma-voom) manual to do the analysis, rather than the custom approach you use here (which I would say is partly wrong)?
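
For illustration, here is a minimal sketch of the limma-voom route starting from raw counts. It is not the poster's actual data: the `counts` matrix and `group_list` below are simulated stand-ins (assumptions) so that the example runs on its own, and the filtering and normalization steps follow the usual edgeR/limma workflow rather than anything stated in this thread.

```r
library(limma)
library(edgeR)

# Hypothetical inputs, simulated so the sketch is self-contained:
# `counts` is a genes x samples matrix of raw counts,
# `group_list` holds the group label of each sample.
set.seed(1)
counts <- matrix(rnbinom(1000 * 6, mu = 100, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:6)))
group_list <- factor(rep(c("control", "treated"), each = 3))

design <- model.matrix(~ group_list)

# Build the count object, drop lowly expressed genes, compute TMM factors
dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom converts counts to logCPM with precision weights for the linear model
v <- voom(dge, design)
fit <- lmFit(v, design)
fit <- eBayes(fit)

# Coefficient 2 is the treated-vs-control difference
deg <- topTable(fit, coef = 2, adjust.method = "BH", number = Inf)
head(deg)
```

With real data, `counts` and `group_list` would be replaced by the actual count matrix and sample groups. For the limma-trend variant, one would instead compute log-CPM values with `cpm(dge, log = TRUE, prior.count = 3)`, fit with `lmFit()`, and call `eBayes(fit, trend = TRUE)`.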

This is indeed RNA-seq data, as I mentioned earlier. As for why I normalize: I believe that TPM normalizes for the effective length of genes, while the function "normalizeBetweenArrays()" normalizes between samples (eliminating batch effects), so I think the two serve different purposes. Secondly, I carefully reviewed the results of my limma analysis and found that filtering on the raw p-value (< 0.05) does yield differential genes, approximately two thousand of them, whereas filtering on Padj yields nothing. I now suspect that something is making my Padj values abnormal, but I am not good at statistics.

ADD REPLY • link 8 months ago by yoser4 ▴ 10
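
On the p-value versus Padj point in the reply above: passing a raw p < 0.05 filter is not by itself evidence of an abnormal Padj, because with thousands of tests about 5% of genes pass that filter even when nothing is differentially expressed. The sketch below illustrates this with simulated p-values under the assumption that no gene is truly different. The gene count of 20000 is an assumed figure, not taken from the thread, and `p.adjust()` with `method = "BH"` is the same Benjamini-Hochberg adjustment requested in the limma code.

```r
# Simulate p-values for a dataset with no true differences:
# under the null hypothesis, p-values are uniform on [0, 1].
set.seed(42)
n_genes <- 20000  # assumed number of tested genes (illustrative only)
p_raw <- runif(n_genes)

# Benjamini-Hochberg adjustment
p_adj <- p.adjust(p_raw, method = "BH")

n_raw <- sum(p_raw < 0.05)  # genes passing the raw p-value filter
n_bh  <- sum(p_adj < 0.05)  # genes passing the adjusted filter

cat("raw p < 0.05:", n_raw, "\n")
cat("BH padj < 0.05:", n_bh, "\n")
```

Roughly 5% of the simulated genes pass the raw filter by chance alone, while the adjusted filter is far stricter. This does not show that the two thousand genes in the real data are all false positives. It only shows that a large raw-p list alongside an empty Padj list is expected behavior when the test has low power, for example because of an unsuitable input such as normalized TPM.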