limma finds no differentially expressed genes
8 months ago
yoser4 ▴ 10

Hi, I am using limma for differential expression analysis of RNA-seq data and have run into a problem: no gene has a significant adjusted p-value (padj < 0.05), even though some genes have extremely high logFC values. (I use TPM values as input for the analysis.)

When I run DESeq on the same samples (using counts), I obtain a set of differentially expressed genes as expected.

I would like to know what is causing this. Any help will be greatly appreciated!

I will attach the limma code I used below.

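library(limma)
library(ggpubr)

# Between-sample normalization of the TPM matrix
exprSet <- normalizeBetweenArrays(exprSet)
dat <- exprSet

# Design matrix: intercept plus group effect (coefficient 2)
design <- model.matrix(~ factor(group_list))

# Fit gene-wise linear models, then moderate the variances
fit <- lmFit(dat, design)
fit <- eBayes(fit)

options(digits = 4)
topTable(fit, coef = 2, adjust.method = "BH")

# Boxplot of one gene's expression values by group
bp <- function(g) {
  df <- data.frame(gene = g, stage = group_list)
  p <- ggboxplot(df, x = "stage", y = "gene",
                 color = "stage", palette = "jco",
                 add = "jitter")
  p + stat_compare_means()
}

# Full results table for all genes
deg <- topTable(fit, coef = 2, adjust.method = "BH", number = Inf)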
8 months ago
ATpoint 82k

There is a lot that is non-standard here. First, if you say TPM then this is most likely RNA-seq, so normalizeBetweenArrays() is an odd choice of normalization since it is meant for arrays, not counts. Second, TPM by itself is not a good choice for DE analysis (there are many threads on this, please google to learn more). Also, limma for RNA-seq with pre-normalized counts typically uses the limma-trend pipeline; see the limma vignette. Since you seem to have counts available, why not just follow either the DESeq2 manual or the limma (limma-voom) manual rather than the custom approach you use here, which I would say is in part wrong?
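
For reference, a minimal limma-voom sketch starting from raw counts, along the lines of the limma user guide. The simulated `counts` matrix and the two-group `group_list` below are placeholders (assumptions, not the poster's data); the real count matrix and sample labels would take their place.

```r
library(limma)
library(edgeR)

# Placeholder input (assumption): 1000 genes x 6 samples of simulated raw counts
set.seed(1)
counts <- matrix(rnbinom(1000 * 6, mu = 100, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:6)))
group_list <- factor(rep(c("control", "treated"), each = 3))

design <- model.matrix(~ group_list)

# Build the count object, drop lowly expressed genes, compute TMM factors
dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom converts counts to log2-CPM with precision weights for lmFit
v <- voom(dge, design)
fit <- eBayes(lmFit(v, design))

# Coefficient 2 is the group effect; adjusted p-values are in adj.P.Val
deg <- topTable(fit, coef = 2, adjust.method = "BH", number = Inf)
head(deg)
```

The limma-trend route mentioned above would instead pass log2-CPM values to lmFit() and call eBayes(fit, trend = TRUE).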


This is indeed RNA-seq data, as I mentioned earlier. As for why I normalize: I believe that TPM normalizes for the effective length of genes, while the function normalizeBetweenArrays() normalizes between samples (eliminating batch effects), so I think the two serve different purposes. Secondly, I carefully reviewed the results of my limma analysis and found that filtering on the raw p-value (< 0.05) does yield differential genes, approximately two thousand of them, but filtering on padj yields nothing. I now suspect that something is making my padj values abnormal, but I am not good at statistics.
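
A small illustration of how raw p-values below 0.05 can coexist with no significant adjusted p-values. It assumes a hypothetical 20,000 tested genes with no true differences, so that p-values are spread evenly between 0 and 1; the gene count is an assumption, not taken from the data in this thread.

```r
n_genes <- 20000                      # hypothetical number of tested genes
p <- seq_len(n_genes) / n_genes       # evenly spread p-values, as expected with no true DE
padj <- p.adjust(p, method = "BH")    # Benjamini-Hochberg, the default adjustment in topTable()

sum(p < 0.05)       # genes passing the raw p-value cutoff
sum(padj < 0.05)    # genes passing the adjusted cutoff
```

About 5% of genes pass the raw cutoff by chance alone, so a couple of thousand genes with p < 0.05 among tens of thousands tested is not by itself a sign that the adjusted values are abnormal.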
