Question: DE genes and variability between biological replicates: edgeR
10 months ago, mariannapauletto wrote:

Hi guys,

I'm working with DE genes from an edgeR analysis comparing two experimental groups (n=3 per group). I noticed that some DEGs show high variability between the replicates within a group. Does edgeR take this into account? Are these DEGs reliable or not? If not, how can I filter the edgeR output to make it more reliable? Is there specific code in edgeR to cope with this issue?

Thanks for your help!

Best, Marianna

Tags: rna-seq, gene

Do you think that the program takes into account this aspect?

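Yes. Accounting for variability between replicates is a key feature of (basically any) statistical approach, and edgeR does so by estimating a dispersion for each gene.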

Are these DEGs reliable or not?

Can you give a concrete example, including the expression values and p-value of such a gene? Even if the dispersion is higher for certain genes, a gene can still be statistically significant if its expression is decent and the fold change is large.

If not, how to filter the edgeR output for being more reliable?

You can always lower the FDR cutoff to be more conservative, but at the cost of more false negatives.
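As a rough illustration of that trade-off, here is a minimal base R sketch. The p-values are toy numbers invented for this example (they are not from the thread), and the Benjamini-Hochberg adjustment is used because that is the default method behind the FDR column of edgeR's topTags().

```r
# Toy p-values, invented purely for illustration (not taken from the thread)
pvals <- c(0.0001, 0.0004, 0.002, 0.01, 0.03, 0.04, 0.2, 0.5, 0.7, 0.9)

# Benjamini-Hochberg adjusted p-values (what is usually reported as "FDR")
fdr <- p.adjust(pvals, method = "BH")

# Number of genes called significant at a lenient and at a stricter cutoff
n_at_10 <- sum(fdr < 0.10)
n_at_05 <- sum(fdr < 0.05)
print(c(FDR_0.10 = n_at_10, FDR_0.05 = n_at_05))
```

The stricter cutoff can only keep the same number of genes or fewer, which is where the extra false negatives come from.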

Please also share the code you used. Did you use filterByExpr?

modified 10 months ago • written 10 months ago by ATpoint

Sorry, I made a mess with the comments; see the reply below.

modified 10 months ago • written 10 months ago by mariannapauletto

Hi ATpoint,

thank you for your quick reply.

I'm referring to the results of an analysis conducted in the past with an edgeR version that lacked the filterByExpr function. Anyway, I filtered the data with the approach suggested by the manual of that version (see below).

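This is the code:

# keep genes with CPM > 0.5 in at least 3 samples
keep <- rowSums(cpm(dge) > 0.5) >= 3
dge <- dge[keep, , keep.lib.sizes=FALSE]

# normalization factors
dge_norm <- calcNormFactors(dge)
dge_norm$samples

# design without intercept: one column per group
design <- model.matrix(~0+Group)
# assumes Group is a factor; without this the columns are named "GroupTreated.1h" etc.
# and makeContrasts cannot find the names used below
colnames(design) <- levels(Group)

dge2 <- estimateDisp(dge_norm, design)
fit <- glmFit(dge2, design)

my.contrast <- makeContrasts(
  TreatvsControl.1h = Treated.1h - Control.1h,
  TreatvsControl.6h = Treated.6h - Control.6h,
  Interaction = (Treated.1h - Control.1h) - (Treated.6h - Control.6h),
  levels = design)

lrt1 <- glmLRT(fit, contrast=my.contrast[,"TreatvsControl.1h"])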

This is an example of a DEG with a high SD (values are expressed in CPM):

group 1: 0.256892 0.321829 0.06487
group 2: 5.078998 0.367778 1.278966

logFC: 3.28; logCPM: -0.088; LR: 12.41; p-value: 0.00042; FDR: 0.075
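To see how much of this fold change rests on a single replicate, here is a small base R sketch using only the six CPM values quoted in the post. The "naive" log fold change is just log2 of the ratio of group means; it is not expected to match edgeR's 3.28 exactly, because edgeR fits a negative binomial GLM to the counts rather than averaging CPMs.

```r
# CPM values copied from the post
group1 <- c(0.256892, 0.321829, 0.06487)
group2 <- c(5.078998, 0.367778, 1.278966)

# Naive log2 fold change from the group means (not edgeR's GLM estimate)
naive_logFC <- log2(mean(group2) / mean(group1))

# Coefficient of variation (sd / mean) within each group
cv <- c(group1 = sd(group1) / mean(group1),
        group2 = sd(group2) / mean(group2))

# The same fold change after dropping the largest replicate of group 2
logFC_without_max <- log2(mean(group2[-which.max(group2)]) / mean(group1))

print(round(c(naive_logFC = naive_logFC,
              logFC_without_max = logFC_without_max), 2))
print(round(cv, 2))
```

If the fold change drops a lot once the largest replicate is removed, the call depends heavily on that one sample, which is a reason to be cautious with a gene at such a low logCPM.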

written 10 months ago by mariannapauletto

Since I typically filter for FDR < 0.05, it would not be significant in my eyes. The logCPM is also quite low. I would probably not trust it. If you still have the raw counts, I would plug them into the current edgeR version, use filterByExpr and also the quasi-likelihood framework (glmQLFit followed by glmQLFTest), which is (from what I understand) what the developers currently recommend as the default approach. If you google glmLRT vs glmQLF you will find some posts at Bioconductor where they explain why they think it is superior in most cases.
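A minimal sketch of that workflow could look like the following. The counts are simulated and the two group names are placeholders, so only the sequence of edgeR calls is meant to carry over; robust = TRUE is one common choice, not something prescribed in this thread.

```r
library(edgeR)

set.seed(1)
# Simulated counts, purely illustrative: 2000 genes x 6 samples
counts <- matrix(rnbinom(2000 * 6, mu = 50, size = 5), ncol = 6)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))
Group <- factor(rep(c("Control", "Treated"), each = 3))  # placeholder groups

dge <- DGEList(counts = counts, group = Group)
design <- model.matrix(~0 + Group)
colnames(design) <- levels(Group)

# Expression filter that adapts to library sizes and the design
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]

dge <- calcNormFactors(dge)
dge <- estimateDisp(dge, design)

# Quasi-likelihood fit and F-test for Treated vs Control
fit <- glmQLFit(dge, design, robust = TRUE)
contr <- makeContrasts(TreatedVsControl = Treated - Control, levels = design)
qlf <- glmQLFTest(fit, contrast = contr[, "TreatedVsControl"])

# Full result table with logFC, logCPM, F, PValue and FDR
res <- topTags(qlf, n = Inf)$table
head(res)
```

With real data, Group and the contrast would be replaced by the actual experimental groups (for example the 1h and 6h comparisons from the code above).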

modified 10 months ago • written 10 months ago by ATpoint

Thank you for your reply.

That's clear. Actually, with recent datasets I've implemented exactly the same approach you mentioned (filterByExpr and glmQLF).

But in your opinion, looking at the results obtained with the previous version and the code I used (including removal of lowly expressed genes with CPM < 0.5, i.e. about 10-15 reads), is there any reason to conclude that the significant genes (FC > 1.5, FDR < 10%) are not reliable?
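For reference, the link between a CPM threshold and raw reads is just CPM times library size divided by one million. In this small sketch the library sizes of 20 and 30 million reads are assumptions, chosen because they reproduce the "about 10-15 reads" mentioned above; cpm_to_count is a helper made up for this example, not an edgeR function.

```r
# Hypothetical helper: raw count corresponding to a given CPM value
cpm_to_count <- function(cpm, lib_size) cpm * lib_size / 1e6

# Assumed library sizes (20 and 30 million reads), not stated in the thread
lib_sizes <- c(20e6, 30e6)
print(cpm_to_count(0.5, lib_sizes))
```

The same CPM cutoff therefore corresponds to fewer reads in smaller libraries, which is one reason filterByExpr derives its CPM cutoff from the library sizes.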

Yes, I get the point about the FDR cutoff, but a looser cutoff only increases the expected proportion of false positives among the DEGs; it does not by itself question the reliability of any single DEG, does it?

Thank you very much for this fruitful discussion!

Best

Marianna

written 10 months ago by mariannapauletto

mariannapauletto : Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

written 10 months ago by genomax

Thank you,

sorry for that

written 10 months ago by mariannapauletto