Question: DE genes and variability between biological replicates: edgeR
5 weeks ago, mariannapauletto wrote:

Hi guys,

I'm working with DE genes from an edgeR analysis comparing two experimental groups (n=3 per group). I noticed that some DEGs show high variability between replicates within a group. Does the program take this into account? Are these DEGs reliable or not? If not, how can I filter the edgeR output to make it more reliable? Is there specific code in edgeR to deal with this issue?

Thanks for your help!

Best Marianna

rna-seq gene • 114 views

Do you think that the program takes into account this aspect?

Yes, this is a key feature of (basically any) statistical approach.

Are these DEGs reliable or not?

Can you give a concrete example including the expression values and p-values of such a gene? Even if the dispersion is high for a given gene, it can still be statistically significant when its expression level is decent and the fold change is large.

If not, how to filter the edgeR output for being more reliable?

You can always lower the FDR cutoff to be more conservative, but at the cost of more false negatives.
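As a rough illustration of that trade-off, here is a minimal base-R sketch. The p-values are simulated and have nothing to do with the data in this thread; `n_null`, `n_true`, the beta distribution, and the two cutoffs are all illustrative assumptions. `p.adjust(method = "BH")` is the Benjamini-Hochberg adjustment, the same method edgeR's `topTags` uses by default for its FDR column.

```r
set.seed(1)

# Hypothetical mix: 9000 genes with no effect, 1000 with a real effect
n_null <- 9000
n_true <- 1000
p_null <- runif(n_null)           # null p-values are uniform on [0, 1]
p_true <- rbeta(n_true, 0.1, 5)   # p-values of real effects pile up near 0
p <- c(p_null, p_true)
is_true <- rep(c(FALSE, TRUE), c(n_null, n_true))

# Benjamini-Hochberg adjusted p-values
fdr <- p.adjust(p, method = "BH")

# Count calls, false positives and missed true effects at a given cutoff
summarise_cutoff <- function(cutoff) {
  called <- fdr < cutoff
  c(cutoff    = cutoff,
    called    = sum(called),
    false_pos = sum(called & !is_true),
    missed    = sum(!called & is_true))
}

res <- rbind(summarise_cutoff(0.10), summarise_cutoff(0.05))
print(res)
```

The stricter cutoff can never call more genes than the looser one, so false positives go down while missed true effects go up.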

Please also share the code you used. Did you use filterByExpr?

modified 5 weeks ago • written 5 weeks ago by ATpoint

Sorry, I made a mess with the comments, see the reply below.

modified 5 weeks ago • written 5 weeks ago by mariannapauletto

Hi ATpoint,

thank you for your quick reply.

I'm talking about the results of an analysis run in the past with an edgeR version that did not yet have the filterByExpr function. I filtered the data with the approach suggested in the manual of that version (see below).

This is the code:

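library(edgeR)

# Keep genes with CPM > 0.5 in at least 3 libraries
keep <- rowSums(cpm(dge) > 0.5) >= 3
dge <- dge[keep, , keep.lib.sizes = FALSE]

# TMM normalization factors
dge_norm <- calcNormFactors(dge)
dge_norm$samples

# One coefficient per group, no intercept
design <- model.matrix(~ 0 + Group)
# Needed unless the columns were already renamed: model.matrix() names them
# "GroupTreated.1h" etc., which would not match the contrasts below.
# Assumes Group is a factor.
colnames(design) <- levels(Group)

dge2 <- estimateDisp(dge_norm, design)
fit <- glmFit(dge2, design)

my.contrast <- makeContrasts(
  TreatvsControl.1h = Treated.1h - Control.1h,
  TreatvsControl.6h = Treated.6h - Control.6h,
  Interaction = (Treated.1h - Control.1h) - (Treated.6h - Control.6h),
  levels = design)

# Likelihood ratio test for the 1h comparison
lrt1 <- glmLRT(fit, contrast = my.contrast[, "TreatvsControl.1h"])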

This is an example of a DEG with a high SD (values are expressed in CPM):

group 1: 0.256892 0.321829 0.06487
group 2: 5.078998 0.367778 1.278966
logFC: 3.28; logCPM: -0.088; LR: 12.41; p-val: 0.00042; FDR: 0.075
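To make the within-group spread of this gene concrete, here is a small base-R sketch using only the six CPM values quoted above. The "naive" log2 ratio of group means is an illustrative assumption and is not how edgeR computes logFC (edgeR fits a negative binomial GLM on counts), so it will not reproduce 3.28 exactly.

```r
# CPM values as posted for the example gene
g1 <- c(0.256892, 0.321829, 0.06487)
g2 <- c(5.078998, 0.367778, 1.278966)

# Coefficient of variation (SD relative to the mean) per group
cv <- function(x) sd(x) / mean(x)
cv_g1 <- cv(g1)
cv_g2 <- cv(g2)

# Naive log2 ratio of mean CPM, with all replicates ...
naive_logfc <- log2(mean(g2) / mean(g1))

# ... and after dropping the single highest replicate of group 2
naive_logfc_drop <- log2(mean(g2[-which.max(g2)]) / mean(g1))

print(c(cv_g1 = cv_g1, cv_g2 = cv_g2,
        naive_logfc = naive_logfc,
        naive_logfc_drop = naive_logfc_drop))
```

In group 2 the SD is larger than the mean, and removing the one high replicate lowers the naive log2 ratio from about 3.4 to about 1.9, so a single library drives much of the apparent effect.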

written 5 weeks ago by mariannapauletto

Since I typically filter for FDR < 0.05, it would not be significant in my eyes. The logCPM is also quite low, so I would probably not trust it. If you still have the raw counts, I would plug them into the current edgeR version and use filterByExpr together with the quasi-likelihood framework (glmQLFit/glmQLFTest), which is (from what I understand) what the developers currently recommend as the default approach. If you google glmLRT vs glmQLF you will find some posts at Bioconductor where they explain why they think it is superior in most cases.
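A minimal sketch of that workflow, assuming a current edgeR installation. The counts are simulated with no true differences, and `n_genes`, the group labels, and the negative binomial parameters are illustrative assumptions rather than anything from this thread; replace `counts` and `group` with the real raw count matrix and sample groups.

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

set.seed(42)

# Hypothetical data: 2000 genes, 3 control and 3 treated libraries
n_genes <- 2000
group <- factor(rep(c("Control", "Treated"), each = 3))
mu <- rexp(n_genes, rate = 1 / 200) + 1   # assumed per-gene mean counts
counts <- matrix(rnbinom(n_genes * 6, mu = mu, size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("s", 1:6)))

# One coefficient per group, named after the group levels
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

y <- DGEList(counts = counts, group = group)

# Expression filter that takes library sizes and the design into account
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes = FALSE]

y <- calcNormFactors(y)        # TMM normalization
y <- estimateDisp(y, design)   # negative binomial dispersions

# Quasi-likelihood fit and F-test for Treated vs Control
fit <- glmQLFit(y, design, robust = TRUE)
con <- makeContrasts(Treated - Control, levels = design)
qlf <- glmQLFTest(fit, contrast = con)

tab <- topTags(qlf, n = Inf)$table
head(tab)
```

The quasi-likelihood F-test accounts for the uncertainty in each gene's dispersion estimate, which matters most with few replicates such as n=3 per group.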

modified 5 weeks ago • written 5 weeks ago by ATpoint

Thank you for your reply.

That's clear. Actually, with recent datasets I've used exactly the approach you mentioned (filterByExpr and glmQLF).

But in your opinion, looking at the results obtained with the previous version and the code I used (including filtering out low expression, CPM < 0.5 --> about 10-15 reads), is there any reason to conclude that the significant genes (FC > 1.5, FDR < 10%) are not reliable?

Yes, I get the point about the FDR cutoff, but a looser cutoff only increases the expected proportion of false positives; it does not call the reliability of any single DEG into question, does it?

Thank you very much for this fruitful discussion!

Best

Marianna

written 5 weeks ago by mariannapauletto

mariannapauletto : Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

written 5 weeks ago by genomax

Thank you,

sorry for that

written 5 weeks ago by mariannapauletto