Question: EdgeR (TMM): Samples with outlier but still show extremely low p-value and FDR
2.5 years ago by
Joe30 wrote:

Please see my data here:

These are not raw counts but values normalized by edgeR. I listed the genes with the highest fold change and found that one sample is clearly an outlier (highlighted in yellow), which causes the high fold change. (If I remove this outlier, the fold change is only about 2-fold.) I am surprised that the p-value and FDR are both extremely small even with an outlier.

Is this a common issue when using edgeR for differential expression?

If it is a real issue, how can I find outliers when I have a large set of samples (e.g., >100 samples) to analyze?

We usually use DESeq2 for DE; DESeq2 can identify outliers and report NA for the p-value.
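
To make the "how do I find outliers across many samples" question concrete, here is a minimal sketch of one simple screen: a robust z-score per gene, based on the median and the median absolute deviation (MAD) of log2-CPM values within one condition. This is an illustration only, and it is not how DESeq2 flags outliers (DESeq2 uses Cook's distance). The function name `flag_outliers`, the `threshold` and `min_mad` defaults, and the example values are all made up for this sketch.

```python
from statistics import median


def flag_outliers(log_cpm_by_gene, threshold=5.0, min_mad=0.1):
    """Map each gene to the sample indices that sit far from the gene median.

    log_cpm_by_gene: dict of gene name -> list of log2-CPM values for ONE condition.
    threshold and min_mad are illustrative defaults, not recommended settings.
    """
    flagged = {}
    for gene, values in log_cpm_by_gene.items():
        center = median(values)
        # Median absolute deviation, scaled to be comparable to a standard deviation.
        mad = 1.4826 * median(abs(v - center) for v in values)
        # Floor the spread so near-identical replicates do not flag tiny differences.
        spread = max(mad, min_mad)
        flagged[gene] = [
            i for i, v in enumerate(values)
            if abs(v - center) / spread > threshold
        ]
    return flagged


# Made-up log2-CPM values; the last sample of gene_a is the planted outlier.
example = {
    "gene_a": [5.0, 5.2, 4.9, 5.1, 12.0],
    "gene_b": [5.0, 5.2, 4.9, 5.1, 5.3],
}
print(flag_outliers(example))
```

With only three or four replicates per condition the MAD is a noisy estimate, so a screen like this is more plausible for the >100-sample case than for small designs.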


modified 2.2 years ago by digrigor0 • written 2.5 years ago by Joe30


Did you resolve your problem?

I see similar behaviour with edgeR. If I have one outlier among my four biological replicates, the program calls the gene DE. I don't understand why this happens, but it seems to be common.

I am thinking of changing to DESeq2.

written 2.4 years ago by vm.higareda20

Hi swbarnes2,

I have exactly the same issue here. Genes that have an outlier value in one of the compared conditions are called DE by edgeR (small p-value and large abs(logFC)), and I am trying to figure out why.

So I calculated the log2 fold change from the mean CPM values of the compared conditions and found that it is similar to the one calculated by edgeR.

So edgeR's logFC is similar to log2(meanCPMa/meanCPMb), with the only difference that it is adjusted so that genes with low counts do not usually get a large abs(logFC). Maybe it would be useful to calculate the logFC from the CPM medians, which is less sensitive to outlier samples. However, if we use that, what's the point of using edgeR at all?
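
The mean-versus-median comparison described above can be sketched in a few lines. This is only an illustration with made-up CPM values: the helper `log_fc` and its `prior` pseudo-count are assumptions for the example, and edgeR computes its own moderated logFC differently.

```python
from math import log2
from statistics import mean, median


def log_fc(cpm_a, cpm_b, summary=mean, prior=1.0):
    """log2 fold change of condition A over B using the given summary statistic.

    prior is a hypothetical pseudo-count added to both summaries so that
    low-count genes do not produce extreme ratios.
    """
    return log2((summary(cpm_a) + prior) / (summary(cpm_b) + prior))


# Made-up CPM values: the last replicate of condition A is the outlier.
cpm_a = [10.0, 12.0, 11.0, 200.0]
cpm_b = [5.0, 6.0, 5.5, 5.5]

print("mean-based logFC:  ", round(log_fc(cpm_a, cpm_b, mean), 2))
print("median-based logFC:", round(log_fc(cpm_a, cpm_b, median), 2))
```

In this toy case the single outlier pushes the mean-based logFC above 3, while the median-based value stays near 1, which matches the "only 2-fold without the outlier" observation in the question.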

It would be really helpful if you could tell us what you eventually did. Did you find any further solution? Did you switch to DESeq2?


written 2.2 years ago by digrigor0

Please use the "ADD COMMENT" button to add comments.

written 2.2 years ago by Devon Ryan95k
2.4 years ago by
United States
swbarnes27.9k wrote:

A tiny p-value means that the software is very sure the difference between the groups is real. It has nothing at all to do with how large the difference itself is.

written 2.4 years ago by swbarnes27.9k

But even if you see that one replicate is an outlier, as in your example? Would you trust that gene?

written 2.4 years ago by vm.higareda20