Question: Differential analysis shows different results with edgeR and in a box plot with a t-test
newbie70 wrote:

Hi,

I have a dataset with 159 tumor and 113 normal samples. I ran a differential expression analysis with edgeR and selected differentially expressed genes using fold change > 2 and FDR < 0.05 (Tumors vs Normal). From these I selected the upregulated genes based on positive logFC. Among the upregulated genes is FAP, a gene I'm interested in.

So FAP is an upregulated gene in Tumors compared to Normal samples.

But when I plotted the expression (logCPM) of FAP in Tumors and Normal samples, the p-value is significant, yet the plot shows higher expression in the Normal samples. Here is the box plot.

[Image: box plot of FAP logCPM, Tumors vs Normals]

Why is this gene upregulated in Tumors according to edgeR, while the box plot shows higher expression in Normals? Why do the two analyses differ? Is anything wrong?

P.S. I calculated logCPM after filtering out lowly expressed genes.

modified 6 months ago by Biostar ♦♦ 20 • written 16 months ago by newbie70
Benn8.0k wrote:

A few comments. First, edgeR does not test logCPM values; it tests counts. These counts are normalized within the model (model-based normalization), so what really happens there is somewhat behind the scenes. Second, you state that normal is higher in your boxplot, but you are referring to the median values, right? Did you also calculate the means?

written 16 months ago by Benn8.0k
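
To make the counts-versus-logCPM point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the counts are invented (not the poster's data), every sample is given the same hypothetical library size, `log_cpm` uses a plain `log2(CPM + 1)` rather than the exact formula behind edgeR's `cpm(log=TRUE)`, and the log2 ratio of group mean CPMs is only a rough stand-in for the logFC that edgeR estimates from its negative binomial model. It shows that a count-scale comparison and a log-scale comparison of the same gene can give quite different effect sizes when a few samples have very high expression.

```python
import math
from statistics import mean

LIB_SIZE = 10_000_000  # hypothetical: same sequencing depth for every sample

# Hypothetical raw counts for one gene (invented, not the poster's data).
normal_counts = [70, 80, 90, 80, 80, 70, 90, 80]
tumor_counts = [40, 50, 50, 60, 50, 1200, 1500, 900]


def cpm(count, lib_size=LIB_SIZE):
    """Counts per million for one sample."""
    return count / lib_size * 1e6


def log_cpm(count, lib_size=LIB_SIZE, pseudo=1.0):
    """log2(CPM + pseudo): a simplified stand-in, not edgeR's exact formula."""
    return math.log2(cpm(count, lib_size) + pseudo)


# Count-scale view: log2 ratio of the group mean CPMs.
count_scale_lfc = math.log2(
    mean([cpm(c) for c in tumor_counts]) / mean([cpm(c) for c in normal_counts])
)

# Log-scale view: difference between the group mean logCPMs.
log_scale_diff = mean([log_cpm(c) for c in tumor_counts]) - mean(
    [log_cpm(c) for c in normal_counts]
)

print(f"log2(mean CPM tumor / mean CPM normal): {count_scale_lfc:.2f}")
print(f"mean logCPM tumor - mean logCPM normal: {log_scale_diff:.2f}")
```

In this toy data both numbers are positive, but the count-scale value is much larger because the few high-expressing tumors dominate the mean before the log is taken.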

Yes, edgeR is not using logCPM. And yes, I was referring to the median values when I said that expression is higher in Normals than in Tumors. Is that not the right way to decide which is higher or lower? No, I didn't calculate the mean.

written 16 months ago by newbie70

Judging from your boxplot, I think the mean would be higher in tumor than in normal; that's my point.

written 16 months ago by Benn8.0k

Oh yes, I see the mean of Tumors is higher than that of Normals.

# A tibble: 2 x 4
  Type    count  mean    sd
  <chr>   <int> <dbl> <dbl>
1 Normals   113  3.08  1.26
2 Tumors    159  3.90  3.03

But in box plots, which group is higher or lower is usually judged by the median, right? Or am I wrong?

written 16 months ago by newbie70
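
The mean-versus-median question can be illustrated with a small Python sketch. The logCPM values below are invented for illustration (they are not the poster's data); the tumor group is deliberately right-skewed, with most samples low and a few very high. The summary adds a median column next to the count, mean and sd reported above.

```python
from statistics import mean, median, stdev

# Hypothetical logCPM values (invented): the tumor group is right-skewed.
groups = {
    "Normals": [2.9, 3.0, 3.1, 3.2, 3.0, 3.1, 2.8, 3.3],
    "Tumors": [1.8, 2.0, 2.2, 2.4, 2.1, 8.5, 9.0, 7.6],
}

# Per-group summary: sample count, mean, median and sample standard deviation.
summary = {
    name: {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),
    }
    for name, values in groups.items()
}

for name, s in summary.items():
    print(
        f"{name:8s} n={s['count']} mean={s['mean']:.2f} "
        f"median={s['median']:.2f} sd={s['sd']:.2f}"
    )
```

In this toy data the tumor mean is higher than the normal mean while the tumor median is lower, so a box plot (which draws the median) and a mean-based comparison point in opposite directions. The larger sd in the tumor group is a hint that a few high values are pulling the mean up.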

The black horizontal bar in the middle of the box is the median. edgeR does not use boxplots for its analysis.

written 16 months ago by Benn8.0k

Yes, of course edgeR doesn't use boxplots. But when you run a t-test, make a boxplot, and the medians look like those in the plot above, do you use the mean or the median to say which group is higher?

written 16 months ago by newbie70

The t-test uses mean values, see here.

written 16 months ago by Benn8.0k
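
As a worked example of the t-test being a comparison of means, the Python sketch below computes a t statistic directly from the group summary posted in this thread (count, mean, sd for Tumors and Normals). The thread does not say which t-test variant was used; Welch's unequal-variance version is assumed here because the two standard deviations differ considerably. The function name `welch_t` is made up for this sketch.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1**2 / n1  # squared standard error of the first group mean
    v2 = sd2**2 / n2  # squared standard error of the second group mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Summary statistics from the table in this thread: Tumors first, then Normals.
t, df = welch_t(3.90, 3.03, 159, 3.08, 1.26, 113)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Only means, standard deviations and sample sizes enter the calculation; the medians drawn in the box plot play no role. The sign of `t` follows the difference of means, so a positive value here corresponds to the higher mean in Tumors.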