edgeR sample values
7.8 years ago
cecilin92 ▴ 10

Could anybody help me explain the values below? What I know is that they come from the results of the edgeR package and are used for the logFC calculation. I am new to R, and thank you very much!

id               logFC        logCPM       PValue       FDR          A1        A2        A3        B1       B2       B3
c82096_g4_i5     -9.07576     5.512818     2.87E-54     4.46E-49     50.32     38.61     29.09     0        0        0.18
c76837_g1_i2     -12.4923     4.185135     5.83E-50     4.53E-45     12.22     11.85     11.96     0        0        0
c74136_g1_i4     -12.7555     4.456949     3.30E-49     1.71E-44     29.37     18.52     16.64     0        0        0
c82993_g3_i5     12.43865     4.169207     5.75E-47     2.23E-42     0         0         0         3.09     2.95     2.46

R RNA-Seq • 1.9k views

I'm guessing that someone just appended the normalized counts on the end of the results. The last few columns aren't standard output.


BTW, A1, A2 and A3 represent the three replicates. If the highlighted values are 0, how can we explain the logFC values? Thanks!


The fold-changes don't seem to correlate in any way with the highlighted values. Given that the data frame produced by topTags (edgeR's counterpart to limma's topTable) is usually sorted, my guess is that the counts weren't sorted to correspond.


Yes, I had sorted the values into the wrong columns; sorry for my carelessness. This is the edgeR table from my professor, who asked me to explain these numbers. However, after looking through the edgeR tutorial, I still find it hard to say how these numbers are calculated. What I need is to explain how these highlighted values are turned into logFC.

7.8 years ago

The updated numbers make more sense, and fold-changes like those you posted look reasonable. I'm more familiar with DESeq2, but in general these numbers are the result of maximum likelihood estimation after shrinking the variance. Once a fold-change approaches infinity, the standard error of the estimate starts skyrocketing, so you can treat any log2 fold-change greater than 6 or 7 (i.e., a fold change of roughly 100 or more) as "near infinite", at least unless both groups have some meaningful (i.e., non-zero) counts.

BTW, you'd likely get smaller fold-changes with DESeq2, since the large values you're seeing are partly due to having only 3 samples per group (i.e., they're not that reliable and would thus get shrunken a good bit, though they'd still be quite large).

Edit: I probably should have written "profile likelihood" above instead of "maximum likelihood", since I expect dispersion is just factored out.
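
To make the "approaching infinity" point concrete, here is a rough sketch in plain Python (used only for the arithmetic, not as a replacement for edgeR). It takes the A and B columns from the table in the question and computes log2(mean(B) / mean(A)) twice: once naively, and once with a small offset added to both means. The offset value `PRIOR` and both helper functions are hypothetical, picked purely for illustration. As far as I know, edgeR applies a small prior count for a similar reason, but it works on raw counts with library sizes and estimated dispersions, so these numbers are not expected to reproduce the logFC column.

```python
import math

# Values copied from the table in the question (groups A and B, three replicates each).
rows = {
    "c82096_g4_i5": ([50.32, 38.61, 29.09], [0, 0, 0.18]),
    "c76837_g1_i2": ([12.22, 11.85, 11.96], [0, 0, 0]),
    "c74136_g1_i4": ([29.37, 18.52, 16.64], [0, 0, 0]),
    "c82993_g3_i5": ([0, 0, 0], [3.09, 2.95, 2.46]),
}

PRIOR = 0.125  # hypothetical small offset, chosen only for illustration


def mean(values):
    return sum(values) / len(values)


def naive_log2fc(a, b):
    """log2(mean(B) / mean(A)) with no offset; infinite when one group is all zero."""
    mean_a, mean_b = mean(a), mean(b)
    if mean_a == 0 and mean_b == 0:
        return float("nan")
    if mean_a == 0:
        return math.inf
    if mean_b == 0:
        return -math.inf
    return math.log2(mean_b / mean_a)


def offset_log2fc(a, b, prior=PRIOR):
    """Same ratio, but with a small offset on both means so the result stays finite."""
    return math.log2((mean(b) + prior) / (mean(a) + prior))


for gene, (a, b) in rows.items():
    print(f"{gene}: naive={naive_log2fc(a, b):.2f}  offset={offset_log2fc(a, b):.2f}")

# The "6 or 7" rule of thumb expressed as plain fold changes.
print(2 ** 6, 2 ** 7)  # 64 128
```

The rows where one group is all zeros have an infinite naive ratio, and only the offset makes them finite, which is why the exact magnitude of such large logFC values should not be over-interpreted. The sign still tells you the direction: negative means higher in A, positive means higher in B.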


So I might describe these values as profile likelihood estimates. Thanks a lot!