Calculate fold change in edgeR with one sample per condition
2
1
Entering edit mode
22 months ago

Hi,

We have run a pilot RNA-Seq study with one sample per condition, this is just a test run. I understand there is no valid statistical test in this case, however just curious to obtain differential expression through edgeR package in R assuming dispersion = 0.4 for the human data. I have a normal (baseline) sample followed by 5 different samples. When I run the edgeR package, I want to indicate my normal sample (as baseline), however, I am unsure what sample here is taken as a baseline for calculation of fold change.

FC calculation

FC = Normal/Test_1 (OR)
FC = Test_1/Normal


Samples

Normal (baseline) = Normal
Test (Treated) = Test_1


Data

dput(df_data)
structure(list(Normal = c(0L, 184L, 60L, 0L, 7L, 0L, 87L, 0L,
0L, 21L, 193L, 29L, 0L, 0L, 3L, 50L), Test_1 = c(0, 140.5, 64,
0, 4, 0, 83, 0, 1, 51.5, 199, 25, 0, 0, 5, 62)), class = "data.frame", row.names = c("Gene1",
"Gene2", "Gene3", "Gene4", "Gene5", "Gene6", "Gene7", "Gene8",
"Gene9", "Gene10", "Gene11", "Gene12", "Gene13", "Gene14", "Gene15",
"Gene16"))

structure(list(SampleID = c("xxxx1", "xxxx2"), CoreLabID = c("Normal",
"Test_1")), class = "data.frame", row.names = c("Normal", "Test_1"
))


Here is the code that I am running

bcv <- 0.4
y <- DGEList(counts=df_data, group=df_metadata$CoreLabID) et <- exactTest(y, dispersion=bcv^2)  View(et$table)
structure(list(logFC = c(0, -0.67280976110796, -0.190706878123648,
0, -1.06592239047733, 0), logCPM = c(0.456451013758882, 6.84518828528986,
5.46338499556895, 0.456451013758882, 2.37389406911164, 0.456451013758882
), PValue = c(1, 0.433579402199822, 0.851984371429117, 1, 0.542580328250669,
1)), row.names = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5",
"Gene6"), class = "data.frame")

View(et$comparison) c("Test_1", "Normal")  Thank you, Toufiq R edgeR expression rna-seq differential • 980 views ADD COMMENT 2 Entering edit mode 19 months ago Gordon Smyth ★ 5.6k edgeR has compared Test_1 to Normal so, yes, the normal sample has been taken as the baseline. et$comparison records the numerator and denominator for the fold-change respectively.

You can find this out for yourself by reading the help page for exactTest, help(exactTest). See the documentation for the pair argument.

If you use the standard edgeR pipeline

topTags(et)


then the results will be presented in a easy to read form and the output will tell you which sample has been compared to which.

0
Entering edit mode

Gordon Smyth

Thank you very much for the quick response and suggestions. This is very helpful.

1
Entering edit mode
22 months ago

dge <- DGEList(counts)
dge <- calcNormFactors(dge)
my.cpm <- cpm(dge, log = F)
my.cpm.log <- cpm (dge, log = T, prior.count = 2)
allFC <- my.cpm[,1] / my.cpm[,2]
allLogFC <- my.cpm.log[,1] - my.cpm.log[,2]

0
Entering edit mode

@Michael Dondrup, thank you. This looks good!

Along with the fold change, I would also like to output the p-values (as additional column). In my question using edgeR, how does the fold change gets calculated in this package?

bcv <- 0.4
y <- DGEList(counts=df_data, group=df_metadata$CoreLabID) et <- exactTest(y, dispersion=bcv^2) View(et$table)
structure(list(logFC = c(0, -0.67280976110796, -0.190706878123648,
0, -1.06592239047733, 0), logCPM = c(0.456451013758882, 6.84518828528986,
5.46338499556895, 0.456451013758882, 2.37389406911164, 0.456451013758882
), PValue = c(1, 0.433579402199822, 0.851984371429117, 1, 0.542580328250669,
1)), row.names = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5",
"Gene6"), class = "data.frame")

View(et\$comparison)
c("Test_1", "Normal")

0
Entering edit mode

This is not the same as the edgeR logFC calculation.