Hi,
We have run a pilot RNA-Seq study with one sample per condition; this is just a test run. I understand there is no valid statistical test in this case, but I am curious to obtain differential expression with the edgeR package in R, assuming a BCV of 0.4 (dispersion = 0.4^2) for the human data. I have a normal (baseline) sample and 5 different test samples. When I run edgeR, I want my normal sample to be the baseline, but I am unsure which sample is actually taken as the baseline when the fold change is calculated.
FC calculation
FC = Normal/Test_1 (OR) FC = Test_1/Normal
Samples
Normal (baseline) = Normal
Test (Treated) = Test_1
Data
dput(df_data)
structure(list(Normal = c(0L, 184L, 60L, 0L, 7L, 0L, 87L, 0L,
0L, 21L, 193L, 29L, 0L, 0L, 3L, 50L), Test_1 = c(0, 140.5, 64,
0, 4, 0, 83, 0, 1, 51.5, 199, 25, 0, 0, 5, 62)), class = "data.frame", row.names = c("Gene1",
"Gene2", "Gene3", "Gene4", "Gene5", "Gene6", "Gene7", "Gene8",
"Gene9", "Gene10", "Gene11", "Gene12", "Gene13", "Gene14", "Gene15",
"Gene16"))
dput(df_metadata)
structure(list(SampleID = c("xxxx1", "xxxx2"), CoreLabID = c("Normal",
"Test_1")), class = "data.frame", row.names = c("Normal", "Test_1"
))
Here is the code that I am running:
library(edgeR)
bcv <- 0.4  # assumed biological coefficient of variation (no replicates)
y <- DGEList(counts=df_data, group=df_metadata$CoreLabID)
et <- exactTest(y, dispersion=bcv^2)  # dispersion = BCV^2
View(et$table)
structure(list(logFC = c(0, -0.67280976110796, -0.190706878123648,
0, -1.06592239047733, 0), logCPM = c(0.456451013758882, 6.84518828528986,
5.46338499556895, 0.456451013758882, 2.37389406911164, 0.456451013758882
), PValue = c(1, 0.433579402199822, 0.851984371429117, 1, 0.542580328250669,
1)), row.names = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5",
"Gene6"), class = "data.frame")
View(et$comparison)
c("Test_1", "Normal")
Thank you,
Toufiq
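One way to remove the ambiguity about the baseline is to state it explicitly rather than rely on the default ordering of the groups. The following is a minimal sketch, not a definitive recipe: it assumes the counts posted above, and the names `counts`, `group`, `cpm_naive` and `naive_logFC` are illustrative only. According to the `exactTest` documentation, the first group named in `pair` is treated as the baseline, so `pair = c("Normal", "Test_1")` should report logFC as Test_1 relative to Normal. The edgeR calls are guarded so that the sketch still runs where edgeR is not installed.

```r
# Counts copied from the post (Test_1 holds non-integer values as posted).
counts <- data.frame(
  Normal = c(0, 184, 60, 0, 7, 0, 87, 0, 0, 21, 193, 29, 0, 0, 3, 50),
  Test_1 = c(0, 140.5, 64, 0, 4, 0, 83, 0, 1, 51.5, 199, 25, 0, 0, 5, 62),
  row.names = paste0("Gene", 1:16)
)

# Make the baseline the first factor level explicitly.
group <- factor(c("Normal", "Test_1"), levels = c("Normal", "Test_1"))

# Naive direction check with counts per million (illustration only,
# not edgeR's estimate): log2(Test_1 / Normal).
cpm_naive <- sweep(as.matrix(counts), 2, colSums(counts), "/") * 1e6
naive_logFC <- log2(cpm_naive[, "Test_1"] / cpm_naive[, "Normal"])
print(naive_logFC["Gene2"])  # negative: Gene2 is lower in Test_1 than in Normal

# The same comparison in edgeR, with the baseline named first in `pair`.
if (requireNamespace("edgeR", quietly = TRUE)) {
  y <- edgeR::DGEList(counts = counts, group = group)
  et <- edgeR::exactTest(y, pair = c("Normal", "Test_1"), dispersion = 0.4^2)
  print(et$comparison)   # the two groups compared, baseline first
  print(head(et$table))
}
```

Comparing the sign of the naive ratio for a clearly changed gene with the sign of `et$table$logFC` is a quick sanity check on which direction the fold change is reported in.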
Gordon Smyth
Thank you very much for the quick response and suggestions. This is very helpful.