Entering edit mode
4.7 years ago
Rob
▴
170
Hi friends I use DESeq2 for differential expression analysis for HTSeq read count data with about 20000 genes. my genes have ENSEMBL ID. I have 44 patients (2 groups each with 22 patients) The result shows p-adjust all 0.9. Why is that? How can I fix it?
The code I use is this:
## Reading in raw data
rdata <- read.table("my_data.txt", header = TRUE, row.names = 1)
library(DESeq2)
## Create metadata
sample_org <- data.frame(row.names = colnames(rdata), c(rep("0", 22), rep("1", 22)))
colnames(sample_org) <- c("Group")
dds <- DESeqDataSetFromMatrix(countData = rdata,
colData = sample_org,
design = ~Group)
dd <- DESeq(dds)
res <- results(dd)
write.csv(res,"res.csv")
Also in this line of code I get this warning message from Rstudio but I do not think it affects my result:
dds <- DESeqDataSetFromMatrix(countData = rdata,
colData = sample_org,
design = ~Group)
Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
some variables in design formula are characters, converting to factors
Thanks Michael In your code I got no sig genes for the res but for the res.example I got around 7000 genes sig FDR<0.05. I did not understand why? Can I use this result or this is just example and not real result for my data? I did not understand why I have no sig genes in res.
res.example is synthetic simulated data, you should get DEG's with the settings. I have just introduced this as a quick test that the setup is ok because I don't have your data. If you don't get any gene with FDR < 0.05 that could mean there aren't any, at least none DESeq2 can detect. Certainly, this is disappointing because of the large number of samples. Most current methods are still being developed towards small sample sizes.
Reasons could be that the overall biological variability is too high, the effect size is extremely small (like comparing an ineffective drug treated group with a placebo group), up- vs. down-regulation is imbalanced or the groups are not properly set up. You could also try different methods such as edgeR.rb. Have a look at review papers like this one: https://doi.org/10.1371/journal.pone.0232271 and try to select a method with higher power on large sample size.
Still, the most likely explanation remains that you got no DEG's because there simply aren't any.