DESeq2 for Differential expression Ensembl gene ID
1
0
Entering edit mode
4.7 years ago
Rob ▴ 170

Hi friends I use DESeq2 for differential expression analysis for HTSeq read count data with about 20000 genes. my genes have ENSEMBL ID. I have 44 patients (2 groups each with 22 patients) The result shows p-adjust all 0.9. Why is that? How can I fix it?

The code I use is this:

## Reading in raw data 
rdata <- read.table("my_data.txt", header = TRUE, row.names = 1)

library(DESeq2)

## Create metadata 
sample_org <- data.frame(row.names = colnames(rdata), c(rep("0", 22), rep("1", 22)))
colnames(sample_org) <- c("Group")

dds <- DESeqDataSetFromMatrix(countData = rdata,
                              colData = sample_org,
                             design = ~Group)

dd <- DESeq(dds)
res <- results(dd)
write.csv(res,"res.csv")

Also in this line of code I get this warning message from Rstudio but I do not think it affects my result:

dds <- DESeqDataSetFromMatrix(countData = rdata,
                              colData = sample_org,
                             design = ~Group)


Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  some variables in design formula are characters, converting to factors
RNA-Seq • 1.1k views
ADD COMMENT
0
Entering edit mode
4.7 years ago
Michael 55k

Could you try the following shorthand code:

# generate some dummy data to check if the setup is sane 
dds.example <- makeExampleDESeqDataSet(n=nrow(rdata), m=44, betaSD = 1)


## Create metadata 
groups <- factor(c(rep("0", 22), rep("1", 22)))

dds <- DESeqDataSetFromMatrix(countData = rdata,
                              colData = DataFrame(groups), design = ~ groups)
dds.example <- DESeqDataSetFromMatrix(countData = counts(dds.example),
                              colData = DataFrame(groups), design = ~ groups)


dds <- DESeq(dds)
dds.example <- DESeq(dds.example)

res <- results(dds, contrast=c("groups", "0","1"))
res.example <- results(dds.example, contrast=c("groups", "0","1"))

print (paste ( "Found ", sum(res$padj < 0.05), "significant in my experimental data" ))
print (paste ( "Found ", sum(res.example$padj < 0.05), "significant in the synthetic data at dispersion = 1" ))

If there are still no significant genes, then you might have to try something else. There are cases where there is no significant effect.

ADD COMMENT
0
Entering edit mode

Thanks Michael In your code I got no sig genes for the res but for the res.example I got around 7000 genes sig FDR<0.05. I did not understand why? Can I use this result or this is just example and not real result for my data? I did not understand why I have no sig genes in res.

ADD REPLY
0
Entering edit mode

res.example is synthetic simulated data, you should get DEG's with the settings. I have just introduced this as a quick test that the setup is ok because I don't have your data. If you don't get any gene with FDR < 0.05 that could mean there aren't any, at least none DESeq2 can detect. Certainly, this is disappointing because of the large number of samples. Most current methods are still being developed towards small sample sizes.

Reasons could be that the overall biological variability is too high, the effect size is extremely small (like comparing an ineffective drug treated group with a placebo group), up- vs. down-regulation is imbalanced or the groups are not properly set up. You could also try different methods such as edgeR.rb. Have a look at review papers like this one: https://doi.org/10.1371/journal.pone.0232271 and try to select a method with higher power on large sample size.

Still, the most likely explanation remains that you got no DEG's because there simply aren't any.

ADD REPLY

Login before adding your answer.

Traffic: 6900 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6