DESeq2 differential expression: is there a limit on the number of genes?
3.7 years ago
Rob ▴ 170

Hi friends, how many genes can be used for differential expression analysis with DESeq2? I have 60,000 genes, but it did not give me results for all of them. What should I do?

rna-seq • 1.2k views

What result are you expecting? DESeq2 will give you a list of differentially expressed genes and statistics on them.


No, DESeq2 did not work for 60 thousand genes. For some genes the results are NA, and for the other genes there is no significant adjusted p-value.


To follow up on RamRS's comment, I think it would be more helpful if you provided more information regarding your DESeq2 run (i.e. samples, groups, design formula). You can try to check this section of the DESeq2 vignette regarding the NA values.
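
As a rough guide to what that vignette section covers: DESeq2 reports NA for a few distinct reasons, and they can usually be told apart by which columns are NA. The base-R sketch below tallies them on a small made-up table standing in for `as.data.frame(res)`. The helper name `classify_na` and the toy values are illustrative assumptions, not part of DESeq2.

```r
# Toy stand-in for as.data.frame(res); values are made up for illustration.
res_df <- data.frame(
  baseMean = c(0, 1.2, 250, 480, 35),
  pvalue   = c(NA, 0.40, NA, 0.001, 0.20),
  padj     = c(NA, NA, NA, 0.004, 0.60),
  row.names = paste0("gene", 1:5)
)

# Hypothetical helper: label each gene by the most likely reason for its NA.
classify_na <- function(df) {
  reason <- rep("tested", nrow(df))
  # Only padj is NA: gene was removed by independent filtering
  reason[is.na(df$padj) & !is.na(df$pvalue)] <- "independent filtering (low mean count)"
  # pvalue is NA although the gene has counts: flagged as a count outlier
  reason[is.na(df$pvalue) & df$baseMean > 0] <- "count outlier (Cook's distance)"
  # No counts in any sample: nothing to test
  reason[df$baseMean == 0] <- "all counts zero"
  reason
}

res_df$na_reason <- classify_na(res_df)
print(table(res_df$na_reason))
```

If most NAs fall in the first or last category, they come from genes with little or no signal rather than from a limit on the number of genes.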


I used this code and none of the genes were significantly differentially expressed. So I think it cannot calculate differential expression for a large number of genes. Are there any suggestions?

rdata <- read.table("myData.txt", header = TRUE, row.names = 1)

library(DESeq2)

## Differential abundance
alpha <- 0.05 #set the cutoff value

## Create metadata - got this info from the first line of the raw data file
sample_org <- data.frame(row.names = colnames(rdata), c(rep("0", 22), rep("1", 22)))
colnames(sample_org) <- c("Group")

dds <- DESeqDataSetFromMatrix(countData = rdata,
                              colData = sample_org,
                              design = ~Group)

dd <- DESeq(dds)
res <- results(dd)

write.csv(res,"res.csv")

Some quick points:

  • You are not doing LFC shrinkage; see the Quick start and the Log fold change shrinkage for visualization and ranking sections of the vignette
  • Please read the vignette section on NA p-values
  • You create a variable, alpha, but then never use it anywhere. If unsure, leave values in DESeq2 functions at their default
  • Please perform some pre-filtering on your raw counts to remove low-expressed genes (this is not strictly necessary, but these are the very genes most likely to have NA p-values)
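
The pre-filtering point can be sketched in base R as follows. The toy matrix stands in for `rdata`, and the thresholds `min_count` and `min_samples` are assumptions to tune for your own data rather than fixed DESeq2 settings.

```r
# Toy count matrix standing in for rdata: 6 genes x 4 samples (made-up values).
counts_mat <- matrix(
  c(  0,   0,   0,   0,
      1,   0,   2,   0,
     15,  22,  18,  30,
     12,   3,  11,   2,
    500, 430, 610, 550,
      9,  10,   8,  12),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4))
)

min_count   <- 10  # assumed threshold; tune for your data
min_samples <- 2   # assumed: size of the smallest group (22 in this thread)

# Keep genes with at least min_count reads in at least min_samples samples
keep <- rowSums(counts_mat >= min_count) >= min_samples
filtered <- counts_mat[keep, , drop = FALSE]
print(rownames(filtered))
```

On a DESeq2 object the same idea would read `keep <- rowSums(counts(dds) >= 10) >= 22` followed by `dds <- dds[keep, ]`, applied before calling `DESeq()`.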
3.7 years ago
ATpoint 82k

It works on an arbitrary number of genes. I have used it for setups with more than 100,000 regions before. Please read the DESeq2 manual to see why NAs appear in the results. Having non-significant adjusted p-values is expected: the results object contains all genes, not only the significant ones.


Thanks. Yes, non-significant genes are expected, but not for all genes. In my case all genes showed a padj of 0.9, and of course that is wrong. Any solution?


Why is this wrong? You can have no DEGs because there are none between conditions (in biological reality), because your study is underpowered, or because the variation between replicates is too large. In any of these cases this is a totally valid result.
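
To illustrate the first case: if no gene is truly different, the raw p-values are roughly uniform, and Benjamini-Hochberg adjustment (the default method in `results()`) pushes nearly all of them toward 1. Here is a small base-R simulation with made-up p-values and an arbitrary seed, not the poster's data:

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
n_genes <- 60000
p_null  <- runif(n_genes)   # simulated p-values under "no real differences"
padj    <- p.adjust(p_null, method = "BH")

# Compare how many genes pass 0.05 before and after adjustment
n_raw  <- sum(p_null < 0.05)
n_padj <- sum(padj < 0.05)
cat("raw p < 0.05:", n_raw, " padj < 0.05:", n_padj, "\n")
cat("smallest padj:", min(padj), "\n")
```

Around 5% of raw p-values fall below 0.05 by chance alone, which is why the adjusted values are the ones to look at. A padj near 0.9 for every gene matches this no-signal pattern and does not by itself indicate a software problem.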


I would not say this is wrong, but it is perhaps a little worrying with regard to data quality. The study does not seem underpowered in terms of replicates (22), but perhaps the read counts are very low? You can easily check that using an MA plot. Or perhaps there is huge variability between replicates? Check the raw and DESeq2-normalized data (counts(dds, normalized=T)) for a few highly expressed genes and see if that makes sense. You could also do a principal component analysis and assess whether the replicates cluster together and what percentage of the total variability is associated with the group separation.
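
A base-R sketch of those checks, under stated assumptions: the counts are simulated (negative binomial, no group effect), and the size factors are computed with a hand-written median-of-ratios step that mirrors the idea behind DESeq2's `estimateSizeFactors()` rather than calling it.

```r
set.seed(42)  # arbitrary seed
n_genes <- 500
n_per_group <- 4
n_samples <- 2 * n_per_group

# Simulated raw counts (made up): negative binomial, no group effect
counts_mat <- matrix(rnbinom(n_genes * n_samples, mu = 100, size = 5),
                     nrow = n_genes,
                     dimnames = list(paste0("gene", seq_len(n_genes)),
                                     paste0("s", seq_len(n_samples))))
group <- factor(rep(c("0", "1"), each = n_per_group))

# Median-of-ratios size factors: each sample vs. the per-gene geometric mean
log_geo_means <- rowMeans(log(counts_mat))
size_factors <- apply(counts_mat, 2, function(cnts) {
  ok <- is.finite(log_geo_means) & cnts > 0
  exp(median(log(cnts[ok]) - log_geo_means[ok]))
})
norm_counts <- sweep(counts_mat, 2, size_factors, "/")

# PCA on log-transformed normalized counts; prcomp expects samples in rows
pca <- prcomp(t(log2(norm_counts + 1)))
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
print(round(pct_var[1:2], 1))
print(data.frame(group, PC1 = pca$x[, 1], PC2 = pca$x[, 2]))
```

If the two groups do not separate along any of the leading components, few or no DEGs is the expected outcome. With DESeq2 itself, the corresponding calls would be `vsd <- vst(dd)` followed by `plotPCA(vsd, intgroup = "Group")`, and `plotMA(res)` for the MA plot.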


It would indeed be good to get some more details. If this is a cell line experiment, n=22 is awesome; if this is, e.g., a patient cohort investigating gender-specific drug response, n=22 is probably not enough.


Hi, thank you. They are patients: 22 patients vs. 22 controls. The data are HTSeq read counts.


Hi, I added counts(dds, normalized=T). After running it in RStudio, it showed this error:

Error in .local(object, ...) : first calculate size factors, add normalizationFactors, or set normalized=FALSE


There are multiple fixes for this, but in your case the easiest would be to call the counts() function after the DESeq() function, on the object that DESeq() returns. So counts(dd, normalized=T) should work.
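
The error means the object passed to `counts()` has no size factors yet. `DESeq()` returns a new object rather than modifying `dds` in place, so in the code posted above only `dd` carries them. A minimal sketch, assuming DESeq2 is installed and using its simulated `makeExampleDESeqDataSet()` data in place of the real counts:

```r
if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))

  # Simulated dataset standing in for the real one (not the poster's data)
  dds <- makeExampleDESeqDataSet(n = 200, m = 6)

  # No size factors yet: this is the situation that triggers the error
  print(is.null(sizeFactors(dds)))

  # Option 1: estimate size factors only; normalized counts are then available
  dds <- estimateSizeFactors(dds)
  norm_counts <- counts(dds, normalized = TRUE)

  # Option 2: run the full pipeline and query the object it returns
  dd <- DESeq(dds, quiet = TRUE)
  norm_counts_dd <- counts(dd, normalized = TRUE)
  print(dim(norm_counts_dd))
}
```

If the error persists, it is worth checking that the call really uses the object returned by `DESeq()` (or one reassigned from `estimateSizeFactors()`), and that the session was not restarted in between.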


I did, but I still get this:

Error in .local(object, ...) : first calculate size factors, add normalizationFactors, or set normalized=FALSE
