Question: DESeq data diagnosis problem (MAplot, dispersion)
Asked 2.5 years ago by Hydrangea:

Hi, I used DESeq2 to analyze read count data with the likelihood ratio test (LRT). I have multiple covariates in the GLM. However, my MA plot and dispersion plot do not look like the typical plots in the manual. Also, the histogram of raw p-values has a high peak at 1.

Does this indicate a problem with model fitting? Do I need to discard low-count transcripts first? Also, how "good" do the MA plot and dispersion plot need to be in order to claim a properly fitted model?

Thanks,

[Figures: MA plot; dispersion plot; histogram of raw p-values]


Did you perform any filtering to remove genes with low counts, or any other kind of quality filtering? What do your MDS/PCA plots look like? Please describe your experimental design and analysis in more detail.

Anyway, Bioconductor support may be a better place for your question.

written 2.5 years ago by h.mon

I only filtered on row sum > 0, to speed things up, because DESeq2 applies its own stricter independent filtering. I'm using a multivariate GLM (comparing the full vs. reduced model with the LRT) with transcript-level read counts as the outcome. The purpose is to find transcripts that are significant in the LRT. I haven't done any clustering yet.

written 2.5 years ago by Hydrangea
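
For readers unsure what the row-sum prefilter mentioned above amounts to, here is a minimal sketch in plain Python. It is not the poster's code (the thread uses R/DESeq2). The count matrix, the transcript names, and the `prefilter` helper are all hypothetical, and the stricter threshold of 10 is only an illustrative value.

```python
# Toy count matrix (hypothetical data): transcript -> read counts per sample.
counts = {
    "tx1": [0, 0, 0, 0],
    "tx2": [5, 0, 3, 1],
    "tx3": [120, 98, 143, 110],
    "tx4": [0, 1, 0, 0],
}


def prefilter(counts, min_total=1):
    """Keep transcripts whose summed count across samples is >= min_total."""
    return {tx: row for tx, row in counts.items() if sum(row) >= min_total}


# Row sum > 0: only drops transcripts with no reads at all.
kept = prefilter(counts)

# A stricter, purely illustrative threshold that also drops near-empty rows.
stricter = prefilter(counts, min_total=10)

print(sorted(kept))
print(sorted(stricter))
```

A row sum > 0 filter leaves transcripts with only one or two reads in the data, so comparing the diagnostic plots under a stricter threshold may show whether low counts are driving their odd shape.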

Both of those plots look very odd. What are the scale factors?

written 2.5 years ago by Devon Ryan
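
The "scale factors" asked about here presumably correspond to the per-sample size factors, which DESeq-style normalization estimates with the median-of-ratios method. The sketch below illustrates that calculation in plain Python on a made-up matrix. The function name `size_factors` and the data are assumptions for illustration, not DESeq2's own code or API.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per transcript), each a list of per-sample counts.
    Transcripts with a zero in any sample are skipped, since their geometric
    mean is zero. Raises statistics.StatisticsError if no transcript is usable.
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean of each usable transcript across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each transcript's count to its geometric mean, sample j.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical data: sample 2 is sequenced twice as deep as sample 1,
# and sample 3 four times as deep.
toy = [
    [10, 20, 40],
    [100, 200, 400],
    [5, 10, 20],
    [0, 3, 7],  # skipped: contains a zero
]
sf = size_factors(toy)
print(sf)
```

Size factors far from 1, or very different between samples, would point to large differences in sequencing depth or composition that are worth checking before interpreting the plots.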

The p-value distribution is the sort of thing one normally sees if there's an uncorrected batch effect. I suspect that a PCA plot will be informative.

written 2.5 years ago by Devon Ryan
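
One way to make the "peak at 1" concrete is to bin the raw p-values and compare the last bin with what a uniform distribution would give. The following is a minimal sketch in plain Python. The p-values are invented, and `pvalue_histogram` is a hypothetical helper rather than part of any package.

```python
def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values per equal-width bin on [0, 1]; missing values are skipped."""
    hist = [0] * n_bins
    for p in pvalues:
        if p is None or p != p:  # skip None and NaN (NaN is not equal to itself)
            continue
        # p == 1.0 would index one past the end, so clamp it into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        hist[idx] += 1
    return hist


# Invented p-values with an excess near 1, plus one missing value.
pvals = [0.001, 0.02, 0.25, 0.45, 0.65, 0.97, 0.99, 1.0, 1.0, 1.0, None]
hist = pvalue_histogram(pvals)

top_fraction = hist[-1] / sum(hist)  # share of tested p-values in the last bin
expected = 1 / len(hist)             # share expected if p-values were uniform

print(hist)
print(top_fraction, expected)
```

Recomputing this fraction after a stricter count filter, or after adding a suspected batch variable to both the full and reduced models, gives a rough before-and-after comparison. It does not replace looking at the PCA plot.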