DESeq data diagnosis problem (MAplot, dispersion)
7.1 years ago
Hydrangea ▴ 10

Hi, I used DESeq2 to analyze read count data with the likelihood ratio test (LRT). I have multiple covariates in the GLM. However, my MA plot and dispersion plot do not look like the typical plots in the manual. Also, the histogram of raw p-values has a high peak at 1.

Does this indicate a problem with the model fit? Do I need to discard low-count transcripts first? Also, how "good" do the MA plot and dispersion plot need to look before I can claim that the model is properly fitted?

Thanks,

https://s15.postimg.org/lojmvv863/maplot.png

https://s11.postimg.org/texjk1pkj/dispersionplot.png

https://s23.postimg.org/o7sr6nf6j/pvaluehist_Ink_LI.jpg

RNA-Seq • 2.4k views

Did you perform any filtering to remove genes with low counts, or any other kind of quality filtering? What do your MDS/PCA plots look like? Please describe your experimental design and analysis in more detail.

Anyway, the Bioconductor support site may be a better place for your question.


I only filtered with rowsum > 0, to speed things up, because DESeq2 applies its own, stricter independent filtering. I'm fitting a multivariate GLM (comparing the full vs. reduced model with the LRT) with transcript-level read counts as the outcome. The goal is to find transcripts that are significant in the LRT. I haven't done any clustering yet.
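
To make the difference between a rowsum > 0 filter and a stricter low-count pre-filter concrete, here is a minimal sketch in plain Python. It is not DESeq2 code and it is not DESeq2's independent filtering; the toy counts, the function names, and the thresholds (`min_count=10`, `min_samples=2`) are all assumptions chosen for illustration.

```python
# Toy count matrix: keys are transcripts, values are per-sample counts.
# All numbers are made up for illustration.
counts = {
    "tx1": [0, 0, 0, 0],
    "tx2": [1, 0, 0, 0],
    "tx3": [12, 30, 8, 25],
    "tx4": [0, 3, 0, 2],
}


def filter_row_sum(counts, min_total=1):
    """Keep transcripts whose total count across samples is at least min_total."""
    return {tx: row for tx, row in counts.items() if sum(row) >= min_total}


def filter_min_samples(counts, min_count=10, min_samples=2):
    """Keep transcripts with >= min_count reads in at least min_samples samples."""
    return {
        tx: row
        for tx, row in counts.items()
        if sum(c >= min_count for c in row) >= min_samples
    }


kept_loose = filter_row_sum(counts)       # the rowsum > 0 style filter
kept_strict = filter_min_samples(counts)  # a hypothetical stricter pre-filter

print(sorted(kept_loose))
print(sorted(kept_strict))
```

On this toy input the rowsum > 0 filter only drops the all-zero transcript, while the stricter filter also drops the two transcripts with a handful of reads.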


Both of those plots look very odd. What are the scale factors (the size factors, in DESeq2's terms)?
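
For context on what those factors measure, here is a rough sketch of the median-of-ratios idea behind per-sample size factors, written in plain Python rather than with DESeq2. The function name `size_factors` and the toy counts are assumptions for illustration; in practice one would inspect the values DESeq2 itself reports.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per gene), each a list of per-sample counts.
    Returns one factor per sample.
    """
    n_samples = len(counts[0])
    # Use only genes with no zero count, so the log geometric mean is finite.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("every gene has a zero count in at least one sample")
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Ratio of this sample's count to the gene's geometric mean, on the log scale.
        log_ratios = [
            math.log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Made-up counts: sample 3 has four times the depth of sample 1.
counts = [
    [10, 20, 40],
    [5, 10, 20],
    [100, 200, 400],
    [0, 3, 1],  # contains a zero, so it is excluded from the estimate
]
sf = size_factors(counts)
print([round(f, 3) for f in sf])
```

In this toy case the factors come out at about 0.5, 1 and 2, mirroring the sequencing depths. Factors that differ from each other by orders of magnitude would be worth a closer look.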


The p-value distribution is the sort of thing one normally sees if there's an uncorrected batch effect. I suspect that a PCA plot will be informative.
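
As a way to put a number on "a high peak at 1", here is a small sketch in plain Python that bins raw p-values and compares the last bin with the average of the others. The function names and the toy p-values are assumptions for illustration; the sketch only quantifies the peak and says nothing about its cause.

```python
def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        if p is None or p != p:  # skip missing values (None or NaN)
            continue
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[idx] += 1
    return counts


def last_bin_excess(counts):
    """Ratio of the last bin's count to the mean count of the other bins."""
    others = counts[:-1]
    mean_other = sum(others) / len(others)
    return counts[-1] / mean_other if mean_other > 0 else float("inf")


# Made-up p-values: 100 spread evenly over (0, 1), plus 50 that are exactly 1.
pvalues = [(i + 0.5) / 100 for i in range(100)] + [1.0] * 50

hist = pvalue_histogram(pvalues)
print(hist)
print(last_bin_excess(hist))
```

A ratio close to 1 means the right-hand end of the histogram is flat; a ratio well above 1 corresponds to the kind of peak described in the question.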
