Question

DESeq2 Dispersion Estimates Curve Poor Fit?

1

3.3 years ago

lransom ▴ 20

[Image: dispersion estimates plot]

plotDispEsts returns this plot after running DESeq on my RNA-seq dataset. It looks somewhat different from the example plots I see online: many of the points fall below the fitted line, which shows the expected dispersion for genes at a given expression level. I thought the line should be fitted to my data, but it doesn't look like a very good fit to me. Has anyone seen anything like this before? I'm new to RNA-seq analysis and am wondering if I should be concerned by this plot.

Thanks for your help!

rna-seq • 2.8k views

ADD COMMENT • link updated 3.2 years ago by ponganta ▴ 590 • written 3.3 years ago by lransom ▴ 20

1

[Image: DispEstsLocal]

After looking into it some more, I tried setting the DESeq parameter fitType="local", and my dispersion plot now looks like this. The curve certainly appears to fit the data more closely.

Could anyone give a description in layman's terms of the differences between the parametric fit (the default) and the local fit? Just want to make sure I'm proceeding with sound logic.

Thanks!

ADD REPLY • link 3.3 years ago by lransom ▴ 20

0

Can you add code and some context: what are the samples in terms of groups, how many replicates are there, and which design did you choose?

ADD REPLY • link 3.3 years ago by ATpoint 82k

0

Sorry, just fixed the image link. I'm trying to compare RNA from exosomes in Disease vs Healthy. I have 12 biological replicates of each disease state. I created the counts using featureCounts, and the design for the construction of my DESeq dataset was: ddsMatDisease = DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ Disease)

Does that help?

ADD REPLY • link 3.3 years ago by lransom ▴ 20

0

Did you quantify via Salmon? I have seen similar fits when using DESeq2 on transcript-level counts without aggregating them to gene level.

ADD REPLY • link 3.2 years ago by ponganta ▴ 590

score 1 · Answer 1 · 2021-01-26

The parametric dispersion fit (the default) fits a function of the form f(x) = a/x + b to the mean-of-counts vs. dispersion point cloud (equation 6 in the DESeq2 paper). The local fit instead uses the locfit package to compute a local regression, roughly a smoothed moving average, so it is more flexible and can follow mean-dispersion trends that do not match the parametric form described above. Having looked at quite a few mean-dispersion trends, I think yours looks quite strange.
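To make the contrast concrete, here is a minimal base-R sketch on simulated data (not the poster's data). It assumes a Gamma-family GLM for the parametric form a/x + b and uses loess as a stand-in for locfit; all variable names are made up for illustration, and DESeq2's internal fitting involves more steps than shown here.

```r
# Simulated data only: noisy gene-wise dispersions around a known a/x + b trend.
set.seed(1)
n_genes   <- 2000
base_mean <- 10^runif(n_genes, 0, 4)          # mean normalized counts, 1 to 10,000
true_disp <- 4 / base_mean + 0.1              # a = 4, b = 0.1 (arbitrary choices)
gene_disp <- true_disp * rchisq(n_genes, df = 10) / 10  # multiplicative noise, mean 1

# Parametric trend: dispersion = b + a * (1 / mean), Gamma GLM with identity link.
par_fit  <- glm(gene_disp ~ I(1 / base_mean),
                family = Gamma(link = "identity"),
                start  = c(0.1, 1))
par_coef <- coef(par_fit)                     # first = b, second = a

# Local trend: smooth log-dispersion over log-mean, no fixed functional form.
log_mean <- log(base_mean)
log_disp <- log(gene_disp)
loc_fit  <- loess(log_disp ~ log_mean, span = 0.5)

# Evaluate both curves on a grid inside the observed range of means.
grid      <- 10^seq(0.1, 3.9, length.out = 50)
par_curve <- par_coef[[1]] + par_coef[[2]] / grid
loc_curve <- exp(predict(loc_fit, newdata = data.frame(log_mean = log(grid))))

# Overlay the two trends on the point cloud.
plot(base_mean, gene_disp, log = "xy", pch = ".", col = "grey40",
     xlab = "mean of counts", ylab = "dispersion")
lines(grid, par_curve, col = "red", lwd = 2)
lines(grid, loc_curve, col = "blue", lwd = 2)
legend("topright", c("parametric (a/x + b)", "local (loess)"),
       col = c("red", "blue"), lwd = 2)
```

When the data really follow a/x + b, as in this simulation, the two curves have a similar shape. If the point cloud has a different shape, only the local smooth can bend to follow it, which is consistent with the closer fit seen after switching to fitType="local".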

It's probably best to ask this question in the Bioconductor forum and use the DESeq2 tag to ask Mike Love himself what he thinks of your dispersion trend. Have you filtered the data appropriately as suggested here?
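As a rough sketch of what filtering and comparing the two fit types could look like in DESeq2: the example below runs on a simulated dataset from makeExampleDESeqDataSet, not the poster's counts, and the threshold (at least 10 reads in at least 12 samples, matching the 12 replicates per group) is an assumption patterned on the pre-filtering advice in the DESeq2 vignette. It is not necessarily what the link above recommended.

```r
library(DESeq2)

# Simulated stand-in for the real data: 1000 genes, 24 samples in two groups.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 24)

# Pre-filter: keep genes with >= 10 reads in at least 12 samples
# (12 = size of the smaller group; the threshold is an assumption, tune as needed).
keep <- rowSums(counts(dds) >= 10) >= 12
dds  <- dds[keep, ]

# Run the pipeline once per fit type so the trends can be compared.
dds_par <- DESeq(dds, fitType = "parametric")
dds_loc <- DESeq(dds, fitType = "local")

# Side-by-side dispersion plots.
par(mfrow = c(1, 2))
plotDispEsts(dds_par); title("parametric")
plotDispEsts(dds_loc); title("local")
```

With real data, the DESeqDataSet built by DESeqDataSetFromMatrix would replace the simulated one, and the rest would stay the same.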