Question

Better DE analytic tools

0

7.2 years ago

mhyunjunkang ▴ 110

Dear all,

This may be a basic question, but I would like to hear some comments and advice about DE analysis tools. FYI, I have very limited knowledge in this field, though I keep studying.

I would like to know which DE analysis tool is best for identifying DE genes in RNA-seq data. In particular, I would like to know the advantages, weaknesses, and differences between DE analysis based on a negative binomial model and DE analysis based on an empirical Bayes approach.

I'm not sure whether I can entirely follow your detailed expertise, but I can at least figure out a starting point.

Thanks in advance. HJ

RNA-Seq DE tools Bayesian empirical approach • 1.7k views

ADD COMMENT • link 7.2 years ago by mhyunjunkang ▴ 110

2

Just a quick comment: DESeq2 already includes an empirical Bayesian regression model with negative binomial family for the purposes of modelling and dealing with (adjusting for) dispersion. I believe it uses a Bayesian approach for the log2 fold change shrinkage too, which helps to deal with biased fold-changes at low counts. Take a look at my answer here: A: Clarification on how DSEeq2 Dispersion Curve is Generated

With regard to modelling RNA-seq counts as negative binomial, it was shown that this results in fewer false positives than modelling them as Poisson: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
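
To make the Poisson versus negative binomial point concrete, here is a minimal sketch of the two variance functions. It assumes the usual mean/dispersion parameterisation of the negative binomial (variance = mean + dispersion * mean^2); the mean counts and the dispersion of 0.1 are made-up illustration values, not estimates from any dataset.

```python
def poisson_variance(mean):
    """Poisson: the variance is forced to equal the mean."""
    return mean


def nb_variance(mean, dispersion):
    """Negative binomial (mean/dispersion form): mean + dispersion * mean^2."""
    return mean + dispersion * mean ** 2


# Hypothetical mean counts and an assumed dispersion of 0.1
for mean in (10, 100, 1000):
    print(mean, poisson_variance(mean), nb_variance(mean, 0.1))
```

The gap between the two grows with the mean, so a Poisson model would understate the variability of highly expressed genes whenever replicates are overdispersed, which is one way spurious significant calls can arise.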

Your question will likely attract much opinion.

ADD REPLY • link 7.2 years ago by Kevin Blighe 89k

3

I may link here an article from Mike Love (author of DESeq2) on the question of whether DESeq2 or edgeR (or a different method) is the gold standard for differential count analysis.

ADD REPLY • link 7.2 years ago by ATpoint 88k

2

Not just DESeq2, but edgeR and limma also have empirical Bayes components (e.g., variance shrinkage).
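
As a rough illustration of what variance shrinkage means, the sketch below uses the weighted-average form that limma's moderated variance takes, as far as I understand it: each gene's sample variance is pulled toward a prior variance, weighted by degrees of freedom. In the real packages the prior variance and prior degrees of freedom are estimated from all genes; here they are simply assumed, and the function name and numbers are hypothetical.

```python
def moderated_variance(sample_var, residual_df, prior_var, prior_df):
    """Weighted average of a gene's own variance and a shared prior variance."""
    return (prior_df * prior_var + residual_df * sample_var) / (prior_df + residual_df)


# Hypothetical per-gene sample variances, each with 4 residual degrees of freedom
sample_variances = {"geneA": 0.05, "geneB": 1.0, "geneC": 4.0}
PRIOR_VAR = 1.0  # assumed; normally estimated from all genes
PRIOR_DF = 4.0   # assumed; normally estimated from all genes

for gene, s2 in sample_variances.items():
    print(gene, s2, moderated_variance(s2, 4, PRIOR_VAR, PRIOR_DF))
```

Very small variances are raised and very large ones are lowered, which stabilises test statistics when there are only a few replicates per group.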

ADD REPLY • link 7.2 years ago by Devon Ryan 105k

0

Thank you all for the comments and expertise. I have been reading Mike Love's paper (DESeq2), and I keep going back to it.

One question is how different tools like EBSeq are from DESeq2. As far as I know, EBSeq also models RNA-seq counts as negative binomial. Actually, I think that it is a beta-negative binomial.

ADD REPLY • link 7.2 years ago by mhyunjunkang ▴ 110

1

At face value, they look quite similar in that they both assume a negative binomial distribution and make adjustments that help to manage dispersion / variance adequately. They also adjust for library size with size factors. The main difference may come in how they judge what is differentially expressed:

In DESeq2, final p-values are obtained via the Wald test applied to the final fitted negative binomial logistic regression model.
In EBSeq, I think that they do not use the Wald test. They appear to calculate posterior probabilities for each transcript and then judge differential expression via this metric.

It would also help to read EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments
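
The two decision rules above can be contrasted with a small sketch. The Wald part is the standard two-sided test of an estimate against its standard error. The posterior part is only a generic two-component Bayes rule with assumed likelihood values, not EBSeq's actual model; all function names and numbers are hypothetical.

```python
import math


def wald_p_value(estimate, standard_error):
    """Two-sided p-value for H0: coefficient = 0, using a standard normal reference."""
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def posterior_prob_de(prior_de, lik_de, lik_not_de):
    """Bayes rule: P(DE | data) from a prior and the likelihood under each hypothesis."""
    numerator = prior_de * lik_de
    return numerator / (numerator + (1.0 - prior_de) * lik_not_de)


# Hypothetical gene: log2 fold change 1.5 with standard error 0.5
print(wald_p_value(1.5, 0.5))

# Hypothetical gene: data 20 times more likely under DE, with a 10% prior of DE
print(posterior_prob_de(0.10, 20.0, 1.0))
```

The first number is a frequentist p-value that would normally then be adjusted for multiple testing, while the second is read directly as a probability that the gene is differentially expressed, so the two kinds of output are thresholded differently.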

ADD REPLY • link 7.2 years ago by Kevin Blighe 89k

0

Thank you for the detailed explanation. One small thing confuses me, though. As I understand it, a log link is used in DESeq2, not a logit. Did I misunderstand? Again, thank you for your expertise. It really helps. HJ

ADD REPLY • link 7.2 years ago by mhyunjunkang ▴ 110

2

You understood correctly: DESeq2 uses a log link. A logit link wouldn't make sense for RNA-seq counts, which is why it's not used.
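
A quick sketch of why: with a log link the linear predictor maps to a positive, unbounded mean count, whereas the inverse logit always lands between 0 and 1, which suits probabilities rather than counts. The log2 scale and the size-factor multiplication follow my reading of the DESeq2 paper; the function names and input values are hypothetical.

```python
import math


def mean_from_log_link(size_factor, log2_expression):
    """Log link (base 2): mean count = size factor * 2^(linear predictor)."""
    return size_factor * 2.0 ** log2_expression


def prob_from_logit_link(linear_predictor):
    """Inverse logit: always strictly between 0 and 1 for moderate inputs."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# The same linear predictor values pushed through both inverse links
for eta in (0.0, 5.0, 10.0):
    print(eta, mean_from_log_link(1.0, eta), round(prob_from_logit_link(eta), 4))
```

On the log2 scale, a one-unit change in a coefficient doubles the expected count, which is why the coefficients read directly as log2 fold changes.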

ADD REPLY • link 7.2 years ago by Devon Ryan 105k