To expand on what ATPoint says - DESeq/edgeR/limma work by using empirical Bayes to share information between genes/probes/regions to account for the low number of samples. Approaches inspired by this can be applied to many problems where you have a small number of samples. Indeed, while limma is documented as a system for doing microarray/RNA-seq analysis, its core DE engine is a system for doing moderated t-tests that should be applicable to any situation where you wish to do a large number of t-tests/linear models, each with a small number of replicates, where you have reason to believe that the tests are informative of each other.
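As a toy illustration of what "moderated" means here, the sketch below shrinks per-gene sample variances toward a common value before forming t-statistics. This is a simplified Python sketch with simulated data, not limma's actual procedure: limma estimates the prior degrees of freedom d0 and prior variance s0^2 by fitting a scaled inverse chi-square distribution to the sample variances, whereas here both are simply picked by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 1000 genes, 3 replicates per group (hypothetical numbers).
n_genes, n_rep = 1000, 3
group_a = rng.normal(0.0, 1.0, size=(n_genes, n_rep))
group_b = rng.normal(0.0, 1.0, size=(n_genes, n_rep))

# Per-gene pooled sample variance (an ordinary t-test would use this directly).
df = 2 * (n_rep - 1)
s2 = (group_a.var(axis=1, ddof=1) + group_b.var(axis=1, ddof=1)) / 2

# Moderated variance: shrink each gene's variance toward a common prior
# variance, weighted by the prior degrees of freedom d0.  Here d0 = 4 and
# s0^2 = mean variance are hand-picked for illustration only; limma
# estimates both from the data.
d0 = 4.0
s0_sq = s2.mean()
s2_moderated = (d0 * s0_sq + df * s2) / (d0 + df)

# Moderated t-statistic for the difference in group means.
diff = group_a.mean(axis=1) - group_b.mean(axis=1)
t_mod = diff / np.sqrt(s2_moderated * (2 / n_rep))
```

The effect is that genes whose tiny sample variance arose by chance (easy with n=3) no longer produce explosive t-statistics, because their denominator is pulled up toward the typical variance across all genes.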

You will often find that for things like ChIP-seq, CLIP-seq, methylation and editing, while results are reported at the level of individual sites, conclusions are generally drawn from averages across many sites. You might find results that say such and such a TF binds to this *type* of site, or that binding sites upstream of this *category* of gene change. In this sort of analysis, the accuracy of calls at individual sites/genes is less important, as long as the error is unbiased.

For RNA editing, you could try limma. How successful this is will probably depend on the resolution you are aiming for: you could look at individual bases, but RNA editing tends to come in clusters, so you could also look at windows and try to find regions of differentially edited bases.
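If you go the window route, the aggregation step itself is simple. A minimal Python sketch, with simulated per-base counts along a single transcript and an arbitrary 50-bp window size:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-base counts along one 1000-bp transcript:
# number of edited reads out of a constant coverage of 30.
coverage = np.full(1000, 30)
edited = rng.binomial(coverage, 0.05)

# Pool counts into non-overlapping 50-bp windows before testing, so each
# window aggregates signal from a whole cluster of editing sites.
window = 50
win_edited = edited.reshape(-1, window).sum(axis=1)
win_cov = coverage.reshape(-1, window).sum(axis=1)
win_rate = win_edited / win_cov
```

The per-window counts could then go into whatever differential test you choose; the pooling buys you more reads per tested unit at the cost of base-level resolution.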

If you want to look at individual bases, then you might want to investigate empirical Bayes with the beta binomial model, as explained (with baseball statistics) here: http://varianceexplained.org/r/empirical_bayes_baseball/
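For a flavour of that approach, here is a minimal Python sketch that fits a beta prior to per-site editing proportions by the method of moments (the linked post fits by maximum likelihood) and then shrinks each site's rate toward the prior; all counts are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-site counts: k edited reads out of n reads of coverage.
n_sites = 500
coverage = rng.integers(5, 200, size=n_sites)
true_rate = rng.beta(2, 8, size=n_sites)
edited = rng.binomial(coverage, true_rate)

raw_rate = edited / coverage

# Method-of-moments fit of a Beta(alpha, beta) prior to the raw rates.
m, v = raw_rate.mean(), raw_rate.var()
common = m * (1 - m) / v - 1
alpha, beta = m * common, (1 - m) * common

# Posterior mean editing rate under the beta-binomial model: low-coverage
# sites are pulled strongly toward the prior mean, high-coverage sites barely move.
shrunk_rate = (edited + alpha) / (coverage + alpha + beta)
```

The point, exactly as in the baseball post, is that a site with 1 edited read out of 5 should not be treated the same as a site with 40 out of 200, even though both have a raw rate of 0.2.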

To clarify: it's not the negative binomial model that fixes the problems with a low number of replicates.

The problem with a low number of replicates (e.g. n=2) is that estimating the variance is difficult. Therefore DESeq and other methods (limma/sleuth/cuffdiff/edgeR/etc.) all perform an empirical Bayes procedure called shrinkage to get better estimates of the variance. This approach has been used since the microarray days and is designed precisely to handle the low-replicate issue.

The trick here is to "share information across genes". In a 2 vs 2 setup you do not have four but 4*n_genes datapoints. By modelling the trend of the variance between samples of the same group as a function of the mean (so for every possible average expression level) you get a fairly decent estimate of the expected variance. The expected variance then helps you decide whether an observed difference qualifies a gene as DE or is likely just a reflection of the experimental noise. The distribution does not matter: limma, for example, does not use the NB; it just happens that RNA-seq counts are decently modelled by the NB.
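To make the "trend across the mean" idea concrete, here is a crude Python sketch with simulated data: genes are binned by mean expression and the median variance per bin serves as the fitted trend. Real packages fit a smooth parametric or lowess-style curve rather than binned medians; this is only to show where the extra datapoints come from.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated log-expression for 2000 genes across 4 samples; by
# construction the noise shrinks as expression rises.
n_genes = 2000
mean_expr = rng.uniform(2, 12, size=n_genes)
true_sd = 0.1 + 2.0 / mean_expr
expr = mean_expr[:, None] + rng.normal(0, true_sd[:, None], size=(n_genes, 4))

gene_mean = expr.mean(axis=1)
gene_var = expr.var(axis=1, ddof=1)  # very noisy with only 4 samples

# Bin genes by mean expression (quantile edges so bins are equally filled)
# and take the median variance per bin as the expected-variance trend.
n_bins = 20
edges = np.quantile(gene_mean, np.linspace(0, 1, n_bins + 1)[1:-1])
bins = np.digitize(gene_mean, edges)
expected_var = np.array([np.median(gene_var[bins == b]) for b in range(n_bins)])

# Each gene's expected variance is now informed by ~100 genes, not 4 samples.
gene_expected = expected_var[bins]
```

A gene whose observed between-group difference is large relative to `gene_expected`, rather than relative to its own noisy 4-sample variance, is a much more trustworthy DE call.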