What are some areas of bioinformatics in which people keep dishing out new papers on algorithms, overly complex statistical models, and tools where the old ones were good enough? What are areas where only marginal gains are to be found in terms of accuracy or speed, but people keep publishing because it's what they are familiar with?
I would steer away from predicting that any field holds only "marginal" gains.
Where I see surprisingly little progress is validation: properly documenting the strengths and weaknesses of the existing methodologies.
Take any tool that assigns RNA-Seq reads to the transcripts of a gene. How well does it work? Some isoforms are very similar to one another, while others are easy to tell apart. Yet there is no way to tell which assignments are reliable and which counts are more trustworthy than others.
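To make that concrete, here is a minimal, self-contained sketch of the kind of diagnostic I have in mind: flag isoform pairs whose sequences share so many k-mers that assigning reads between them is unlikely to be reliable. The sequences, the k-mer size, and the 0.7 threshold are all invented for illustration; a real check would use the actual transcriptome.

```python
import random
from itertools import combinations

random.seed(0)

def random_seq(n: int) -> str:
    """Generate a random DNA sequence of length n."""
    return "".join(random.choice("ACGT") for _ in range(n))

def kmers(seq: str, k: int = 31) -> set[str]:
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two k-mer sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy isoforms of one gene: A and B share a 300 bp "exon" and differ
# only in a 50 bp tail; C is an unrelated sequence.
shared_exon = random_seq(300)
isoforms = {
    "ENST_A": shared_exon + random_seq(50),
    "ENST_B": shared_exon + random_seq(50),
    "ENST_C": random_seq(350),
}

kmer_sets = {name: kmers(seq) for name, seq in isoforms.items()}

for (n1, s1), (n2, s2) in combinations(kmer_sets.items(), 2):
    sim = jaccard(s1, s2)
    flag = "  <- read assignment between these is suspect" if sim > 0.7 else ""
    print(f"{n1} vs {n2}: k-mer Jaccard = {sim:.2f}{flag}")
```

Counts for ENST_A and ENST_B would then deserve a warning label, while ENST_C can be trusted on its own.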
Here is another domain I find surprisingly ill-documented in this respect. Take any tool that does metagenomic classification (Kraken, Centrifuge, QIIME, etc.) and run it on data from just a few specific species. What I see is that some species are affected by major systematic errors that make the classification for that species alone incorrect, while the other counts are fine. In aggregate the method "works", except that all the errors come from a few species.
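To illustrate, here is a sketch of the per-species audit I mean. In practice the calls would come from running simulated reads with known labels through the classifier; the counts below are invented to show the pattern (the E. coli / Shigella confusion is a plausible one, since the two genomes are nearly identical).

```python
from collections import Counter, defaultdict

# (true_species, classified_as) pairs for simulated reads -- invented
# numbers standing in for real classifier output.
results = (
    [("E. coli", "E. coli")] * 95
    + [("E. coli", "Shigella")] * 5
    + [("B. subtilis", "B. subtilis")] * 98
    + [("B. subtilis", "unclassified")] * 2
    + [("Shigella", "E. coli")] * 7      # systematic per-species failure
    + [("Shigella", "Shigella")] * 3
)

per_species = defaultdict(Counter)
for truth, call in results:
    per_species[truth][call] += 1

# The aggregate number looks fine and hides the per-species failure.
overall = sum(c[s] for s, c in per_species.items()) / len(results)
print(f"aggregate accuracy: {overall:.2f}\n")

print(f"{'species':<14}{'recall':>8}   top misassignment")
for species, calls in per_species.items():
    total = sum(calls.values())
    recall = calls[species] / total
    wrong = Counter({c: n for c, n in calls.items() if c != species})
    top = wrong.most_common(1)[0][0] if wrong else "-"
    print(f"{species:<14}{recall:>8.2f}   {top}")
```

Aggregate accuracy here is 0.93, yet Shigella recall is 0.30; a benchmark that only reports the first number tells you nothing about whether your species of interest is one of the broken ones.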
Same with differential expression: there is no accepted way to check whether the statistical model is appropriate for a given dataset. People choose DESeq2 or edgeR just because one seems to work "better", which is, in the end, an unscientific approach.
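One sanity check that is rarely done: feed the method data with no differential expression at all and see whether its false positive rate matches the nominal alpha. The sketch below uses a plain t-test as a stand-in, since DESeq2 and edgeR live in R; the same idea applies to them, e.g. by permuting the sample labels in the design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes, n_per_group = 2000, 5

# Overdispersed null counts: both "groups" are drawn from the same
# distribution, so any gene called significant is a false positive.
counts = rng.negative_binomial(n=10, p=0.1, size=(n_genes, 2 * n_per_group))
group_a, group_b = counts[:, :n_per_group], counts[:, n_per_group:]

pvals = stats.ttest_ind(group_a, group_b, axis=1).pvalue
fpr = float(np.mean(pvals < 0.05))

print(f"genes called at p < 0.05 on null data: {fpr:.3f} (nominal: 0.050)")
if abs(fpr - 0.05) > 0.02:
    print("the method is miscalibrated on data like this")
```

A method that fails this kind of null check on data resembling yours is telling you its distributional assumptions do not hold there, which is a far better reason to switch tools than one of them producing a longer gene list.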
What I'd like is a tool that tells me: hey, for this particular transcript, species, or sequence, the results you'll get are not so great.