I have read that when one encounters a confounding variable, the preferred way to account for its effect is to add the variable to the design formula rather than remove it with limma's removeBatchEffect function.
My questions on this topic are:

Should I remove the batch effect using limma's removeBatchEffect function if the confounding effect is large and inconsistent, as suggested here (https://support.bioconductor.org/p/125386/#125387)? For example, when a PCA plot and heatmap show that samples still cluster by a confounding variable even though it has been added to the design formula (e.g. when samples cluster by sex after sex has been added to the design formula).

Should one use the transformed counts after removal of the confounding effect, or the untransformed counts (adding the confounding variable only to the design formula), for heatmaps, PCA plots and DE analysis?

Confounders should be modeled where that is possible, not removed. This particularly applies to DE analysis, or anything that relies on count-based statistics, like the Poisson or Negative Binomial distribution.

Where you are not modeling the data but just presenting it, as in heatmaps or PCA, modeling the confounder isn't possible, and it should be removed.

Where the confounder is known, you can use it directly in the design formula, or remove it using limma::removeBatchEffect. Where the confounder is unknown, you must first identify it using something like SVA.
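To make the "model it, or remove it for display" distinction concrete, here is a minimal conceptual sketch in plain Python. It is not limma's code. It only illustrates the general idea of fitting a linear model that contains both the effect of interest and a known batch term, then subtracting the fitted batch term for plotting. The toy data, the `solve` and `ols` helpers, and all variable names are hypothetical, and the values stand in for one gene's log-scale expression rather than raw counts.

```python
# Conceptual sketch only: ordinary least squares on one gene's (log-scale)
# expression values, written with the standard library so it runs anywhere.

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(design, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical toy data: 8 samples, two groups, two batches (not fully confounded).
group = [0, 0, 0, 0, 1, 1, 1, 1]
batch = [0, 0, 1, 1, 0, 0, 1, 1]
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.05]
y = [5 + 2 * g + 3 * b + e for g, b, e in zip(group, batch, noise)]

# Step 1: model the confounder alongside the effect of interest.
design = [[1.0, g, b] for g, b in zip(group, batch)]
b0, b_group, b_batch = ols(design, y)

# Step 2: for plotting only, subtract the fitted batch term.
# Centering the batch indicator keeps the overall mean of y unchanged.
mean_batch = sum(batch) / len(batch)
y_adjusted = [yi - b_batch * (b - mean_batch) for yi, b in zip(y, batch)]

# Refit on the adjusted values: the batch coefficient vanishes while the
# group coefficient is untouched.
_, b_group_adj, b_batch_adj = ols(design, y_adjusted)
print(round(b_group, 3), round(b_batch, 3))
print(round(b_group_adj, 3), round(abs(b_batch_adj), 3))
```

The adjusted values are meant for heatmaps and PCA only. For DE testing, the batch term stays in the design and the original counts are used, since feeding adjusted values back into a test would treat the batch estimate as if it were known without error.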

False dichotomy (sort of - both concepts exist, sure, but in effect one subsumes the other).

In a general linear model, the explanatory value of the model is the ratio of the sums of squares accounted for by the model to the total sums of squares.

If you have something like Y = B0 + B1X1 + B2X2 + B3X3 + E

and you generate an effect size estimate (beta) for each term, nothing prevents you from subtracting out the fitted contributions of the remaining predictors (everything other than the explanatory variable(s) of interest).

Thus modeling and removal are not mutually exclusive. Controlling for the covariates means their share of the final explanatory value of the model (Explained SS / Total SS) is accounted for, and because each has its own estimated term, you can also simply subtract that term. You can both model covariates and remove their effects.
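Written out for the model above, and assuming purely for illustration that X1 is the variable of interest while X2 and X3 are the covariates (the post does not say which is which), the subtraction looks like this, with hats marking fitted estimates and \hat{E} the residuals:

```latex
\begin{align*}
Y &= \hat{B}_0 + \hat{B}_1 X_1 + \hat{B}_2 X_2 + \hat{B}_3 X_3 + \hat{E} \\
Y_{\text{adj}} &= Y - \hat{B}_2 X_2 - \hat{B}_3 X_3
               = \hat{B}_0 + \hat{B}_1 X_1 + \hat{E}
\end{align*}
```

The adjusted values keep the effect of interest and the residual variation, and the covariate terms are gone only because they were estimated in the full model first.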

But the general principle is that you never discard data. Rather, model the confounder as a covariate. For any downstream application that might benefit from removal (e.g. a picture before and after controlling for Age), you can in effect remove it anyway. You do that by keeping the original data and doing each aspect of the analysis and visualization thoughtfully.