For why these genes get called differential you have to understand DESeq2's approach to estimating per-gene dispersion.
The raison d'etre of DESeq2 and edgeR is to try to find accurate estimates for the each gene's dispersion with only a small number of replicates. They do this by sharing information between genes on the understanding that genes of a similar expression level are, on average, going to have a similar dispersion. Thus they estimate average dispersion for all the genes and fit a smoothed function to the mean-expression vs dispersion estimate. This represents their expected dispersion. Then for each gene they "shrink" that genes dispersion estimate towards the expected dispersion using a baysian proceedure. This means that genes with a small dispersion will have their dispersion increased and are therefore less likely to be significant, and for genes with a large dispersion, like these ones here, will have their dispersion decreased, making it more likely they will be significant.
In their paper they Love et al state:
If, on the other hand, an individual gene’s dispersion is far above the
distribution of the gene-wise dispersion estimates of other genes,
then the shrinkage would lead to a greatly reduced final estimate of
dispersion. We reasoned that in many cases, the reason for
extraordinarily high dispersion of a gene is that it does not obey our
modeling assumptions; some genes may show much higher variability than
others for biological or technical reasons, even though they have the
same average expression levels. In these cases, inference based on the
shrunken dispersion estimates could lead to undesirable false positive
calls. DESeq2 handles these cases by using the gene-wise estimate
instead of the shrunken estimate when the former is more than 2
residual standard deviations above the curve.
My guess is that this safety-value is "failing" in your case. This is probably because there is a large standard deviation in the distances of the gene-wise dispersion estimates from the expected dispersion (due to only having two reps), thus the large dispersions seen here would be less than 2sd from the expected curve and the shruken value for dispersion is used.
If you want to avoid this, you could use DESeq (rather DESeq2) and set the sharingMode="maximum". Here the maximum of the trend and the per-gene dispersion estimate is used, which is a more conservative approach (DESeq was way more conservative than other methods of its time).