1) Sample 1 seems really bad, with lots of technical duplicates; sample 2 seems fine; and sample 3 seems to have a moderate to high level of technical duplicates.
2) The "Libraries can contain technical duplication" post is very clear: it advises against removing duplicates from RNA-seq data, and against deduplicating samples that have different levels of technical duplication.
[...] to completely remove the duplication. The problem with this approach is that it isn’t able to distinguish biological from technical duplication and both are removed. In samples where an even read coverage is expected and the depth of sequencing hasn’t come close to saturating this then this is a reasonable approach, but in samples with variable read densities this will have the effect of capping the maximum read density able to be obtained, and limiting the dynamic range able to be obtained. If multiple samples with different amounts of technical duplication are deduplicated in this way then you will actually introduce differences where they don’t exist.
3) I have no idea.
An additional consideration: in my experience, when samples subjected to the same treatment have different levels of technical duplicates, it points to problems with RNA extraction or library preparation, meaning the technical duplicates are just the observable side of a deeper problem. If you follow up with downstream analyses such as gene quantification and differential expression, then in a PCA the samples with a high proportion of technical duplicates will tend not to cluster with the other samples from the same treatment, and will probably be scattered all over the plot.
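To make the "capping" argument from the quote under 2) more concrete, here is a toy simulation. It is only a sketch under simplifying assumptions of mine, not how any real deduplication tool works: reads are single-end, each read starts at a uniformly random position, there is no PCR duplication at all, and deduplication keeps one read per distinct start position. The function name `dedup_count` and all the numbers are made up for illustration.

```python
import random


def dedup_count(n_reads, n_positions, rng):
    """Count reads left after position-based deduplication.

    Toy model: every read starts at a uniformly random position, and
    deduplication keeps a single read per distinct start position.
    """
    starts = [rng.randrange(n_positions) for _ in range(n_reads)]
    return len(set(starts))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical number of possible read start positions in a transcript.
n_positions = 1000

# Two genes with a true 100-fold expression difference and no
# technical duplicates: every repeated position here is biological.
low_raw, high_raw = 100, 10_000

low_dedup = dedup_count(low_raw, n_positions, rng)
high_dedup = dedup_count(high_raw, n_positions, rng)

print(f"raw counts:        low={low_raw}, high={high_raw}")
print(f"dedup counts:      low={low_dedup}, high={high_dedup}")
print(f"raw fold change:   {high_raw / low_raw:.1f}")
print(f"dedup fold change: {high_dedup / low_dedup:.1f}")
```

In this model the highly expressed gene can never keep more than `n_positions` reads after deduplication, while the lowly expressed gene is barely affected, so the fold change between them shrinks. That is the loss of dynamic range the post describes, and it happens here even though none of the removed reads were technical duplicates.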