For context: I have 4 Illumina RNA-seq datasets (let's call them A, B, control A and control B). I know that cells from conditions A and B produce a certain metabolite. My approach is to determine the DE genes in A_vs_controlA and in B_vs_controlB and see which ones are common to both. The problem is that I only have 2 replicates of control B (which has a different origin from control A). I believe each of these two samples came from a library pooled from several different individuals, although I'm not sure how relevant that is. I know that, statistically, I should have at least 3 replicates, but for several reasons it is not possible to obtain more data right now.
What approaches can I take to make my inferences more "robust"? Should I lower the adjusted p-value threshold to be more restrictive? Should I simulate data based on my 2 samples?
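
To make the "common DE genes" step concrete, here is a minimal sketch of what I have in mind. It assumes each comparison has already been reduced to a mapping from gene to (log2 fold change, adjusted p-value); the gene names, the numbers and the function names are hypothetical placeholders, not real results, and the same-direction filter is an extra assumption of mine rather than something the approach requires.

```python
from typing import Dict, Optional, Set, Tuple

# gene -> (log2 fold change, adjusted p-value).
# The adjusted p-value may be None when the DE tool reports NA for a gene.
DEResults = Dict[str, Tuple[float, Optional[float]]]


def significant_genes(results: DEResults, padj_cutoff: float = 0.05) -> Dict[str, float]:
    """Return gene -> log2FC for genes whose adjusted p-value is below the cutoff."""
    return {
        gene: lfc
        for gene, (lfc, padj) in results.items()
        if padj is not None and padj < padj_cutoff
    }


def common_de_genes(
    res_a: DEResults,
    res_b: DEResults,
    padj_cutoff: float = 0.05,
    same_direction: bool = True,
) -> Set[str]:
    """Genes significant in both comparisons, optionally with the same sign of change."""
    sig_a = significant_genes(res_a, padj_cutoff)
    sig_b = significant_genes(res_b, padj_cutoff)
    shared = set(sig_a) & set(sig_b)
    if same_direction:
        # Keep only genes that go up in both comparisons or down in both.
        shared = {g for g in shared if (sig_a[g] > 0) == (sig_b[g] > 0)}
    return shared


# Hypothetical toy tables standing in for A_vs_controlA and B_vs_controlB.
a_vs_control_a: DEResults = {
    "geneX": (2.1, 0.001),
    "geneY": (-1.5, 0.03),
    "geneZ": (0.4, 0.6),
    "geneW": (1.8, None),
}
b_vs_control_b: DEResults = {
    "geneX": (1.7, 0.02),
    "geneY": (1.2, 0.01),
    "geneZ": (0.3, 0.04),
    "geneW": (2.0, 0.001),
}

print(sorted(common_de_genes(a_vs_control_a, b_vs_control_b)))
```

Lowering `padj_cutoff` in this sketch only shrinks both lists before intersecting them; it does not change how the variance of control B is estimated from 2 replicates, which is the part I am unsure about.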