I'm currently analyzing a ChIP-seq dataset with 6 biological replicates for each of two conditions: 6 Condition1 vs 6 Condition2. The experiments were done some time ago in 3 different batches, i.e. on 3 different days. I call peaks separately for every sample (n = 12), using the Input from that sample's batch as the control. Then I use DiffBind 3.0 to derive a common peakset and find differentially bound peaks.
The problem I'm stuck on: when I use all the available replicates, I get very few differentially bound peaks (only 11!). Combining a subset of replicates from different batches increases the number of differentially bound regions, though I'm not sure how much of that increase is due to batch effect.
I would really appreciate your tips on the following:
1) How should I choose which replicates to include in the DiffBind analysis, given that the ChIP-seq fingerprint plots look similar for the majority of replicates?
2) Is it appropriate to use samples from different batches, e.g. 1+2+3 for Condition1 and 1+4+5 for Condition2? Or should I instead use a multi-factor design in DiffBind to account for the batch effect?
Thanks in advance,