I am attempting to use DiffBind to identify differential peaks between groups in my experiment. I have 10 groups with 3 replicates per group, which allows for 45 unique pairwise comparisons. With the help of the DiffBind vignettes and online resources, I have been able to produce functional code that returns a list of differential peaks for each comparison that I have set up.
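For reference, the figure of 45 is just the number of unordered pairs that can be formed from 10 groups. A minimal Python sketch of that count (the group labels are placeholders, not my actual group names):

```python
from itertools import combinations

# Hypothetical group labels; the real experiment has 10 groups of 3 replicates each.
groups = [f"group{i}" for i in range(1, 11)]

# Every unordered pair of distinct groups is one possible contrast.
contrasts = list(combinations(groups, 2))

print(len(contrasts))  # 45
print(contrasts[0])    # ('group1', 'group2')
```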
However, issues arose when I began to visualize the results in IGV. The differential peaks identified by edgeR and DESeq2 were not supported by the individual peak files. Differential peaks were being called at locations where no peak was identified in any of the replicates in the first place, and also at locations where all replicates show peaks with similar scores. I am using the same FASTA file for visualization that was used for alignment and peak calling.
In the hope of solving this issue, I looked closely at just one of my comparisons. I created a sample sheet that contained only the information needed for that comparison (7 .bam files [including input] and 6 .narrowPeak files) and reran the same code without any notable modifications. This reduced the number of differential peaks identified by both edgeR and DESeq2, and it also led to the identification of different differential peaks than when I ran the code with a sample sheet containing all of my samples. When I visualize these results in IGV, they look correct: differential peaks are called at locations where peaks are actually present in one group but not the other, and so on.
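To make the single-comparison setup concrete, here is a rough Python sketch of how a full sample sheet can be cut down to the rows for one pair of groups. This is not my actual code: the sample names and file paths are made up, the helper `subset_sample_sheet` is hypothetical, and I am assuming the group label lives in the `Condition` column of the DiffBind sample sheet.

```python
import csv
import io

def subset_sample_sheet(rows, group_a, group_b, group_col="Condition"):
    """Return only the rows that belong to one of the two groups in a contrast."""
    return [row for row in rows if row[group_col] in (group_a, group_b)]

# Toy sheet with hypothetical sample names and paths: 3 groups x 3 replicates,
# all sharing a single input control.
full_sheet = [
    {"SampleID": f"{g}_{r}", "Condition": g, "Replicate": str(r),
     "bamReads": f"{g}_{r}.bam", "bamControl": "input.bam",
     "Peaks": f"{g}_{r}.narrowPeak"}
    for g in ("A", "B", "C") for r in (1, 2, 3)
]

# Keep only the samples needed for the A vs B comparison.
pair_sheet = subset_sample_sheet(full_sheet, "A", "B")

# Serialize the reduced sheet as CSV text (written to memory here, not to disk).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(full_sheet[0].keys()))
writer.writeheader()
writer.writerows(pair_sheet)
sheet_csv = buffer.getvalue()

print(len(pair_sheet))  # 6
```

The reduced sheet keeps 6 samples plus the shared input, which matches the 7 .bam files and 6 .narrowPeak files described above.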
I would like to better understand why this is occurring and whether it is to be expected. My initial thought was that providing .bam and .narrowPeak files for other groups should not change the output of each contrast, but that does not seem to be the case. Is it the result of a normalization step? Is there something I can do so that the code that runs all 45 comparisons returns the appropriate differential peaks that I obtained when looking at a single comparison? I would like to avoid running all 45 comparisons individually if at all possible.
- A sample sheet containing information for a single contrast yields appropriate differential peaks
- A sample sheet containing information for all contrasts alters the differential peaks identified for each individual contrast such that they appear incorrect upon visualization
I hope this explanation makes sense. If further detail would be helpful, I am more than happy to provide it. Any assistance with this question would be greatly appreciated, as I would really prefer not to run each comparison individually.
All the Best!