I am trying to perform a differential binding analysis of ChIP-seq peak data with DiffBind.
Input reads were mapped against GRCm38.p5 (including alt loci, patches and scaffolds), and peaks were called with MACS2. The alt loci are of particular interest here: the mice used in this study are not BL6 at one particular locus, and that locus happens to have an alt sequence in GRCm38.p5. Potentially many reads therefore map to this alt locus, and binding there may differ between KO and WT.
The MACS2 output is then loaded into DiffBind. The analysis steps are:
Reading in peaksets (dba)
Counting reads (dba.count)
Differential binding affinity analysis (dba.contrast, dba.analyze)
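For reference, the three steps above can be sketched roughly as follows. This is a minimal sketch, not the exact script: the wrapper name run_diffbind, the sample sheet file name "samples.csv" and the use of the Condition column for the contrast are assumptions made for illustration, and dba.contrast is left at its default replicate requirement.

```r
# Sketch only: run_diffbind() and "samples.csv" are hypothetical names.
# The sample sheet is assumed to list the MACS2 peak files and BAM files,
# with a Condition column separating the two groups.
run_diffbind <- function(sample_sheet = "samples.csv") {
  if (!requireNamespace("DiffBind", quietly = TRUE)) {
    stop("The DiffBind package is required for this sketch.")
  }
  # Reading in peaksets
  test <- DiffBind::dba(sampleSheet = sample_sheet)
  # Counting reads in the consensus peaks
  test <- DiffBind::dba.count(test)
  # Setting up the contrast on the Condition column (assumption)
  test <- DiffBind::dba.contrast(test, categories = DiffBind::DBA_CONDITION)
  # Differential binding affinity analysis
  test <- DiffBind::dba.analyze(test)
  # Return the analyzed DBA object
  test
}
```

Calling test <- run_diffbind("samples.csv") would then give the object passed to dba.report.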
After the differential binding analysis, I report the results as a GRanges object (dba.report):
test.DB <- dba.report(test)
test.DB
GRanges object with 40 ranges and 6 metadata columns:
seqnames ranges strand Conc Conc_MAR Conc_WT
81 chr12 [115603698, 115604198] * 5.88 6.61 -2.02
309 chr6 [103648967, 103649467] * 8.3 -1.12 9.62
seqinfo: 19 sequences from an unspecified genome; no seqlengths
40 differentially bound sites were found, but it seems that I lost an additional 19 differentially bound sites because they occur on sequences with names such as "GL456385.1", "JH584299.1", "KQ030495.1", ...
I did check that the sequence names are the same in the BAM and MACS2 output files.
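A minimal sketch of how such a name comparison could be done in base R. The helper missing_seqnames is hypothetical, and the vectors are toy values built from the names quoted in this post, not the real data. With real objects, the inputs would be the sequence names extracted from the peak files and from the report.

```r
# Hypothetical helper: sequence names present in one set but absent from another.
missing_seqnames <- function(before, after) {
  setdiff(unique(before), unique(after))
}

# Toy values only, taken from the names mentioned in this post.
peak_seqs   <- c("chr12", "chr6", "GL456385.1", "JH584299.1", "KQ030495.1")
report_seqs <- c("chr12", "chr6")

lost <- missing_seqnames(peak_seqs, report_seqs)
print(lost)
```

On these toy vectors, the three non-chromosome names are the ones reported as lost.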
How can I keep the DE sites on alt loci / patches / scaffolds in my results?