Hi everyone,
I am analyzing an older ChIP-seq dataset where the DNA was sheared using an ultrasonic bath. Before library preparation, we assessed the DNA fragment size distribution using an Agilent High Sensitivity DNA assay (see the image below). The results indicate that most fragments are centered around 300 bp, with the smallest prominent peak around 230 bp.
However, when I perform peak calling with MACS3 on this dataset, the predicted fragment length (D) reported by MACS3 is around 160 bp (see second image below), and MACS3 appears quite confident about this value.
My first question is: Should the fragment size determined by the Bioanalyzer (Agilent assay) and the fragment length predicted by MACS3 be the same, or at least reasonably similar and not off by a factor of two?
In the same vein, a follow-up question: I have another sample where the fragment length predicted by MACS3 is less clear. In such cases, can the fragment size distribution measured by the Bioanalyzer be used to help determine or set the appropriate fragment size parameter for MACS3 peak calling?
Any insights or references are much appreciated!
Was there any additional size selection done during the library prep (e.g. bead washes)? Since there is a distribution of insert sizes in the library, smaller library fragments tend to cluster more efficiently on the flowcell. This may be reflected in your results from MACS3.
Two things: A) I would check the library on the Bioanalyzer rather than the chromatin, since the library prep might bias the distribution a bit (remember to subtract the adapter length). B) If you have a good indication from the lab of the fragment size, then let MACS3 know about it. Its guessing routines are (at least this is what I remember) relatively crude, so if you have real data on it, use it.
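To make both points concrete, here is a rough Python sketch, not a definitive recipe. It subtracts an assumed combined adapter length from a library peak size and then builds a `macs3 callpeak` command that skips model building (`--nomodel`) and uses the lab-derived value as `--extsize`. The 120 bp adapter length, the 420 bp library peak, and the file and sample names are all placeholders I made up for illustration, so substitute the values for your own kit and data.

```python
import shlex


def insert_size(library_peak_bp, adapter_total_bp=120):
    """Approximate insert size: library peak minus combined adapter length.

    adapter_total_bp=120 is only an assumed placeholder; check your kit.
    """
    size = library_peak_bp - adapter_total_bp
    if size <= 0:
        raise ValueError("adapter length must be smaller than the library peak")
    return size


def macs3_fixed_extsize_cmd(treatment, control, name, extsize, genome="hs"):
    """Build a macs3 callpeak command that uses a fixed fragment size.

    --nomodel turns off MACS3's own fragment length estimate and
    --extsize supplies the length that reads are extended to instead.
    """
    return [
        "macs3", "callpeak",
        "-t", treatment,
        "-c", control,
        "-f", "BAM",
        "-g", genome,
        "-n", name,
        "--nomodel",
        "--extsize", str(extsize),
    ]


if __name__ == "__main__":
    # Hypothetical library peak of 420 bp measured on the final library.
    ext = insert_size(420)
    cmd = macs3_fixed_extsize_cmd("chip.bam", "input.bam", "sample1", ext)
    print(shlex.join(cmd))
```

The script only prints the command rather than running it, so you can inspect it first. For the sample where the model is unclear, it may be worth calling peaks both ways and comparing the results.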
The Bioanalyzer run was only meant to check whether a library preparation would be worthwhile, which is why we did it beforehand. I was expecting some difference due to the biased efficiencies for different fragment sizes, just not something that pronounced. Apparently, I was wrong.
Thanks for your replies. I will check some more samples to see if the discrepancy is at least consistent.