Question

Discrepancy Between Fragment Size Distribution and MACS3-Predicted Fragment Length in ChIP-seq Dataset

3 months ago

christoph.neu ▴ 10

Hi everyone,

I am analyzing an older ChIP-seq dataset where the DNA was sheared using an ultrasonic bath. Before library preparation, we assessed the DNA fragment size distribution using an Agilent High Sensitivity DNA assay (see the image below). The results indicate that most fragments are centered around 300 bp, with the smallest prominent peak around 230 bp.

[Image: Agilent High Sensitivity DNA assay trace]

However, when I perform peak calling with MACS3 on this dataset, the predicted fragment length (D) reported by MACS3 is around 160 bp (see second image below), and MACS3 appears quite confident about this value.

[Image: MACS3 output for predicted fragment size D]

My first question is: Should the fragment size determined by the Bioanalyzer (Agilent assay) and the fragment length predicted by MACS3 be the same, or at least reasonably similar and not off by a factor of two?

In the same vein, a follow-up question: I have another sample where the fragment length predicted by MACS3 is less clear. In such cases, can the fragment size distribution measured by the Bioanalyzer be used to determine or set the fragment size parameter for MACS3 peak calling?

Any insights or references are much appreciated!

macs3 • 770 views

Should the fragment size determined by the Bioanalyzer (Agilent assay) and the fragment length predicted by MACS3 be the same, or at least reasonably similar and not off by a factor of two?

Was any additional size selection done during the library prep (e.g. bead washes)? Since a library contains a distribution of insert sizes, and smaller library fragments tend to cluster more efficiently on the flowcell, the sequenced fragments can skew shorter than the input. This may be reflected in your results from MACS3.

ADD REPLY • link 3 months ago by GenoMax 154k

Two things. A) I would check the library on the Bioanalyzer rather than the chromatin, since the library prep might bias the distribution a bit (remember to subtract the adapter length). B) If you have a good indication from the lab of the fragment size, then let MACS know about it. Its guessing routines are (at least this is what I remember) relatively crude, so if you have real data on it, use it.
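As a rough illustration of both points, here is a minimal Python sketch that subtracts the adapter length from a library trace peak and then builds a `macs3 callpeak` command that skips model building (`--nomodel`) and uses a fixed extension size (`--extsize`). The file names, the sample name, the 420 bp library peak, the 120 bp combined adapter length, and the `hs` genome setting are all placeholder assumptions, not values from this thread; substitute the numbers for your own kit and organism.

```python
import shlex


def insert_size_from_library(library_peak_bp: int, adapter_total_bp: int) -> int:
    """Estimate the insert size by subtracting the combined adapter length
    from the peak size seen in a library trace."""
    insert = library_peak_bp - adapter_total_bp
    if insert <= 0:
        raise ValueError("adapter length must be smaller than the library peak size")
    return insert


def macs3_fixed_extsize_cmd(treatment, control, name, extsize, genome="hs"):
    """Build a macs3 callpeak argument list that bypasses the fragment-size
    model and extends reads to a user-supplied length instead."""
    return [
        "macs3", "callpeak",
        "-t", treatment,
        "-c", control,
        "-f", "BAM",
        "-g", genome,            # assumption: human; change for other organisms
        "-n", name,
        "--nomodel",             # do not estimate d from the data
        "--extsize", str(extsize),
    ]


# Hypothetical numbers: a 420 bp library peak with 120 bp of adapters in total.
extsize = insert_size_from_library(library_peak_bp=420, adapter_total_bp=120)
cmd = macs3_fixed_extsize_cmd("chip.bam", "input.bam", "sample1", extsize)
print(shlex.join(cmd))
```

The command is only printed here, so it can be inspected before it is run, for example with `subprocess.run(cmd, check=True)`.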

ADD REPLY • link 3 months ago by ATpoint 90k

The Bioanalyzer was used to check whether a library preparation would be worth it, which is why we ran it beforehand. I was expecting a bit of a difference due to the biased efficiencies for different fragment sizes, just not something that pronounced. Apparently, I was wrong.

Thanks for your replies. I will check some more samples to see if the discrepancy is at least consistent.
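One simple way to check consistency would be to compare the ratio of the MACS3 estimate to the Bioanalyzer peak across samples. The sketch below is only an illustration: the first entry uses the approximate values from this post (160 bp and 300 bp), the second entry is a made-up placeholder, and the helper names are hypothetical.

```python
def fragment_ratio(macs_d_bp: float, bioanalyzer_peak_bp: float) -> float:
    """Ratio of the MACS3-predicted fragment length to the Bioanalyzer peak."""
    return macs_d_bp / bioanalyzer_peak_bp


def ratio_spread(ratios):
    """Difference between the largest and smallest ratio; a small spread
    suggests the discrepancy is consistent across samples."""
    return max(ratios) - min(ratios)


# (MACS3 d, Bioanalyzer peak) in bp
samples = {
    "sample_from_post": (160, 300),   # approximate values quoted above
    "placeholder_sample": (170, 310), # hypothetical, for illustration only
}

ratios = {name: fragment_ratio(d, peak) for name, (d, peak) in samples.items()}
for name, ratio in ratios.items():
    print(f"{name}: d / Bioanalyzer peak = {ratio:.2f}")
print(f"spread across samples: {ratio_spread(list(ratios.values())):.2f}")
```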

ADD REPLY • link 3 months ago by christoph.neu ▴ 10