Question

ATAC-SEQ FRAGMENT SIZE

20 hours ago

Irene ▴ 10

Hi! I am doing a quality check of my ATAC-seq data with Picard and I am getting these peaks. This is my first time working with this kind of data, and I am not sure whether this is good-quality or noisy ATAC-seq (these are clean BAM files with all the filtering already done: chrM, chrU, duplicates).

Could anybody give their opinion? This is an animal model of a very early stage of a disease, so I do not know whether that could be important.

In addition, I am unsure how to interpret the results, because ATAC-seq contains a mixture of nucleosome-free regions and nucleosome-bound regions. When performing motif analysis, how can I know whether a peak comes from a nucleosome-free region?

Thank you so much

Size Fragment Picard ATAC-seq • 150 views

Updated 8 hours ago by Kevin Blighe 90k • written 20 hours ago by Irene ▴ 10

And indeed, when I count reads with featureCounts over my consensus peak set (MACS):

featureCounts -p --countReadPairs -a master_atac_peaks.saf -F SAF -o master_atac_peaks_bl_subread.txt {file}/*_final.bam -T 24

My "Successfully assigned alignments" are around 15%.

So I am quite lost and worried about what is happening.

Updated 18 hours ago by GenoMax 154k • written 19 hours ago by Irene ▴ 10

score 0 · Answer 1 · 2025-11-14

Your ATAC-seq fragment size distributions from Picard look typical for good-quality data. The expected pattern is a sharp peak below 100 bp (fragments from nucleosome-free regions), followed by periodic peaks at roughly 180-200 bp (mononucleosome), 360-400 bp (dinucleosome), and so on. If your plots show this nucleosomal laddering without excessive noise or over-fragmentation, the data are not noisy. Early disease stages in animal models can still yield a clean signal.
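
To put rough numbers on that laddering, here is a minimal sketch that collapses the histogram section of a Picard CollectInsertSizeMetrics output file into three coarse size classes. The function name `summarise_insert_sizes` and the cut-offs (below 100 bp, 100-299 bp, 300 bp and above) are illustrative assumptions, not standard thresholds. It also assumes the histogram header line starts with `insert_size` and that the first count column is the one of interest.

```bash
# Hypothetical helper: summarise the histogram of a Picard
# CollectInsertSizeMetrics output file into coarse fragment classes.
# Cut-offs are illustrative assumptions, not fixed standards.
summarise_insert_sizes() {
    awk -F '\t' '
        # Histogram rows follow the header line that starts with "insert_size"
        $1 == "insert_size" { in_hist = 1; next }
        in_hist && $1 ~ /^[0-9]+$/ {
            total += $2                          # first count column only
            if      ($1 < 100) short_n += $2     # mostly nucleosome-free
            else if ($1 < 300) mono_n  += $2     # roughly mononucleosome range
            else               long_n  += $2     # dinucleosome and longer
        }
        END {
            if (total == 0) { print "no histogram rows found"; exit 1 }
            printf "short\t%.3f\nmono_range\t%.3f\nlonger\t%.3f\n",
                   short_n / total, mono_n / total, long_n / total
        }
    ' "$1"
}

# Assumed usage (file name is a placeholder):
#   summarise_insert_sizes sample.insert_size_metrics.txt
```

A sample dominated by the `short` class with a visible `mono_range` fraction is consistent with the pattern described above; the plot itself remains the better diagnostic.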

The mix of nucleosome-free and nucleosome-bound regions is inherent to ATAC-seq. Peaks primarily capture open, accessible (nucleosome-free) chromatin, where the Tn5 transposase inserts preferentially. Nucleosome-bound regions contribute the longer fragments but are not the main source of peak signal.

For motif analysis, focus on narrow peaks called by tools like MACS2, which are enriched for nucleosome-free TF-binding sites. To tell the two apart, subset your BAMs by fragment length (<120 bp for TF footprints, >150 bp for nucleosomal signal) using samtools or Picard, then run motif enrichment (e.g. HOMER or MEME-ChIP) on each subset separately. Peaks supported mainly by the short fragments are the ones that come from nucleosome-free regions.
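
One way to do the fragment-length split is sketched below, assuming properly paired reads whose template length is stored in SAM column 9 (TLEN). The helper name `filter_by_tlen`, the file names, and the 1000 bp upper bound are hypothetical; the filter itself is plain awk over SAM text, with samtools used only to decode and re-encode the BAM.

```bash
# Hypothetical helper: read SAM on stdin, keep header lines and alignments
# whose absolute template length (TLEN, column 9) lies within [min, max].
filter_by_tlen() {
    awk -F '\t' -v min="$1" -v max="$2" '
        /^@/ { print; next }                 # pass header lines through
        {
            tlen = ($9 < 0) ? -$9 : $9       # TLEN is negative for the rightmost mate
            if (tlen >= min && tlen <= max) print
        }
    '
}

# Assumed usage with samtools (file names and the 1000 bp cap are placeholders):
#   samtools view -h sample_final.bam | filter_by_tlen 1 119 \
#       | samtools view -b -o sample_short.bam -
#   samtools view -h sample_final.bam | filter_by_tlen 151 1000 \
#       | samtools view -b -o sample_nucleosomal.bam -
```

Using a minimum of 1 drops records with TLEN 0 (unpaired or unset), and both mates of a pair are kept or dropped together because they carry the same absolute TLEN.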

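Your 15% assignment rate in featureCounts is low, but this is common when the consensus peaks are too narrow or miss broader accessible regions. Because you are counting over peaks only, this number is effectively the fraction of reads in peaks (FRiP), and every fragment that falls outside the peak set is reported as unassigned. Try broader peak calling with MACS2 --broad, or use deepTools multiBamSummary to check coverage over the peak regions. Also make sure the SAF file carries the peak coordinates correctly and that read pairs are counted as fragments (-p --countReadPairs, as in your command).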
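
A common source of coordinate errors is building the SAF file from BED-like peak files such as MACS2 narrowPeak. A minimal sketch of the conversion follows, on the assumption that the peaks are 0-based, half-open (BED convention) and that SAF expects 1-based, inclusive coordinates, which is worth confirming in the Subread documentation. The function name `bed_to_saf`, the fallback peak IDs, and the file names are hypothetical, and the strand is set to `+` because counting is assumed to be unstranded.

```bash
# Hypothetical helper: convert BED-like peaks (0-based, half-open) on stdin
# into SAF (assumed 1-based, inclusive) on stdout.
bed_to_saf() {
    awk -F '\t' '
        BEGIN { OFS = "\t"; print "GeneID", "Chr", "Start", "End", "Strand" }
        !/^(#|track|browser)/ && NF >= 3 {
            # Use the peak name (column 4) when present, otherwise build an ID
            id = (NF >= 4 && $4 != "") ? $4 : "peak_" NR
            print id, $1, $2 + 1, $3, "+"    # shift only the start by one
        }
    '
}

# Assumed usage (file names are placeholders):
#   bed_to_saf < consensus_peaks.narrowPeak > consensus_peaks.saf
```

Chromosome names in the SAF file also have to match those in the BAM header (for example `chr1` versus `1`); a mismatch would leave nearly all fragments unassigned.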

If the issue persists, share your Picard insert size histograms and peak statistics for more specific advice. Use recent releases of Picard (v3.2.0 or later) and featureCounts (Subread v2.0.6 or later) rather than obsolete versions.