Question

Deeptools PlotProfile

5 weeks ago

Irene • 0

Hi everyone, I’m running a ChIP-seq analysis and visualizing the data. I have two biological replicates in two conditions, WT and HTT. I merged the BAM files with samtools (samtools merge wt1.bam wt2.bam & samtools merge htt1.bam htt2.bam). Then I ran bamCoverage using RPKM for normalization. After that, I built the count matrix using a BED file with my regions of interest and, finally, generated this plotProfile.

The issue is that each condition starts at a very different baseline on the y-axis (WT above 8 and HTT above 7), whereas in publications I see the signal starts close to y = 0 and all conditions share the same starting value—unlike my case, where one is higher than the other. What could be causing this, and how is it recommended to fix it?

PS: I have the input ChIP-seq files, but I’m not sure whether I should do a bamCompare instead of a bamCoverage—e.g., bamCompare WT WT-input—and then merge the biological replicates afterward. I don’t know if that makes sense, or if the input isn’t necessary for visualization.

Thank you very much!

[Image: plotProfile of the bamCoverage tracks]

computematrix deeptool heatmap chip-seq bamcoverage • 2.3k views

updated 5 weeks ago by ATpoint 89k • written 5 weeks ago by Irene • 0

score 2 · Answer 1 · 2025-08-29

There are probably different levels of signal-to-noise between the samples, which is one of the situations where simple per-million scalings, such as the RPKM normalization implemented in bamCoverage, fail to properly align the baselines. This is why I personally always take the more tedious route: first build a count matrix, run it through edgeR, and then calculate a more accurate size factor to scale the bigWigs by. For code and discussion, see: ATAC-seq sample normalization
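
As a rough illustration of the last step, here is a minimal sketch of how edgeR-style normalization factors could be turned into per-sample scale factors for bamCoverage. It assumes the library sizes (column sums of the count matrix) and normalization factors have already been computed in edgeR. All numbers, sample names and file names below are made up, and the exact recipe in the linked thread may differ.

```python
# Hypothetical inputs for illustration only. In practice these would come
# from edgeR: lib_sizes = column sums of the count matrix, norm_factors =
# the output of calcNormFactors() on that matrix.
lib_sizes = {"WT": 20_000_000, "HTT": 25_000_000}
norm_factors = {"WT": 1.25, "HTT": 0.8}


def reciprocal_scale_factors(lib_sizes, norm_factors):
    """Return one multiplicative scale factor per sample.

    effective library size = library size * normalization factor
    size factor            = effective library size / 1e6
    scale factor           = 1 / size factor
    The reciprocal is taken because bamCoverage multiplies the coverage
    by the value given to --scaleFactor.
    """
    factors = {}
    for sample, lib_size in lib_sizes.items():
        size_factor = lib_size * norm_factors[sample] / 1e6
        factors[sample] = 1.0 / size_factor
    return factors


scale = reciprocal_scale_factors(lib_sizes, norm_factors)

# Print one bamCoverage command per sample (BAM/bigWig names are assumed).
# --normalizeUsing None switches off the built-in RPKM/CPM scaling so that
# only the custom factor is applied.
for sample, factor in scale.items():
    print(
        f"bamCoverage -b {sample}.bam -o {sample}.bw "
        f"--scaleFactor {factor:.6f} --normalizeUsing None"
    )
```

The resulting bigWigs can then go into computeMatrix and plotProfile as before. Whether the baselines line up afterwards depends on how well the normalization factors capture the signal-to-noise difference between the samples.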