Question: ChIP-seq data with low enrichment
13 days ago, srhic (30) wrote:


I am working with ChIP-seq data for a broad epigenetic mark and want to call peaks. Initial QC with deepTools fingerprint plots suggests that my ChIP efficiency was not very good and that there is a lot of background (but I have also read that this may be normal for broad marks?).

My initial peak calling with MACS2 and HOMER, using fairly liberal parameters, yielded far fewer peaks than expected (~5,000 compared to ~20,000 in the literature). My first question is whether it would be OK to proceed with the analysis of these peaks. In other words, is it normal to see such big differences between my data and published results for the same mark? Is there anything else I can try to get more peaks or reduce background?

Alternatively, if this is due to low ChIP efficiency, would it make sense to skip peak calling but still use the data by quantifying the signal at specific regions such as promoters, and maybe do some clustering (both deepTools and HOMER have good tools for this)? I am comparing the occupancy of this mark between two conditions, so I am assuming both would be equally affected by the low efficiency.
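
To make the region-based idea concrete, here is a minimal sketch in plain Python. It is not what deepTools or HOMER do internally; the promoter names, read counts and library sizes are made-up placeholders, and the pseudocount of 1 is an arbitrary assumption. It scales counts per promoter to counts per million (CPM) and takes a log2 ratio between the two conditions.

```python
import math

def cpm(counts, library_size):
    """Scale raw read counts to counts per million mapped reads."""
    return [c * 1e6 / library_size for c in counts]

def log2_ratio(a, b, pseudocount=1.0):
    """Per-region log2(a / b); the pseudocount avoids division by zero."""
    return [math.log2((x + pseudocount) / (y + pseudocount))
            for x, y in zip(a, b)]

# Hypothetical promoters and read counts (made-up numbers, for illustration only).
promoters = ["geneA", "geneB", "geneC"]
cond1_counts = [120, 40, 10]
cond2_counts = [60, 40, 40]
cond1_libsize = 20_000_000  # assumed total mapped reads, condition 1
cond2_libsize = 20_000_000  # assumed total mapped reads, condition 2

cond1_cpm = cpm(cond1_counts, cond1_libsize)
cond2_cpm = cpm(cond2_counts, cond2_libsize)
lfc = log2_ratio(cond1_cpm, cond2_cpm)

for name, value in zip(promoters, lfc):
    print(f"{name}\t{value:+.2f}")
```

Library-size scaling alone only corrects for sequencing depth. It would not correct for a difference in ChIP efficiency between the two conditions, so the "equally affected" assumption still needs to be checked separately.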

Any tips would be appreciated.



Hi, can you post the commands you used for calling the peaks?

written 13 days ago by rpolicastro (3.3k)

Is this a downloaded dataset, and by "literature" do you mean that the paper that published it reported 20k peaks versus the 5k you get, or did you generate the data yourself? In general (in my experience), any number a paper reports is not worth much without the code: by changing low-level processing and thresholding you can get basically any number of peaks from a given dataset. It gets even more variable depending on how "reliable peaks" are assessed between replicates, with methods such as IDR, which again have thresholds etc. that one can change. Without code it is barely reproducible. If you generated the data yourself, you can also get tremendously different results depending on antibody, protocol and/or processing. ChIP-seq is a pain and depends heavily on the circumstances. Comparing with published datasets is very cumbersome.
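
A toy illustration of the thresholding point, in plain Python: the scores below are made-up -log10(q-value) values for candidate regions, not output from MACS2 or HOMER, and the cutoffs are arbitrary. The same set of candidates gives very different peak counts depending only on where the cutoff is placed.

```python
def count_peaks(scores, cutoff):
    """Number of candidate regions whose score meets or exceeds the cutoff."""
    return sum(1 for s in scores if s >= cutoff)

# Hypothetical -log10(q-value) scores for candidate regions (made-up numbers).
scores = [0.5, 1.1, 1.4, 2.0, 2.2, 3.5, 5.0, 8.0]

# Arbitrary cutoffs, chosen only to show how the count changes.
peaks_by_cutoff = {c: count_peaks(scores, c) for c in (1.0, 1.3, 2.0, 5.0)}

for cutoff, n in peaks_by_cutoff.items():
    print(f"cutoff {cutoff}: {n} peaks")
```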

modified 13 days ago • written 13 days ago by ATpoint (44k)

