Question: What is the best way to plot distribution of peak heights?
0
gravatar for 1888
3.4 years ago by
188860
United States
188860 wrote:

I am sure this is been dealt before by Chip Seq experts... I was wondering what methods, preferably in R, are available to plot the distribution of the heights of Chip Seq peaks mapped on the genome. I see that there are several packages such as ChIPseeker and deepTools that will plot either heat maps or frequencies weighted by the height of the peaks, I guess this is done because reads are not uniformly distributed across the genome but why not plot directly the mean distribution of heights normalised by the total number of peaks in a histogram for example? What are the issues here that I am not seeing? 

Thank you very much for your insight

annotation chip-seq peaks ngs R • 1.9k views
ADD COMMENTlink modified 3.4 years ago • written 3.4 years ago by 188860
0
gravatar for Devon Ryan
3.4 years ago by
Devon Ryan88k
Freiburg, Germany
Devon Ryan88k wrote:

Just use the hist() function. That's why we don't bother implementing something like this in deepTools.

ADD COMMENTlink written 3.4 years ago by Devon Ryan88k
0
gravatar for 1888
3.4 years ago by
188860
United States
188860 wrote:

Thank you for your reply Devon! You don't mean to use hist() directly right? I am not trying to map the frequencies but just to control for it, what I am trying to get is a plot of the value (y-axis) with respect to distance from start site (x-axis). I must have to take account of the number of peaks otherwise the signal would be driven by only for example one peak in some regions and many in others, binning could work but it seems hard to find the correct number of bins since it varies a lot. Thanks

ADD COMMENTlink written 3.4 years ago by 188860

With deepTools profile tool you can plot the standard error as well as the mean if this is what you want. But, this is a different question than plotting a histogram of peak coverage. Furthermore, peak height varies depending on genome biases as well as for biological reasons, thus, a normalized 'peak height' based on the 'input' sample is desirable. If you have, for example, the MACS2 peak output, you can easily in R call the hist() function as suggested by Devon on the fold change column. 

ADD REPLYlink written 3.4 years ago by Fidel1.9k

Hi Fidel, thank you for your reply! Yes what I would like is not a histogram of peak coverage, but it is the mean peak height normalized by the coverage. Maybe I can simply do this in R by binning the distance from transcription start site, finding the mean height for each bins, and plotting that as peak height corrected by the coverage. Is this wrong? How else would you see how the peak height varies depending on position on the genome (i.e. near or far from transcription start sites)? Thanks again for your input!

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by 188860
1

If I understand you correctly, what you can do is use closestBed from bedTools. This program will give you the distance between peak and TSS. Then, you can plot peak fold change (or some of the other normalized values from MACS2) vs. distance.

 

ADD REPLYlink written 3.4 years ago by Fidel1.9k

Hi, thank you for the help, I really appreciate it as I am a bit lost here. I'm actually already using closestBed to map the peaks in regards to the TSS, but I now have many peaks piling up very close to the TSS and few further away, so my question is how do I normalize the output from MACS2 in regards to distance from TSS? I am asking this because when I tried binning this distance from TSS and taking the mean of the MACS2 values for each bin, I got a plot that just has one huge spike close to the TSS no matter how small the bins were (I have it down to 1 bp), so it seems like the wrong approach...

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by 188860
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2063 users visited in the last hour