Exon Center signal maps (metagene plots) using deepTools
7.3 years ago

Hello everyone,

I am trying to create a metagene plot of Pol II signal centered on exons using deepTools. I have a BED file with my regions (exons) and a bigWig (and BAM) file of Pol II scores. computeMatrix in deepTools has two modes, reference-point and scale-regions, and I am unsure which to use and why. I am currently using reference-point with the center as the reference, since that is what I am after, but exons differ in length, so I assume they need to be normalized somehow.

What parameters should I use, and why exactly?

PS: I am not entirely sure what the colorbars in the generated heatmaps represent. Is it how strong the signal of a TF is?

ChIP-Seq metagene deepTools
2
7.3 years ago

Just use scale-regions. Regarding the coloring, yes, it represents the signal intensity (in this case, coverage).
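
As a rough sketch of what that could look like for exons: scale-regions stretches or shrinks every exon to one common body length, so exons of different sizes line up bin by bin. The file names and the 500 bp flanks below are placeholders I made up, not values from this thread; the flags are the same long-form computeMatrix options used elsewhere in this thread.

```shell
# Hypothetical file names and flank size; replace with your own.
regions="exons.bed"
signal="PolII.bigWig"
matrix="PolII_exons_scaled.mat.gz"
flank=500

# Wrap the call in a function so it is only run when explicitly requested.
make_scaled_matrix() {
    computeMatrix scale-regions \
        --regionsFileName "$regions" \
        --scoreFileName "$signal" \
        --regionBodyLength 1000 \
        --beforeRegionStartLength "$flank" \
        --afterRegionStartLength "$flank" \
        --skipZeros \
        --outFileName "$matrix"
}

# Run only if deepTools is installed and both inputs exist.
if command -v computeMatrix >/dev/null 2>&1 && [ -f "$regions" ] && [ -f "$signal" ]; then
    make_scaled_matrix
else
    echo "computeMatrix or the input files were not found; nothing was run"
fi
```

The resulting matrix file is what the heatmap and profile tools take as input.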

0

If I use scale-regions, should I leave --regionBodyLength at its default? Exons aren't particularly large, but I can't find enough information to make an informed choice here.

0

The default is normally fine; at the end of the day it's just a scaling factor.
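
To make the "scaling factor" point a bit more concrete, here is a back-of-the-envelope sketch. It assumes a body length of 1000 and a bin size of 10, which to my knowledge are the documented computeMatrix defaults in current deepTools; the two exon lengths are invented for illustration.

```shell
# Assumed values: --regionBodyLength and --binSize defaults (check your version).
body_length=1000
bin_size=10

# Every exon body ends up in this many bins, whatever its real length.
body_bins=$(( body_length / bin_size ))
echo "bins per scaled exon: $body_bins"

# Hypothetical exon lengths: approximate number of bases behind each bin.
for exon_length in 200 3000; do
    echo "$exon_length bp exon: about $(( exon_length / body_bins )) bp per bin"
done
```

Changing --regionBodyLength only changes how many bins the body is drawn with, not which exons are included.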

0

Is it possible to plot two transcription factors in one plot, or would I have to make an individual plot for each?

1

Yes, though you'd need to use the release-1.6 branch (this hasn't been merged into the master branch yet, but it will be within the next couple of weeks).

0

I had another question @Devon. I went ahead and sorted my heatmap for Pol II; is there any way to base another heatmap on the first? For example, I'm interested in generating a heatmap for KAP1, but I want to compare the two side by side, which means the exons must appear in the same row order in the Pol II heatmap and the KAP1 heatmap. Is this done by default? Or would I use something like --sortUsing region_length as a parameter for heatmapper? Thanks in advance.

2

Two options:

• Run heatmapper to generate the plot for Pol II with the option that saves a bed file of the regions in the order in which they are plotted (--outFileSortedRegions). This bed file can then be used with computeMatrix for KAP1 (be sure to turn off sorting to get exactly the same order). By default, heatmapper sorts by the average value over the exon, in decreasing order. Sorting by region length is another way to ensure that both heatmaps have the same order.
• Use computeMatrix from the development branch (called release-1.6) that can accept multiple bigwig files as input which can later be displayed next to each other, keeping the same order per row.
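
The first option could be sketched roughly as follows. All file names are placeholders I made up. The heatmap tool is called plotHeatmap in current deepTools and heatmapper in the 1.x releases discussed here, so the name is kept in a variable; the exact flag spellings may differ slightly between versions.

```shell
# Hypothetical names; adjust to your files and deepTools version.
heatmap_tool="plotHeatmap"          # "heatmapper" in deepTools 1.x
sorted_bed="exons_sorted_by_PolII.bed"

# Step 1: plot Pol II and save the exons in the order they were drawn.
plot_reference() {
    "$heatmap_tool" --matrixFile PolII.mat.gz \
        --outFileName PolII_heatmap.png \
        --outFileSortedRegions "$sorted_bed"
}

# Step 2: compute the KAP1 matrix over the saved, already ordered exons.
matrix_second() {
    computeMatrix scale-regions \
        --regionsFileName "$sorted_bed" \
        --scoreFileName KAP1.bigWig \
        --outFileName KAP1.mat.gz
}

# Step 3: plot KAP1 with sorting switched off so the row order is kept.
plot_second() {
    "$heatmap_tool" --matrixFile KAP1.mat.gz \
        --outFileName KAP1_heatmap.png \
        --sortRegions no
}

if command -v computeMatrix >/dev/null 2>&1 && [ -f PolII.mat.gz ] && [ -f KAP1.bigWig ]; then
    plot_reference && matrix_second && plot_second
else
    echo "deepTools or the input files were not found; nothing was run"
fi
```

With the second option, both bigWig files would instead be passed to a single computeMatrix call, so no intermediate bed file is needed.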
0

Thank you very much for taking the time to answer this. Appreciate it.

I have one final question: I get a ton of black regions in one of my heatmaps, which I believe are missing data points. Is this fairly normal? (This isn't for exons, but for proposed enhancer regions.) I've gone through your gallery of examples, and not a single figure there has as many black areas as the ones I have generated. I am thinking it might be because my bigWig files were generated with the UCSC bedGraphToBigWig utility and not with deepTools' bamCoverage.

Just in case it's of use: I am taking a bed file of proposed enhancer regions, generated with bedtools by overlapping multiple histone marks (and the absence of others), and then using an H3K4me1 score file (bigWig) that was generated with the bedGraphToBigWig utility from a MACS2 bedGraph file. I just want to make sure that my proposed enhancer regions are enriched for H3K4me1 and depleted of H3K4me3.

The command I have been using has been as follows:

computeMatrix scale-regions \
--regionsFileName /Users/Carlos/Dropbox/000temp/Active-Enhancers-Intergenic.bed \
--scoreFileName /Users/Carlos/Dropbox/ChIP-Seq/hg19/old_data/Sample_H3K1/Tracks/H3K1.normalized.bigWig \
--regionBodyLength 1000 \
--startLabel "EnhSS" \
--endLabel "EnhES" \
--beforeRegionStartLength 8000 \
--afterRegionStartLength 8000 \
--numberOfProcessors max \
--skipZeros \
--outFileName Enhancers-Active-Intergenic-H3K4me1-computeMatrix-scaledregions-8k


The --beforeRegionStartLength and --afterRegionStartLength values are arbitrary numbers that I picked based on some literature I've read. I first assumed they might be the reason for so many black regions, but after lowering them to 1k on each side, the heatmap looks more or less the same.

PS: Sorry for the many possibly uneducated questions. I literally had to learn the terminal in a few days since I'm primarily a wet-lab scientist.

Thank you!

1

I actually don't know the answer to this one. I'll have Fidel reply since he would be the person who definitely knows.
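
In the meantime, one thing that might be worth checking (a guess, not a confirmed diagnosis): in current deepTools, bins with no value in the bigWig are treated as missing data and are drawn in black by default, and computeMatrix has a --missingDataAsZero flag that treats them as zeros instead. The sketch below assumes a current release; the function name and the file names in the usage line are hypothetical.

```shell
flank=1000   # the reduced flank size mentioned above

# Arguments: regions bed, score bigWig, output matrix.
# Same kind of call as before, with missing data counted as zero.
matrix_with_zeros() {
    computeMatrix scale-regions \
        --regionsFileName "$1" \
        --scoreFileName "$2" \
        --regionBodyLength 1000 \
        --beforeRegionStartLength "$flank" \
        --afterRegionStartLength "$flank" \
        --missingDataAsZero \
        --outFileName "$3"
}

# Example call (hypothetical file names), run only if deepTools is available.
if command -v computeMatrix >/dev/null 2>&1 && [ -f enhancers.bed ] && [ -f H3K4me1.bigWig ]; then
    matrix_with_zeros enhancers.bed H3K4me1.bigWig H3K4me1_enhancers.mat.gz
else
    echo "computeMatrix or the input files were not found; nothing was run"
fi
```

If the black areas disappear with this flag, they were stretches without any value in the bigWig rather than a plotting problem. The heatmap tool also has a --missingDataColor option if only the color needs changing.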