Unusual pattern in heatmap from ChIP-seq
7 months ago
Marco Pannone ▴ 790

Hello!

I am plotting heatmaps for a histone mark from a ChIP-seq experiment, using computeMatrix and plotHeatmap from deepTools.

The regions passed to computeMatrix are all coding regions of the mouse genome, sorted by length in descending order.

While I do expect enrichment at the TSS (I am plotting H3K4me3 signal here), I also noticed an unusual "V"-shaped pattern. I wonder whether this is normal or whether there is an issue I am not aware of.

Heatmaps for H3K4me3

This is my command line for computeMatrix:

computeMatrix reference-point --referencePoint TSS -p 6 -S path_to/sham_h3k4me3.bw path_to/contra_h3k4me3.bw path_to/ipsi_h3k4me3.bw -R grcm39_chipseq/coding_genes_coordinates/coding_genes_coordinates_mm39.txt -b 5000 -a 15000 --skipZeros --sortRegions keep -o path_to/gene_coding_regions_h3k4me3_matrix.gz

This is my command line for plotHeatmap:

plotHeatmap -m path_to/gene_coding_regions_h3k4me3_matrix.gz -o path_to/h3k4me3_gene_coding_heatmap.pdf --sortRegions no --colorList white,red --heatmapWidth 8 --zMin 0 --zMax 60 --heatmapHeight 40 --outFileSortedRegions path_to/h3k4me3_gene_coding_heatmap.bed --samplesLabel "Sham H3K4me3" "Contra H3K4me3" "Ipsi H3K4me3" -z "Coding genes - H3K4me3" --whatToShow 'heatmap and colorbar'

Thanks in advance!

chip-seq heatmap deeptools • 1.2k views

Does your region file have strand information?


Yes, it has strand information (+ or -).


Maybe share a few lines of the region file?


Here are the top 10 lines of the region file:

#chrom  chromStart  chromEnd    name    strand  length
chr7    129764180   132725079   ENSMUST00000124096.8    -   2960899
chrX    81992508    84249747    ENSMUST00000114000.8    +   2257239
chr6    45036994    47281147    ENSMUST00000114641.8    +   2244153
chr16   39804732    41996658    ENSMUST00000187695.7    +   2191926
chr4    76057618    78130132    ENSMUST00000107287.9    -   2072514
chr2    40485259    42543636    ENSMUST00000052550.13   -   2058377
chr2    140237228   142234886   ENSMUST00000110067.8    +   1997658
chr2    140237275   142234886   ENSMUST00000110064.8    +   1997611
chr2    140237354   142226308   ENSMUST00000078027.12   +   1988954

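I believe deepTools expects the BED6 column order: chrom, start, end, name, **score**, strand. In your file, strand is in column 5, where the score would be expected.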
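A minimal sketch of reordering the columns shown above into BED6, assuming the file is whitespace-separated with the header and column order posted in this thread. The function name `to_bed6` and the placeholder score of `0` are illustrative assumptions, not anything deepTools requires by name.

```python
def to_bed6(lines):
    """Reorder (chrom, start, end, name, strand, length) rows into BED6.

    Assumes whitespace-separated input; header lines starting with '#'
    are skipped. The score column is filled with a placeholder "0".
    """
    bed6 = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end, name, strand, _length = line.split()
        bed6.append("\t".join([chrom, start, end, name, "0", strand]))
    return bed6


# Two rows copied from the region file posted above.
example_rows = [
    "#chrom  chromStart  chromEnd    name    strand  length",
    "chr7    129764180   132725079   ENSMUST00000124096.8    -   2960899",
    "chrX    81992508    84249747    ENSMUST00000114000.8    +   2257239",
]

for row in to_bed6(example_rows):
    print(row)
```

With strand in column 6, the file matches the chrom, start, end, name, score, strand layout mentioned above.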


Also, I believe you can have deepTools sort by region length for you, with the benefit of possibly adding a line marking the region end. It probably doesn't make sense for K4me3, but just in case it's useful.


Thanks for your replies, I will give it a try. However, regarding the columns computeMatrix expects in the regions file, I think it only matters that the first 3 columns are chrom, start, and end, as in a regular BED file. All the other columns should not be taken into account.


Yes, that is true: at minimum it only needs chrom, start, and end. However, it can also take additional columns into account when they are provided in a standard format (BED6 and BED12 is how they refer to it, I believe).

This is important if you want to line up TSS since you need to take into account strand. + strand items will have TSS in the start column, while - strand items will have TSS in the end column.

You can either do this manually by narrowing regions to their actual TSS, or use deepTools' functionality, which will automatically take strand into account if you provide strand information. I believe that will solve your problem where you see the promoter signal at both TSS and TES. You can also provide a GTF file to deepTools.
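The manual route can be sketched roughly as follows. This assumes BED-style coordinates (0-based start, exclusive end); the function name `tss_interval` is hypothetical and only illustrates the strand rule described above.

```python
def tss_interval(chrom, start, end, strand):
    """Return a 1-bp interval at the TSS of a stranded region.

    Assumes BED-style coordinates (0-based start, exclusive end):
    on '+' the TSS is the first base, on '-' it is the last base.
    """
    if strand == "+":
        return chrom, start, start + 1
    if strand == "-":
        return chrom, end - 1, end
    raise ValueError(f"unexpected strand: {strand!r}")


# Coordinates copied from the first two rows of the region file above.
print(tss_interval("chr7", 129764180, 132725079, "-"))
print(tss_interval("chrX", 81992508, 84249747, "+"))
```

For the minus-strand transcript the reference point lands at the chromEnd side, which is the case that goes wrong when strand is ignored.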


Alright, thanks! I will definitely give it a try :)


Can you share the code on how these regions were obtained and sorted?


I have obtained the coding regions for mm39 through UCSC Table Browser. Then, I simply sorted the regions myself based on the length of each of them (end coordinate - start coordinate).
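A minimal sketch of that kind of length sort, assuming each region is held as a (chrom, start, end, name, strand) tuple. The function name `sort_by_length` is illustrative and this is not necessarily the exact procedure used for the posted file.

```python
def sort_by_length(regions):
    """Sort regions by (end - start), longest first.

    Each region is a (chrom, start, end, name, strand) tuple.
    """
    return sorted(regions, key=lambda r: r[2] - r[1], reverse=True)


# Three rows taken from the region file posted earlier, in shuffled order.
regions = [
    ("chr2", 140237354, 142226308, "ENSMUST00000078027.12", "+"),
    ("chr7", 129764180, 132725079, "ENSMUST00000124096.8", "-"),
    ("chrX", 81992508, 84249747, "ENSMUST00000114000.8", "+"),
]

for chrom, start, end, name, strand in sort_by_length(regions):
    print(chrom, start, end, name, strand, end - start)
```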


Coding regions means what exactly? Exons, or cDNA, the latter without introns?


The regions have been obtained by selecting the following fields on UCSC Table Browser:

Clade: Mammal
Genome: Mouse
Assembly: mm39
Group: Genes and Gene Predictions
Track: GENCODE VM32
Table: knownGene