Constructing A Heatmap Of "Distance Of Binding Region Relative To TSS"
11.9 years ago
Dataminer ★ 2.8k

Hi!

I have a ChIP-seq profile for a transcription factor. I want to construct a heatmap in which I can view the distance between peaks and TSSs. What I have done so far: annotated the peaks (genomic regions bound by the TF) to the nearest TSS. This means that for each peak I have the coordinate of its nearest TSS.

I need guidance in constructing a heatmap for my TF using the coordinates of the nearest TSS and the coordinates in my peak file or the raw .BED file (from which the peaks were called).
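As a point of reference, the quantity under discussion can be sketched in a few lines of Python. This is only a minimal sketch and not part of the original post: the function name `signed_distance`, the use of the peak midpoint, and the sign convention (negative means upstream of the TSS, taking the gene's strand into account) are assumptions chosen for illustration. Peak coordinates are assumed to be BED-style (0-based, half-open).

```python
def signed_distance(peak_start, peak_end, tss, strand="+"):
    """Distance in bp from a TSS to a peak midpoint.

    Assumed convention: negative values are upstream of the TSS and
    positive values are downstream, relative to the gene's strand.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    midpoint = (peak_start + peak_end) // 2  # BED-style start/end assumed
    offset = midpoint - tss
    return offset if strand == "+" else -offset


# Hypothetical peaks annotated with their nearest TSS:
# (peak_start, peak_end, tss, strand)
annotated_peaks = [
    (1000, 1200, 1500, "+"),
    (1000, 1200, 1500, "-"),
    (5000, 5400, 5200, "+"),
]
distances = [signed_distance(*peak) for peak in annotated_peaks]
```

One list of such distances per TF is the natural input for any of the binning or plotting approaches suggested in the answers.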

A small example script in Python or R would be welcome.

Thank you for your time.

Best

chip-seq next-gen • 7.2k views

How do you plan to get the extra dimension for heatmap? Wouldn't it just be a histogram chunked by distance?


Hi brent, I was expecting your comment. Actually, I have seen a heatmap depicting this in a few articles, for example "Examination of transcriptional network reveals an important role for TCFAP2C, SMARCA4, and EOMES in trophoblast stem cell maintenance" by Benjamin L. Kidder. I am very curious to know how they did it. In any case, please tell me what you think is the best way to show this and how it can be done. You can also look at this link: http://genome.cshlp.org/content/21/2/245/F2.expansion.html Thank you


Which figure exactly? And what are the axes?


Please have a look at this figure: http://genome.cshlp.org/content/21/2/245/F2.expansion.html Here they have used multiple TFs.

11.9 years ago
Duff ▴ 670

Hi Dataminer

I recently created a similar figure (not quite the same but the code should be adaptable I think) after using clover for TFBS enrichment analysis for a group of regulated genes. My code uses ggplot2 in R and plots each interaction (TF (y axis) - gene (x axis)) as a square with the number of hits for that TF in the promoter of the gene shown by the colour of the square.

You should be able to adapt the code to show distance from TSS for each gene pretty easily (just put different numbers in the relevant column which was Hits in my data)

The plot:

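library(ggplot2)

# tfHits: data frame with columns TF, symbol (gene) and Hits
# Note: opts()/theme_blank()/theme_text() were removed from ggplot2; theme()/element_blank()/element_text() replace them
p3 <- ggplot() + geom_point(data = tfHits, aes(symbol, TF, colour = Hits), shape = 15, size = 4)

p3 <- p3 + scale_colour_gradient(low = "cornflowerblue", high = "firebrick") + theme(panel.background = element_blank(), legend.position = "right", axis.title.x = element_blank(), axis.title.y = element_blank(), axis.text.x = element_text(angle = 90, hjust = 1, size = 6), axis.text.y = element_text(colour = "black"), axis.ticks = element_blank())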

The data: a data frame with 3 columns, with the TF in the first, the gene (symbol) in the second and the value to plot in the third (Hits in my data, distance to TSS in yours). I would show an excerpt of my data but I can't work out how to get a 'table' into the text here - hey ho.

You can do a similar plot in ggplot2 with the 'tiles' geom:

p2 <- ggplot(tfHits, aes(TF, symbol)) + geom_tile(aes(fill = Hits))

# diverging colour scale centred on 20 hits; adjust midpoint to suit your data
p2 <- p2 + scale_fill_gradient2(name = 'Hits', low = "#0571B0", mid = "#F7F7F7", high = "#CA0020", midpoint = 20)

p2 <- p2 + labs(x = "TF", y = "Gene") + theme(axis.ticks = element_blank(), axis.text.x = element_text(size = 10, angle = 90, hjust = 1, colour = "grey25"), axis.text.y = element_text(size = 5, colour = 'gray25'))

Personally I prefer the squares. Of course this won't do any kind of clustering - I don't know if that's important to you but you could reorder the dataframe passed in by some dendrogram order etc etc.

HTH

duff


Thank you, I will try to adapt the script. :)

11.9 years ago

You might consider using the ade4 package to plot something like this, in R:

library(ade4)
example(table.value)

[Figure: output of example(table.value)]

I have used it and customized it to add a heatmap feature to the 'size proportional to value' feature, so let me know if you would like me to share part of this code with you:

[Figure: customized table.value plot with heatmap colouring]

11.9 years ago

Hey, to help you a bit on that.

Step1 -> Sort your list (distance of each peak from the TSS) into small bins, maybe 100-200 bp wide, and count how many peaks fall in each bin.

Step2 -> Make a data frame with the protein names in the first column; the following columns hold the counts for each distance-from-TSS bin you made.

Step3 -> Plot a fancier heatmap, for example using ggplot2 and geom_tile.
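Steps 1 and 2 could look roughly like this in Python, using only the standard library. This is a sketch under stated assumptions rather than the code behind the answer: the function name `bin_distances`, the example TF names, the 500 bp bin width and the 1 kb window are all hypothetical, and distances are assumed to be signed (negative = upstream of the TSS).

```python
def bin_distances(distances_by_tf, bin_width=200, window=1000):
    """Count peak-to-TSS distances per TF in fixed-width bins.

    Bins are half-open intervals [start, start + bin_width) that together
    cover [-window, window). Distances outside the window are ignored.
    Returns (bin_starts, table), where table maps each TF name to a list
    of counts, one per bin.
    """
    if (2 * window) % bin_width != 0:
        raise ValueError("2 * window must be a multiple of bin_width")
    n_bins = (2 * window) // bin_width
    bin_starts = [-window + i * bin_width for i in range(n_bins)]
    table = {}
    for tf, distances in distances_by_tf.items():
        counts = [0] * n_bins
        for d in distances:
            if -window <= d < window:
                counts[(d + window) // bin_width] += 1
        table[tf] = counts
    return bin_starts, table


# Hypothetical signed distances (bp) for two TFs
example = {
    "TF_A": [-950, -120, -80, 15, 40, 450],
    "TF_B": [-30, 10, 1200, 700],
}
bin_starts, table = bin_distances(example, bin_width=500, window=1000)
```

Each entry of `table` corresponds to one row of the data frame described in Step 2, with `bin_starts` supplying the column labels, so it can be written out as a tab-separated file and read into R for the plotting step.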

I am putting some custom code here to proceed from the point where you have the data frame.

# Generate an example data frame: one row per protein, one column per distance bin
df=data.frame(c(paste('A',seq(1,20,by=1))),seq(1,200,by=10),seq(1,100,by=5),seq(1,60,by=3),seq(1,40,by=2))

# load some libraries
library('ggplot2')
library('reshape2')  # provides melt()
library('plyr')      # provides ddply()
library('scales')    # provides rescale()

# rename columns
colnames(df)[1]='Proteins'
colnames(df)[2]='1KB'
colnames(df)[3]='2KB'
colnames(df)[4]='-1KB'
colnames(df)[5]='-2KB'
# melt the df to long format (one row per protein/bin pair)
df.m=melt(df,id.vars='Proteins')

# add a rescale column which gives intensity ratios to the distance column on the basis of your min and max value
df.m=ddply(df.m,.(variable),transform,rescale=rescale(value))

# finally plot
ggplot(df.m,aes(variable,Proteins))+geom_tile(aes(fill=rescale),colour='white')

Result

[Figure: heatmap produced by the code above]

You can extend it and sort the axis as well.

Have fun

Sukhi

11.9 years ago
Vikas Bansal ★ 2.4k

Hi, I think, as Brent suggested, you can draw a histogram. But after looking at the image from Kidder's paper, I got an idea. Following that image, you can make a data frame with, say, 5 TFs as columns and 3 rows, which are going to be 6 kb upstream, on the TSS and 2.5 kb downstream (you can change this according to your preference), or make, say, 10 rows: 1 kb upstream, 2 kb upstream and so on. Now just put the frequencies in that data frame. E.g. if there are 100 regions at a distance of up to 1 kb upstream for TF A, then put this value (100) in the cell for TF A vs 1 kb upstream. So now you have a data frame with frequencies for each TF, and you can use the heatmap function in R to make a heatmap (change colours, size etc. according to your preference).
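A rough Python sketch of this frequency table is given below, with an optional plotting helper. Everything named here is an assumption made for illustration: the bin edges in `EDGES` are one possible reading of the "6 kb upstream, on TSS, 2.5 kb downstream" rows, the labels and TF names are hypothetical, and the plotting function assumes matplotlib is installed (it is only imported when that function is called).

```python
from bisect import bisect_right

# Hypothetical bin edges in bp relative to the TSS (negative = upstream).
# Bins are half-open: [-6000, -1000), [-1000, 1000), [1000, 2500).
EDGES = [-6000, -1000, 1000, 2500]
LABELS = ["6 kb to 1 kb upstream", "within 1 kb of TSS", "1 kb to 2.5 kb downstream"]


def frequency_table(distances_by_tf, edges=EDGES):
    """Count, for each TF, how many distances fall into each bin."""
    table = {}
    for tf, distances in distances_by_tf.items():
        counts = [0] * (len(edges) - 1)
        for d in distances:
            i = bisect_right(edges, d) - 1  # index of the bin containing d
            if 0 <= i < len(counts):        # skip distances outside all bins
                counts[i] += 1
        table[tf] = counts
    return table


def plot_heatmap(table, labels=LABELS):
    """Draw bins as rows and TFs as columns (assumes matplotlib is available)."""
    import matplotlib.pyplot as plt

    tfs = list(table)
    matrix = [[table[tf][i] for tf in tfs] for i in range(len(labels))]
    fig, ax = plt.subplots()
    image = ax.imshow(matrix, aspect="auto", cmap="Reds")
    ax.set_xticks(range(len(tfs)))
    ax.set_xticklabels(tfs)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    fig.colorbar(image, ax=ax, label="Number of peaks")
    plt.show()


# Hypothetical signed distances (bp) for two TFs
example = {
    "TF_A": [-5000, -200, 0, 300, 2000],
    "TF_B": [-7000, 900, 1500, 2499, 2500],
}
freq = frequency_table(example)
```

Calling `plot_heatmap(freq)` would then draw the table. If the TFs have very different numbers of peaks, dividing each TF's counts by its total before plotting may make the columns easier to compare.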

Sorry for not providing the script as I am not in the lab otherwise I would have tried.
