heatmap for HTSeq data in R
3.5 years ago
Rob ▴ 170

Hi friends

I am trying to plot a heatmap of significant genes, using cutoffs of FDR < 0.05 and an absolute fold change of 2.

I have two questions:

  1. I want the color range to run from -3 to 3, since those are the minimum and maximum values in my file. I don't know why it is changed to -4 to 4. How can I set this?
  2. How can I cluster the columns within each group, rather than all together? I mean the blue group clustered on its own and the yellow group clustered on its own.

Please help me with this.

My data file looks like this:

TCGA-DV-A4W0    TCGA-B0-5812    TCGA-CZ-4858    TCGA-CZ-5985    TCGA-B0-5706
ENSG00000136546 0.181830391 0.694867529 0.215741062 -1.185141681    -1.237408974
ENSG00000115507 -0.361309104    1.684467761 1.044270623 0.157551622 0.004584822
ENSG00000163207 -0.147714241    -0.147823434    -0.149200113    -0.150367088    -0.14789331
ENSG00000123560 -0.34272044 -0.687421791    0.05798495  -0.876653053    -1.48726563
ENSG00000178462 -1.063056021    0.040025267 -1.35758621 -1.075000431    -1.090410434
ENSG00000071991 -0.174786987    -0.174956939    -0.169668498    -0.179650013    -0.179436852
ENSG00000177354 -0.33433836 -0.346731513    -0.555600129    -0.928440339    0.259568992
ENSG00000081051 1.218425978 -0.058534672    -1.104272404    -0.046479571    -0.81553282
ENSG00000178919 -0.170636704    -0.184535399    -0.147903288    -0.185249758    -0.184555276
ENSG00000144227 -0.339595559    0.721073074 -0.557406478    -0.937305769    -0.35982469
ENSG00000186910 -0.405019206    -0.423136465    0.976417264 -1.308551699    -0.435060224
ENSG00000043355 1.234927266 0.663318914 2.980944167 -0.092653328    -1.004937028
ENSG00000164756 -0.131731478    -0.13173574 -0.131788961    -0.13147134 -0.131738464
ENSG00000140557 1.608839856 -0.206466577    -0.520509236    -1.871781298    -1.06273877
ENSG00000163825 -0.759650728    0.149011499 -0.194730763    0.087241581 -0.791364476
ENSG00000131771 -0.550962856    0.299444596 -0.816868438    -1.224446647    -1.311569538
ENSG00000148965 -1.029686968    -1.050990295    -1.410026967    2.414698793 -1.065010944
ENSG00000138675 0.564811739 -1.33506518 -0.834581288    -1.630498742    -1.340582556
ENSG00000129654 -0.423541548    0.663408733 -1.563152043    -1.719406198    -1.444017051

This is my heatmap:

https://ibb.co/0FzL8DZ

This is the code I used:

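library(ComplexHeatmap)  # Heatmap(), draw(); also attaches grid for unit() and gpar()
library(circlize)        # colorRamp2()

# expression values: genes in rows, samples in columns
data <- read.table("mydata.txt", sep = '\t',
                   header = TRUE, row.names = 1, stringsAsFactors = FALSE, check.names = FALSE)
mat <- as.matrix(data)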

metadata <- read.table("metadata.txt", sep = '\t',
                       header = TRUE, row.names = 1, stringsAsFactors = FALSE)


sig_genes <- read.table("siggenes.txt", sep = '\t',
                        header = FALSE, stringsAsFactors = FALSE)[,1]


myCol <- colorRampPalette(c('dodgerblue', 'black', 'yellow'))(100)
myBreaks <- seq(-3, 3, length.out = 100) ## I set this to -3 to 3, but it did not help: the color range in the resulting heatmap is still -4 to 4


pamClusters <- cluster::pam(mat, k = 2)
pamClusters$clustering <- paste0('Cluster ', pamClusters$clustering)

# order of the clusters top to bottom
pamClusters$clustering <- factor(pamClusters$clustering,
                                 levels = c('Cluster 1', 'Cluster 2'))



##create the actual heatmap object
hmap <- Heatmap(mat,


                cluster_row_slices = FALSE,
                name = 'Gene\nZ-\nscore',
                col = colorRamp2(myBreaks, myCol),

                # colour-bar that represents gradient of expression
                heatmap_legend_param = list(
                  color_bar = 'continuous',
                  legend_direction = 'vertical',
                  legend_width = unit(8, 'cm'),
                  legend_height = unit(5.0, 'cm'),
                  title_position = 'topcenter'),
                  #title_gp=gpar(fontsize = 12, fontface = 'bold'),
                  #labels_gp=gpar(fontsize = 12, fontface = 'bold')),
                cluster_columns = FALSE,
                # row (gene) parameters
                cluster_rows = TRUE,
                show_row_dend = TRUE,
                #row_title = 'Statistically significant genes',
                row_title_side = 'left',
                row_title_gp = gpar(fontsize = 12,  fontface = 'bold'),
                row_title_rot = 90,
                show_row_names = FALSE,
                row_names_gp = gpar(fontsize = 10, fontface = 'bold'),
                row_names_side = 'left',
                row_dend_width = unit(25,'mm'),

                # cluster methods for rows 
                clustering_distance_rows = function(x) as.dist(1 - cor(t(x))),
                clustering_method_rows = 'ward.D2',

                top_annotation = colAnn)


draw(hmap + genelabels,
     heatmap_legend_side = 'left',
     annotation_legend_side = "right", row_sub_title_side = 'left')
RNA-Seq • 1.1k views

The legend breaks won't restrict min/max values in the legend because those are based on the data being plotted.

What is the output of max(data) and min(data)? You may have to manually set all values <-3 to -3 and all values >3 to 3.


In the data file the minimum value is -3 and the maximum is 3. I want my range to be the same, but R changes it to -4 to 4. How can I set the range manually?


If the minimum is -3 and the maximum is +3, there is no reason the legend should show -4 -> +4

Can you try running mat2<-mat; mat2[mat2 < -3] <- -3; mat2[mat2 > 3] <- 3 and then use mat2 to plot the heatmap?
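The range check and the clamping suggested in the replies above can be sketched in base R as follows. The matrix is a made-up stand-in for `mat` (its values and dimension names are assumptions for illustration, not the poster's data).

```r
# Toy stand-in for the z-score matrix `mat` (illustrative values only).
set.seed(1)
mat <- matrix(rnorm(20, sd = 2), nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("sample", 1:5)))

# Inspect the actual data range before plotting.
range(mat)

# Clamp every value outside [-3, 3] to the nearest bound.
mat2 <- mat
mat2[mat2 < -3] <- -3
mat2[mat2 > 3]  <- 3

# After clamping, the range can no longer exceed [-3, 3].
range(mat2)
```

Clamping only changes the data; it does not by itself control which tick values the legend displays.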


Thanks. It did not help; the range is still -4 to 4.


I just noticed that you're passing myBreaks to colorRamp2() for the colors but not to the legend. Try adding an at = ... entry to the heatmap_legend_param list, using a short break vector with a sensible length.out like 5 or 7 instead of 100.
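A minimal sketch of that suggestion, assuming the ComplexHeatmap and circlize packages are installed. The toy matrix and the names `col_fun` and `legend_breaks` are hypothetical. A separate short vector is used for the legend because `colorRamp2()` needs as many colors as breaks, so shortening `myBreaks` itself would no longer match the 100 colors in `myCol`. One plausible explanation for the -4 to 4 legend, not confirmed in this thread, is that default tick positions are rounded in the style of base R's `pretty()`, which can overshoot the data range; the last line illustrates that rounding.

```r
library(ComplexHeatmap)
library(circlize)

# Toy z-score matrix standing in for `mat` (illustrative values only).
set.seed(1)
mat <- matrix(runif(40, min = -3, max = 3), nrow = 8,
              dimnames = list(paste0("gene", 1:8), paste0("sample", 1:5)))

# Color function anchored at three points; colorRamp2() interpolates in between.
col_fun <- colorRamp2(c(-3, 0, 3), c("dodgerblue", "black", "yellow"))

# Tick positions to show on the legend: 7 values from -3 to 3.
legend_breaks <- seq(-3, 3, length.out = 7)

hmap <- Heatmap(mat,
                name = "Gene\nZ-\nscore",
                col = col_fun,
                heatmap_legend_param = list(at = legend_breaks))
draw(hmap)

# Base R illustration of rounded tick positions for the range -3 to 3.
pretty(c(-3, 3), n = 3)
```

Setting `at` explicitly means the legend ticks no longer depend on any default rounding.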


Thanks, I am trying to fix this. I am also trying to cluster the columns inside each group separately. How can I do this? I don't want the clustering to mix the two groups.


You can't do that, or the heatmap won't make sense. If you need both rows and columns to be clustered independently on the split, use 2 heatmaps.
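If a single heatmap is still preferred over two, ComplexHeatmap's `column_split` argument is one option worth trying: with `cluster_columns = TRUE`, columns are clustered within each slice only, while all slices share one row order. The sketch below is an assumption-laden illustration, not the poster's setup: the toy matrix and the `group` labels are hypothetical, and in practice the labels would come from the sample metadata in column order.

```r
library(ComplexHeatmap)

# Toy z-score matrix standing in for `mat` (illustrative values only).
set.seed(1)
mat <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("gene", 1:10), paste0("sample", 1:6)))

# Hypothetical group labels, one per column, in the same order as colnames(mat).
group <- factor(c("Normal", "Normal", "Normal", "Tumor", "Tumor", "Tumor"))

hmap <- Heatmap(mat,
                name = "z-score",
                cluster_columns = TRUE,         # cluster the columns ...
                column_split = group,           # ... separately within each group
                cluster_column_slices = FALSE)  # keep groups in factor-level order
draw(hmap)
```

Rows are still clustered once across all samples, so each gene stays on the same line in both groups. Clustering rows independently per group would indeed need two separate heatmaps, as the reply above says.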

