Question

Heatmap when having duplicated genes and samples

0

2.5 years ago

BioQueen ▴ 30

Hi! I have done a transcription factor inference analysis and I now want to display the result in a heatmap.

I have 3 columns: source, condition and score, where source holds the gene names, condition holds the samples and score is the enrichment score (ES). The problem is that I have one enrichment score per gene per sample, so every gene and every sample appears on several rows. Say I have the ES for gene1, gene2 and gene3 in sample1; I also have the ES for gene1, gene2 and gene3 in sample2.

Because of these repeated names, I'm not able to set the genes as rownames and the samples as colnames. Does anyone know how I can make a heatmap that displays the enrichment score with samples on the "x-axis" and genes on the "y-axis"?

Thanks!

heatmap duplicated-genes-samples • 1.0k views

ADD COMMENT • link 2.5 years ago by BioQueen ▴ 30

0

Can you post the first few lines of your table and also your expected output table?

ADD REPLY • link 2.5 years ago by kashiff007 ★ 1.9k

0

I only have one table and it is that table I want to use for the heatmap visualisation. Here is an example, I don't know how to write it in this comment box, but I will try to make it understandable.

source condition score
gene1 sample1 0.001
gene2 sample1 0.0003
gene3 sample1 -0.0004
gene1 sample2 0.003
gene2 sample2 0.0001
gene3 sample2 -0.0005

Don't mind the numbers in front of the table. Does this make it clearer?

ADD REPLY • link 2.5 years ago by BioQueen ▴ 30

score 6 · Accepted Answer · 2021-11-04

6

2.5 years ago

kashiff007 ★ 1.9k

I guess you are using R. Load the R package data.table:

library(data.table)

Your data would look like:

>  df
        source  condition   score
1       gene1   sample1     0.4299397
2       gene2   sample1     0.4299397
3       gene3   sample1     0.4299397
4       gene1   sample2     0.2531551
5       gene2   sample2     0.2531551
6       gene3   sample2     0.2531551

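Now use this command to "unmelt" your df, i.e. cast it from long to wide format (dcast() from data.table expects a data.table, so setDT() converts df in place first):

dcast(setDT(df), source ~ condition, value.var = "score")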

The output would look like:

  source   sample1   sample2
1  gene1 0.4299397 0.2531551
2  gene2 0.4299397 0.2531551
3  gene3 0.4299397 0.2531551
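
To get from that wide table to the heatmap itself, something along these lines might work. This is only a minimal sketch: the toy values are copied from the example table in the question, and the choice of base R `heatmap()` with clustering and scaling switched off is an assumption on my part, not something from the original post (other plotting packages would take the same matrix).

```r
library(data.table)

# Toy long-format table mirroring the example posted in the question
df <- data.table(
  source    = rep(c("gene1", "gene2", "gene3"), times = 2),
  condition = rep(c("sample1", "sample2"), each = 3),
  score     = c(0.001, 0.0003, -0.0004, 0.003, 0.0001, -0.0005)
)

# Long -> wide: one row per gene, one column per sample
wide <- dcast(df, source ~ condition, value.var = "score")

# Turn the wide table into a numeric matrix, using the gene column as rownames
mat <- as.matrix(wide, rownames = "source")

# Base R heatmap: genes on the rows, samples on the columns.
# Rowv = NA / Colv = NA disable clustering (no dendrograms, no reordering by them);
# scale = "none" plots the raw scores instead of row-scaled values.
heatmap(mat, Rowv = NA, Colv = NA, scale = "none")
```

One thing to watch: if the same gene/sample pair occurs on more than one row, `dcast()` falls back to counting rows (it prints a message that the aggregate function defaults to `length`). In that case you would need to decide how to combine the values, for example by passing `fun.aggregate = mean`.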

ADD COMMENT • link 2.5 years ago by kashiff007 ★ 1.9k

3

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work.

ADD REPLY • link 2.5 years ago by kashiff007 ★ 1.9k

0

Thanks! That worked :)

ADD REPLY • link 2.5 years ago by BioQueen ▴ 30