How can I present gene expression values for multiple genes in various cancers?
7.3 years ago
arman s • 0

I'm developing a package for analyzing high-throughput data stored in an online repository. I want to compare expression z-scores of multiple genes across several cancers, using RNA-seq or microarray data. I currently use three heatmaps: the mean expression, the frequency of samples whose z-score exceeds a specific threshold, and the median expression across samples. In each heatmap, rows are genes and columns are cancers. What is a good way to present these data professionally? I appreciate any advice.

RNA-Seq microarray • 2.9k views

What is the message you're trying to convey with this? Data visualization works best when it illustrates a specific point or answers a specific question.


I'm trying to present the data in a way that helps identify which genes have an important expression pattern in different cancers. I first look at the frequency heatmap and see, for instance, that gene "x" is overexpressed in 45 percent of individuals in a specific cancer. The mean expression heatmap then shows that the expression value is noticeably higher in that cancer. Finally, I compare the mean expression heatmap with the median heatmap to find out whether this higher value is caused by outliers. At that point, gene "x" may turn out to be differentially expressed in two of a total of twenty cancers. The candidate genes can then be subjected to further experimental analysis. How would you present the data to help identify the differentially expressed genes?
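The three summaries described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the z-scores are made-up numbers, the cutoff of 2.0 is arbitrary, and `summarize` is a hypothetical helper rather than part of any package.

```python
from statistics import mean, median


def summarize(zscores, cutoff=2.0):
    """Summarize one (gene, cancer) cell of the three heatmaps.

    `cutoff` is an assumed threshold for calling a sample overexpressed.
    """
    return {
        "mean": mean(zscores),
        "median": median(zscores),
        # Fraction of samples whose z-score exceeds the cutoff.
        "frac_above": sum(z > cutoff for z in zscores) / len(zscores),
    }


# Hypothetical z-scores for gene "x" in two cancers (invented for illustration).
data = {
    ("x", "cancer_A"): [2.5, 3.0, 0.1, 2.2, -0.3],
    ("x", "cancer_B"): [0.0, 0.2, -0.1, 9.0, 0.1],
}

summary = {key: summarize(values) for key, values in data.items()}
for key, stats in summary.items():
    print(key, stats)
```

In this toy input, cancer_B has a mean well above its median because of a single extreme sample, which is the outlier situation that comparing the mean and median heatmaps is meant to reveal.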


Why not use the median directly instead of comparing the mean to the median? You could have a heatmap showing the difference from the median, or the number of median absolute deviations (MADs) from the median.
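A minimal sketch of that idea, assuming you already have one median value per cancer for a given gene. The numbers and the helper names `mad` and `mads_from_median` are hypothetical, and the across-cancer median is used as the reference point, which is one possible choice among several.

```python
from statistics import median


def mad(values):
    """Median absolute deviation (unscaled) of a list of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])


def mads_from_median(per_cancer_medians):
    """Express each cancer's value as a number of MADs from the overall median."""
    values = list(per_cancer_medians.values())
    center = median(values)
    spread = mad(values)
    if spread == 0:
        raise ValueError("MAD is zero; deviations cannot be scaled")
    return {
        cancer: (value - center) / spread
        for cancer, value in per_cancer_medians.items()
    }


# Hypothetical per-cancer medians for one gene (invented for illustration).
gene_x = {
    "cancer_A": 2.2,
    "cancer_B": 0.1,
    "cancer_C": 0.3,
    "cancer_D": -0.1,
    "cancer_E": 0.2,
}

deviations = mads_from_median(gene_x)
for cancer, value in deviations.items():
    print(cancer, round(value, 2))
```

Each gene would give one row of such values, so the result has the same genes-by-cancers shape as the other heatmaps.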


I totally agree with you about the median. I included the mean expression heatmap because some scientists may still want to see the mean values, and comparing them with the median can help reveal outliers in different cancers. What is the benefit of using the median absolute deviation? As I understand it, it can help find genes whose mean/median is highly variable across many cancers, but I don't think it can be visualized as a heatmap. Do you agree with my strategy? Is there anything I can do to make it better? I appreciate your kind response.


The MAD is more robust to outliers than the z-score. You should have different visualizations for different things, e.g. one for detecting outliers and one for detecting differentially expressed genes. If you're interested in the penetrance of a particular gene expression pattern, you could show that as a heatmap where each value is the fraction of samples with expression below/above some threshold. You could combine this with the MAD by using pie charts or treemaps inside the cells of the heatmap, but I think it's usually best to convey only one type of information per visualization.


I forgot to mention that they are median z-scores, so I assume they are calculated using the median and MAD instead of the mean and SD. I can access the raw expression values, but I would have to normalize them and convert them to log values myself. Furthermore, I can't find data for normal samples. On another note, I once took an online course in which the lecturer emphasized never using pie charts. I would be grateful if you could attach a picture that helps me understand how to combine the frequency data with the MAD by using treemaps inside the cells of the heatmap. Again, I appreciate your kind response.
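For reference, here is a rough sketch of what a median/MAD z-score on log-transformed values could look like. This is an assumption about the general technique, not a description of how the repository computes its z-scores: the pseudocount of 1, the scaling constant 1.4826 (which makes the MAD comparable to the SD for normally distributed data), and the use of the tumor samples themselves as the reference are all choices made here for illustration.

```python
import math
from statistics import median


def robust_z(expression, pseudocount=1.0):
    """Log2-transform expression values and scale them by median and MAD.

    The pseudocount and the 1.4826 constant are assumptions of this sketch.
    """
    logs = [math.log2(value + pseudocount) for value in expression]
    center = median(logs)
    mad = median([abs(v - center) for v in logs])
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero; z-scores are undefined")
    return [(v - center) / scale for v in logs]


# Hypothetical expression values for one gene across five samples.
counts = [0, 1, 3, 7, 1023]
z = robust_z(counts)
print([round(v, 2) for v in z])
```

Normalization between samples is not covered here and would need to happen before this step.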


What I had in mind is something like here. I don't know of any examples of treemaps inside heatmaps, but the idea is to replace the pie chart with a treemap. That said, trying to cram too much information into one representation tends to be counterproductive.
