Working with Mean TPM Values for Heatmaps
3.9 years ago
nk130 • 0

I am new to examining RNA-seq data sets and generating heatmaps. Currently, I am working with the Mean TPM dataset from all cell types published to https://dice-database.org.

I've looked at methods for handling zero values, and I currently do the following:

- Find the smallest non-zero mean TPM in the entire data set.
- Choose a small positive number that is close to zero and smaller than that smallest non-zero mean TPM.
- Replace all zero values with this small number.
- Take log(mean TPM) and cluster on the transformed data set.
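The steps above can be sketched roughly as follows. This is only an illustration: the gene names and TPM values are made up, the floor is arbitrarily set to half the smallest non-zero value, and log base 2 is an assumption since the base is not specified above.

```python
import math

# Hypothetical mean TPM values: gene -> one value per cell type (made-up numbers).
mean_tpm = {
    "GENE_A": [0.0, 12.5, 3.2],
    "GENE_B": [0.4, 0.0, 150.0],
    "GENE_C": [0.0, 0.0, 0.0],
}

# Step 1: smallest non-zero mean TPM in the whole data set.
min_nonzero = min(v for row in mean_tpm.values() for v in row if v > 0)

# Step 2: a floor below that minimum (half of it is an arbitrary choice).
floor = min_nonzero / 2

# Steps 3 and 4: replace zeros with the floor, then take the log (base 2 assumed).
log_tpm = {
    gene: [math.log2(v if v > 0 else floor) for v in row]
    for gene, row in mean_tpm.items()
}
```

With this approach every zero maps to the same negative value, log2(floor), so that value sets the bottom of the heatmap color scale.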

What I am wondering, though, is whether I should instead set an expression threshold based on a few housekeeping genes for my cell types of interest. I ask because some subsets of my genes of interest show only background or no expression across all of my subtypes. While it is useful to know which genes those are, they take up space in the heatmaps! The other consideration is how to decide whether some of these smaller values represent real expression, so that I can get a better understanding of the dynamic range.
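One way to picture the thresholding idea is to drop genes that never reach a cutoff in any cell type before plotting. The sketch below uses made-up gene names and values, and the 1.0 TPM cutoff is a placeholder assumption rather than a value derived from housekeeping genes.

```python
# Hypothetical mean TPM values: gene -> one value per cell type (made-up numbers).
mean_tpm = {
    "GENE_A": [0.0, 12.5, 3.2],
    "GENE_B": [0.4, 0.0, 150.0],
    "GENE_C": [0.0, 0.0, 0.0],
    "GENE_D": [0.2, 0.1, 0.3],
}

# Placeholder cutoff (an assumption); in practice it would be chosen from the data.
THRESHOLD_TPM = 1.0

# Keep a gene only if it reaches the cutoff in at least one cell type.
expressed = {
    gene: row for gene, row in mean_tpm.items() if max(row) >= THRESHOLD_TPM
}

# Genes at background level everywhere, kept aside for reference.
background = sorted(set(mean_tpm) - set(expressed))
```

Keeping the filtered-out genes in a separate list preserves the information about which genes are at background without spending heatmap rows on them.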

RNA-Seq TPM Heatmap DICE • 1.7k views

> I've looked at methods for handling zero values

Why not keep zeros as zeros? You can make heatmaps with zeros.


See my edit: I take log(TPM) to make the heatmaps.


If the problem is log(0), then add a small number. Some people do log(x+0.001), some do as high as log(x+1) to avoid having negative values. For the purpose of a heatmap, you probably won't even notice the difference.
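A minimal sketch of the pseudocount approach, comparing the two offsets mentioned above. The input values are made up, `log_with_pseudocount` is a hypothetical helper name, and log base 2 is an assumption.

```python
import math

def log_with_pseudocount(x, pseudocount):
    """Return log2(x + pseudocount); defined for x == 0 when pseudocount > 0."""
    return math.log2(x + pseudocount)

# Made-up TPM values spanning zero to highly expressed.
values = [0.0, 0.5, 10.0, 1000.0]

small = [log_with_pseudocount(v, 0.001) for v in values]    # log(x + 0.001)
plus_one = [log_with_pseudocount(v, 1.0) for v in values]   # log(x + 1)
```

With log(x+1), a zero maps to exactly 0 and no value is negative. With log(x+0.001), zeros become strongly negative, while highly expressed values are nearly identical under both offsets.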
