Hi All,
I have a set of 50 genes. I have downloaded the GTEx RNA-Seq data (both read counts and RPKM). The dataset contains ~40 tissue samples with multiple biological replicates. What is the best way to identify the expression pattern of these genes across the tissues and to generate a heat map for them? I want to generate a plot showing that my genes are highly expressed only in particular tissues.
You can always throw any data matrix into a heat map tool to get a heat map, but that won't tell you what is statistically significant. Do you have a hypothesis? Have you already tested it and found a list of genes that are highly expressed in your tissues, or is that something you still need to do? Or maybe you already have a list of focus genes and want to see if they are differentially expressed across the tissues? Depending on what exactly you want to see, a heat map may or may not be applicable. For example, if you have a gene of interest, you could use a box plot showing the distribution of expression for that gene across the replicates in all tissues. The appropriate statistical test would then tell you whether the expression of this one gene is significantly different in your tissues of interest.
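To make the box plot idea concrete, here is a minimal Python sketch for a single gene. Everything in it is an assumption for illustration: the data are simulated rather than real GTEx values, the tissue names and the `log2_rpkm` column are made up, and the Kruskal-Wallis test from SciPy is just one reasonable choice of test, not necessarily the right one for your design. `plot_gene_boxplot` is a hypothetical helper, not part of any library.

```python
import numpy as np
import pandas as pd
from scipy.stats import kruskal

# Simulated toy data (NOT real GTEx values): one gene, log2(RPKM + 1),
# six biological replicates in each of three hypothetical tissues.
rng = np.random.default_rng(0)
expr = pd.DataFrame({
    "tissue": ["Liver"] * 6 + ["Lung"] * 6 + ["Heart"] * 6,
    "log2_rpkm": np.concatenate([
        rng.normal(8.0, 0.5, 6),  # gene simulated as high in liver
        rng.normal(3.0, 0.5, 6),
        rng.normal(3.2, 0.5, 6),
    ]),
})

# Collect the replicate values of each tissue into its own array.
groups = [g["log2_rpkm"].to_numpy() for _, g in expr.groupby("tissue")]

# Kruskal-Wallis: a non-parametric test of whether at least one tissue
# has a different expression distribution from the others.
stat, pvalue = kruskal(*groups)
print(f"Kruskal-Wallis H = {stat:.2f}, p = {pvalue:.3g}")


def plot_gene_boxplot(df, out_path="gene_boxplot.png"):
    """Hypothetical helper: save one box per tissue for a single gene."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    df.boxplot(column="log2_rpkm", by="tissue", ax=ax)
    ax.set_ylabel("log2(RPKM + 1)")
    fig.savefig(out_path)
    plt.close(fig)
```

Calling `plot_gene_boxplot(expr)` writes the figure to disk. With 50 genes you would repeat the test per gene and correct the p-values for multiple testing; a significant Kruskal-Wallis result only says that some tissue differs, so a follow-up pairwise comparison is needed to say which one.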
Thanks alolex. I have a focused set of genes, and I want to test whether they are highly expressed specifically in certain tissues. What I am thinking of doing is taking the average of the biological replicates for each tissue type and then comparing across tissues.
You can certainly average the replicates for visualization if you like, but note that this method won't tell you whether your genes are statistically significantly over-expressed in your tissues of interest. You would need to do a differential expression analysis or a statistical test to figure that out. The design of these tests depends on the specific questions you have, so if you need the statistics for a publication, I would consult a statistician if you can, unless you are familiar with the tests you need to run. Otherwise, to get a rough idea of whether some genes are really different, you can take the average or just plop everything in a heat map to see the data.
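For the quick-look route (average the replicates, then draw a heat map), a rough Python sketch could look like the following. The gene names, sample IDs, tissue labels and expression values are all invented for illustration, and `plot_heatmap` is a hypothetical helper. The log transform and per-gene z-scoring are common conventions for display, assumed here rather than taken from this thread.

```python
import numpy as np
import pandas as pd

# Invented example data: rows are genes, columns are samples (RPKM).
rng = np.random.default_rng(1)
genes = ["GENE_A", "GENE_B", "GENE_C"]  # hypothetical gene names
sample_tissue = pd.Series(
    ["Liver", "Liver", "Lung", "Lung", "Heart", "Heart"],
    index=[f"S{i}" for i in range(1, 7)],
    name="tissue",
)
rpkm = pd.DataFrame(
    rng.gamma(2.0, 5.0, size=(3, 6)), index=genes, columns=sample_tissue.index
)
rpkm.loc["GENE_A", ["S1", "S2"]] += 200  # simulate a liver-specific gene

# Log-transform so that a few very high values do not dominate the colours.
log_expr = np.log2(rpkm + 1)

# Average the biological replicates within each tissue (genes x tissues).
tissue_mean = log_expr.T.groupby(sample_tissue).mean().T

# Z-score each gene across tissues so rows are comparable in the heat map.
z = tissue_mean.sub(tissue_mean.mean(axis=1), axis=0).div(
    tissue_mean.std(axis=1), axis=0
)
print(z.round(2))


def plot_heatmap(z_scores, out_path="heatmap.png"):
    """Hypothetical helper: save a gene-by-tissue heat map of z-scores."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    im = ax.imshow(z_scores.to_numpy(), cmap="RdBu_r", aspect="auto")
    ax.set_xticks(list(range(z_scores.shape[1])))
    ax.set_xticklabels(z_scores.columns, rotation=90)
    ax.set_yticks(list(range(z_scores.shape[0])))
    ax.set_yticklabels(z_scores.index)
    fig.colorbar(im, ax=ax, label="row z-score")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

Calling `plot_heatmap(z)` saves the figure. Keep in mind that this is only a visualization: averaging hides the replicate-to-replicate variation, so a strong colour in one tissue is a hint, not evidence of statistical significance.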