Question: Finding overlapping genes between different gene sets
kim60 wrote:

I have an issue calculating overlap percentage.

From a functional annotation analysis, I obtained 49 gene sets, and I have been trying to calculate how many genes overlap between each pair of sets. I'd like to draw a figure like Fig. 1B in the paper linked below. Are there any ideas or recommendations?

http://www.ncbi.nlm.nih.gov/pubmed/20926834

 

I am still new to R, but I would prefer to use it.

Tags: R, overlap
written 4.8 years ago by kim60
Jason880 wrote:

It's difficult to tell exactly what problem you are having with calculating the overlap. If you edit your post to add more information, you may get more help. For instance, show us some of the data and what you want the output to look like.

When it comes to drawing heatmaps, there are previous posts on this site about that. I'd also check out the following packages and tutorials.

This Stack Overflow post actually produces the half heatmap you want:

http://stackoverflow.com/questions/6883618/plotting-a-heat-map-for-an-upper-or-lower-triangular-matrix
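
The half-heatmap idea can also be sketched in base R. This is only a minimal illustration, not the code from the linked post: the helper name `mask_upper` and the toy matrix `m` are made up for the example. It relies on the fact that `image()` leaves `NA` cells blank, so setting the upper triangle to `NA` hides that half of a symmetric matrix.

```r
# Hypothetical helper: blank out the upper triangle (and the diagonal) so
# that only one half of a symmetric matrix remains to be drawn.
mask_upper <- function(m) {
  m[upper.tri(m, diag = TRUE)] <- NA
  m
}

# Toy symmetric matrix standing in for a set-by-set overlap matrix.
m <- matrix(c(100,  20,   0,
               20, 100,  50,
                0,  50, 100),
            nrow = 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

lower <- mask_upper(m)

# image() does not draw NA cells, so only the lower triangle is coloured.
# The transpose and column reversal make the plot match the printed matrix
# (row 1 at the top). Drawn only in an interactive session.
if (interactive()) {
  n <- nrow(lower)
  image(1:n, 1:n, t(lower)[, n:1], axes = FALSE, xlab = "", ylab = "",
        col = heat.colors(20))
  axis(1, at = 1:n, labels = colnames(lower))
  axis(2, at = 1:n, labels = rev(rownames(lower)), las = 1)
}
```

Passing `diag = FALSE` to `upper.tri()` instead would keep the diagonal visible, if the figure should show it.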

The following links make regular square heatmaps, but you could easily produce a figure like the one you linked by importing the plot into Photoshop, PowerPoint, etc. and editing out half of the heatmap.

Here's the documentation for the heatmap function in R (part of the stats package)

http://stat.ethz.ch/R-manual/R-patched/library/stats/html/heatmap.html

Heatmap tutorial

http://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/

heatmap.2 tutorial

http://davetang.org/muse/2010/12/06/making-a-heatmap-with-r/

Heatmap with ggplot2

http://socialdatablog.com/heatmap-tables-with-ggplot2-sort-of/

 

written 4.8 years ago by Jason880

Thank you so much for your help :)

I obtained 49 enriched gene sets using the GOstats package in R, and I have the annotated gene list for each of the 49 sets. I'd like to figure out 1) how the categories share genes with one another and/or 2) which genes are annotated to more than one category across all 49 sets. The gene sets are labeled by GO biological process (GOBP) ID and the genes by Entrez ID.

   GO.0071779   GO.0006091  GO.0031570  GO.0051179
1       5682         32       4436         32
2       5684        226       5682         92
3       5693        498       5684        116
4       5700        506       5693        161
5       5705       2026       5700        226
6       5706       2271       5705        372
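
Both goals can be sketched in base R with `intersect()` and `table()`. The example below is a rough illustration only: it uses just the six rows shown above rather than the full 49 sets, and it assumes that "overlap percentage" means shared genes divided by the union of the two sets (the Jaccard index times 100). The object names `gene_sets`, `shared`, `percent`, and `multi` are made up for the example.

```r
# Toy input: only the six rows shown above, stored as a named list so that
# sets of different sizes would not need NA padding.
gene_sets <- list(
  GO.0071779 = c(5682, 5684, 5693, 5700, 5705, 5706),
  GO.0006091 = c(32, 226, 498, 506, 2026, 2271),
  GO.0031570 = c(4436, 5682, 5684, 5693, 5700, 5705),
  GO.0051179 = c(32, 92, 116, 161, 226, 372)
)

# Goal 1: number of shared genes for every pair of sets
# (the diagonal holds each set's own size).
shared <- sapply(gene_sets, function(a)
  sapply(gene_sets, function(b) length(intersect(a, b))))

# Assumed definition of "overlap percentage": shared genes / union * 100.
union_size <- sapply(gene_sets, function(a)
  sapply(gene_sets, function(b) length(union(a, b))))
percent <- 100 * shared / union_size

# Goal 2: genes annotated to more than one category.
membership <- table(unlist(lapply(gene_sets, unique)))
multi <- names(membership)[membership > 1]

print(shared)
print(round(percent, 1))
print(multi)
```

The resulting `percent` matrix is symmetric, so it could be passed to a heatmap function directly. Dividing by the size of the smaller set instead of the union would be an equally reasonable definition, depending on what the published figure used.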

Following your recommendations, I plotted several heatmaps using the heatmap and image functions, but the output looks somewhat different from what I expected (the figure from the published paper I linked earlier). I'm still trying to reproduce it.

 

written 4.8 years ago by kim60

This is very interesting. If you find a way to do it, please share it here as well.

written 4.1 years ago by Mo880