Question: Gene enrichment analysis with a simple gene list compared to microarray data
5.7 years ago by
nash.claire410 wrote:


I wonder if anyone can help. I have a list of candidate genes from a previous proteomics experiment (just a simple list of gene names), and I'd like to see whether they are enriched in some publicly available gene expression microarray data sets. Is there a bioinformatics tool that will let me compare a simple gene list to microarray data and still get some statistics back?

If not, can anyone suggest a different way to approach this analysis? Is it possible to do some sort of correlation analysis when I'm essentially comparing one list of gene names to another?

I have tried NetVenn, but it requires two gene lists as input, which it then compares to gene expression microarray data, rather than a single gene list.


I really look forward to hearing back!

genome gene • 2.4k views
modified 2.6 years ago by a_liberzon0 • written 5.7 years ago by nash.claire410



Thank you both very much for your advice. I will perhaps give iPathway a try. Is this free software or subscription only?

written 5.7 years ago by nash.claire410

It is 100% free to use. You can upload as much data as you wish. Results are available for 72 hours, after which you can purchase the report to keep it long term. You can purchase either a single report or a subscription. The point is that you can see all of your data for free, then purchase if it makes sense.

If you sign up, let me know, and I'll be happy to give you three free reports to keep.  Just mention you learned about it here.  This will allow you to get to your comparison for free. 

modified 5.7 years ago • written 5.7 years ago by andrew510
5.7 years ago by
Istvan Albert ♦♦ 85k
University Park, USA
Istvan Albert ♦♦ 85k wrote:

You can't really compare a gene list to deposited microarray data in an automated way, because the results of these experiments are not stored in a searchable format; only the original data is.

What you could do is perform an enrichment analysis on your gene list, identify functions of interest, and then search the literature for publications that studied the same system. Downloading their results and gene lists would perhaps give you something to compare to.
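Once a published gene list has been downloaded, the comparison itself can be as simple as a set overlap. The following is only a minimal sketch: the function names and the example symbols are made up for illustration, and upper-casing the symbols assumes human-style gene symbols (other organisms may need different normalization).

```python
def normalize(symbols):
    """Strip whitespace and upper-case gene symbols, dropping blanks and duplicates."""
    return {s.strip().upper() for s in symbols if s and s.strip()}


def compare_gene_lists(candidates, published):
    """Summarize the overlap between a candidate list and a published list."""
    a, b = normalize(candidates), normalize(published)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_candidates": sorted(a - b),
        "only_published": sorted(b - a),
        # Jaccard index: shared genes divided by all distinct genes seen.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical example lists, not taken from any real study.
candidates = ["TP53", "brca1", " EGFR ", "MYC"]
published = ["EGFR", "MYC", "KRAS", "TP53", "PTEN"]
result = compare_gene_lists(candidates, published)
print(result["shared"], result["jaccard"])
```

An overlap count on its own says nothing about significance; that needs a background gene universe and a test such as the hypergeometric test.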

5.7 years ago by
United States
andrew510 wrote:

Please keep in mind that one of the key limitations to any enrichment analysis is that it assumes the variables are independent, but we know that genes are highly dependent on each other in various systems.  So you will likely get a number of false positives using any kind of gene set enrichment.

We offer a tool called iPathwayGuide that will "almost" do what you are looking for. We still require you to upload two sets of data. Soon, however, we will offer the ability to process publicly available data (e.g. from NCBI GEO) and then input your list of genes to see in which systems those genes are implicated for a given phenotype comparison.

For now, however, you can find a representative public data set, run it through GEO2R, and upload the resulting differential expression data into iPathwayGuide. Then reprocess the same data, but artificially make your genes of interest DE by giving them a significant p-value (e.g. 0.01) and all other genes an insignificant p-value (e.g. 0.5). The key is to preserve the logFC, because one of the main analyses we perform is a perturbation analysis: iPathwayGuide takes the gene expression of your target genes and propagates that perturbation downstream, which lets us identify the most perturbed pathways. This method virtually eliminates false positives. Then you can compare the two data sets using our meta analysis, which will confirm where any overlap occurs.

Here's a screenshot of the meta analysis for pathways comparing three datasets.
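The p-value rewriting step described above could be scripted roughly as follows. This is a sketch under stated assumptions, not an official iPathwayGuide workflow: the column names (`Gene.symbol`, `P.Value`, `logFC`) are assumed to match the GEO2R export, the helper `mark_genes_of_interest` is hypothetical, and which p-value column the tool actually reads should be checked before use.

```python
import csv
import io


def mark_genes_of_interest(rows, genes_of_interest, symbol_col="Gene.symbol",
                           p_col="P.Value", sig_p=0.01, nonsig_p=0.5):
    """Return copies of rows where only the p-value column is overwritten.

    Genes of interest get sig_p, every other gene gets nonsig_p.
    logFC and all remaining columns are left exactly as they were.
    """
    wanted = {g.strip().upper() for g in genes_of_interest}
    edited = []
    for row in rows:
        new_row = dict(row)  # copy so the original table is not modified
        symbol = row.get(symbol_col, "").strip().upper()
        new_row[p_col] = str(sig_p if symbol in wanted else nonsig_p)
        edited.append(new_row)
    return edited


# Tiny made-up table standing in for a tab-separated GEO2R export.
geo2r_text = (
    "ID\tP.Value\tlogFC\tGene.symbol\n"
    "1001_at\t0.20\t1.50\tTP53\n"
    "1002_at\t0.001\t-2.10\tKRAS\n"
    "1003_at\t0.70\t0.30\tEGFR\n"
)
rows = list(csv.DictReader(io.StringIO(geo2r_text), delimiter="\t"))
edited = mark_genes_of_interest(rows, ["TP53", "EGFR"])

# Write the edited table back out as tab-separated text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), delimiter="\t")
writer.writeheader()
writer.writerows(edited)
print(out.getvalue())
```

For a real export, replace the `io.StringIO` objects with file handles opened on the downloaded table and on the output file.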

2.6 years ago by
Broad Institute
a_liberzon0 wrote:

Try our Investigate Gene Sets tool online. Note that your list should not exceed 2,000 gene (or protein) identifiers, and that the online tool will show only up to the 100 most significant results. If you have many such lists, I'd recommend downloading our gene sets from MSigDB and implementing the hypergeometric test offline. One way to do that is to call the phyper() function in R.
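For readers who prefer to see the test spelled out, here is a minimal pure-Python sketch of the same upper-tail hypergeometric probability. The function name and the numbers in the example are illustrative assumptions; the result corresponds to `phyper(overlap - 1, set_size, universe_size - set_size, list_size, lower.tail = FALSE)` in R.

```python
from math import comb


def hypergeom_upper_tail(overlap, set_size, universe_size, list_size):
    """P(X >= overlap) for a hypergeometric draw.

    universe_size: number of genes in the background universe
    set_size:      number of universe genes that belong to the gene set
    list_size:     number of genes in your list (drawn from the universe)
    overlap:       number of your genes that fall in the gene set
    """
    max_k = min(set_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Toy numbers for illustration: 20-gene universe, 5-gene set, 4-gene list, 2 shared.
p_value = hypergeom_upper_tail(overlap=2, set_size=5, universe_size=20, list_size=4)
print(p_value)
```

The choice of background universe strongly affects the p-value, and when many gene sets are tested the resulting p-values should be corrected for multiple testing.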


