I am currently analyzing microarray data and want to know which pathways are significantly altered. I am using KOBAS now, which is based on KEGG. Are there other tools that can identify altered pathways statistically?
The relevance of complex statistical algorithms for pathway statistics is often overestimated. There are many biological and technological reasons why pathway statistics can never be perfect. As a result, they should be used only to get a first impression of the relevant pathways, which should then be inspected visually or analyzed with other tools.
Just some of those reasons:
- The input data (often gene expression data) themselves are often not trustworthy.
- The importance of individual gene products within a pathway differs. One can be an important hub, another can sit in a side branch.
- Along the same lines, of two consecutive reactions the first could be catalyzed by a single enzyme and the second by four. The single enzyme is probably important, but you cannot say the same about each of the four individually.
- Pathways can contain parts that are not relevant to the question studied. Adding an irrelevant part where no changes occur lowers the statistical significance, while it does not affect the biological meaning at all.
- On the other hand, significant changes can occur in parts of a pathway that are not really meaningful for the question studied. You would then list the pathway as changed even though the core process itself is not really affected.
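To make the point about irrelevant pathway parts concrete, here is a minimal sketch using a plain hypergeometric over-representation test. This is only one common way of scoring pathways and not necessarily what any particular tool does. The function name and all the numbers are made up for illustration.

```python
from math import comb

def overrepresentation_pvalue(n_measured, n_changed, pathway_size, pathway_changed):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `pathway_changed` changed genes in a pathway of `pathway_size` genes,
    when `n_changed` of the `n_measured` genes on the array are changed."""
    total = comb(n_measured, pathway_size)
    upper = min(pathway_size, n_changed)
    return sum(
        comb(n_changed, i) * comb(n_measured - n_changed, pathway_size - i)
        for i in range(pathway_changed, upper + 1)
    ) / total

# Hypothetical numbers: 10000 genes measured, 500 of them changed.
# The "core" pathway has 20 genes, 8 of which are changed.
p_core = overrepresentation_pvalue(10000, 500, 20, 8)

# Same pathway drawn with an extra 40-gene branch in which nothing changes.
p_padded = overrepresentation_pvalue(10000, 500, 60, 8)

print(f"core pathway:   p = {p_core:.3g}")
print(f"padded pathway: p = {p_padded:.3g}")
```

The biology is identical in both cases, but the padded pathway gets the larger p-value simply because of how it was drawn.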
That being said, PathVisio has plugins that allow Z-score based pathway statistics, and another plugin that does gene set enrichment analysis is currently being tested. See: http://www.pathvisio.org/wiki/PluginDocumentation . The core advantage is that you can use whichever pathway set you prefer (WikiPathways, KEGG or your own set of tables transformed into pathways) and can immediately look at a data representation in the pathways that occur at the top of the list.
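For reference, a Z-score for a pathway is typically computed along the following lines. This is a sketch of one widely used definition (observed minus expected number of changed genes, divided by the hypergeometric standard deviation). I am assuming this form for illustration and have not checked that the plugin computes exactly this. The function name, pathway names and counts are hypothetical.

```python
from math import sqrt

def pathway_z_score(n_measured, n_changed, pathway_size, pathway_changed):
    """Z-score of a pathway: how many standard deviations the observed number
    of changed genes lies above the number expected by chance."""
    fraction_changed = n_changed / n_measured
    expected = pathway_size * fraction_changed
    # Variance of a hypergeometric draw (sampling without replacement).
    variance = (
        pathway_size
        * fraction_changed
        * (1 - fraction_changed)
        * (1 - (pathway_size - 1) / (n_measured - 1))
    )
    return (pathway_changed - expected) / sqrt(variance)

# Hypothetical data set: 10000 genes measured, 500 meet the "changed" criterion.
# Each pathway maps to (genes in pathway, changed genes in pathway).
pathways = {
    "Pathway A": (20, 8),
    "Pathway B": (60, 8),
    "Pathway C": (40, 2),
}

scores = {
    name: pathway_z_score(10000, 500, size, changed)
    for name, (size, changed) in pathways.items()
}

# Highest Z-score first: these are the pathways to look at visually.
ranked = sorted(scores, key=scores.get, reverse=True)
for name in ranked:
    print(f"{name}: Z = {scores[name]:.2f}")
```

The ranking is only a starting point. The pathways at the top of the list still need to be inspected with the data drawn on them.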
Many people have used Gene Set Enrichment Analysis (GSEA), developed at the Broad Institute, for this problem. The algorithm is wrapped in a very slick Java application. In essence, quoting the overview, it's "a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes)." There are large libraries of pathway-associated genes to choose from. GSEA works on your normalized microarray data rather than on lists of genes. It's most convenient if you are using one of the mainstream commercial microarray platforms.
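To give a feel for the idea, here is a heavily simplified sketch of a running-sum enrichment score. It assumes an unweighted, Kolmogorov-Smirnov-like statistic over a list of genes that has already been ranked by differential expression. The real GSEA application weights the steps by each gene's correlation with the phenotype and assesses significance by permutation, neither of which is shown here. The function and gene names are made up.

```python
def enrichment_score(ranked_genes, gene_set):
    """Walk down the ranked list, stepping up at genes in the set and down
    at all other genes; return the largest deviation from zero."""
    hits = set(gene_set) & set(ranked_genes)
    n_hit = len(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        if gene in hits:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranking, most up-regulated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]

es_top = enrichment_score(ranked, {"g1", "g2", "g3"})      # set clustered at the top
es_bottom = enrichment_score(ranked, {"g8", "g9", "g10"})  # set clustered at the bottom
es_spread = enrichment_score(ranked, {"g1", "g5", "g10"})  # set scattered over the list

print(es_top, es_bottom, es_spread)
```

A set whose genes cluster at one end of the ranking gets a score of large magnitude, positive at the top and negative at the bottom, while a scattered set stays close to zero. This is why the method needs the full ranked data rather than a thresholded gene list.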