Question: How to combine different pathway database sources (KEGG, BioCarta) for pathway analysis
2.8 years ago, kmsh410 wrote:

Hi all Biostars, I have a question about how to combine different pathway database resources, since I want to ensure comprehensive pathway coverage for my pathway analysis. I got a gene list from the VEGAS2 website, and now I want to use this gene list to run a pathway analysis. Can someone give me some ideas or direction for solving this problem?

written 2.8 years ago by kmsh410 (40)

I would not suggest combining different pathway databases into a single enrichment analysis: the redundancy between databases makes the statistics unreliable. Instead, use a tool that draws on the most up-to-date annotation to perform the enrichment analysis.
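To get a feel for the redundancy point, here is a minimal sketch that measures how much gene sets from two databases overlap, using the Jaccard index. The gene sets and their names are made-up toy data (not real KEGG or BioCarta content), and `jaccard` is a helper defined here for illustration only.

```r
# Toy gene sets standing in for two pathway databases (hypothetical contents).
keggSets <- list(
  KEGG_APOPTOSIS  = c("CASP3", "CASP8", "BAX", "BCL2", "TP53"),
  KEGG_CELL_CYCLE = c("CDK1", "CDK2", "CCNB1", "TP53", "RB1")
)
biocartaSets <- list(
  BIOCARTA_DEATH = c("CASP3", "CASP8", "BAX", "BCL2", "FAS"),
  BIOCARTA_G1    = c("CDK2", "CDK4", "CCND1", "RB1", "E2F1")
)

# Jaccard index: shared genes divided by all genes found in either set.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Pairwise overlap matrix: rows are the first database, columns the second.
overlap <- sapply(biocartaSets, function(b) {
  sapply(keggSets, function(a) jaccard(a, b))
})
print(round(overlap, 2))
```

Pairs with a high index are close to being the same pathway listed twice, so testing both inflates the number of correlated tests and weakens the multiple-testing correction.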

If you are not a Linux or R user, try web-based tools such as DAVID (although its annotation is somewhat outdated).

But if you are familiar with Linux, use GeneSCF, which pulls updated information directly from the source in real time. You can try it with KEGG or REACTOME as the source database.

Gene Set Clustering based on Functional annotation (GeneSCF)

modified 2.8 years ago • written 2.8 years ago by EagleEye (6.2k)

DAVID has been updated recently with refined features.

written 2.8 years ago by Prasad (1.5k)

Yes, I know it has. But it still does not hold the current (real-time) versions of KEGG or Gene Ontology. The point I was making earlier is that the functional annotation used for the enrichment analysis needs to be up to date, not the tool itself.

https://david.ncifcrf.gov/content.jsp?file=release.html

And as far as I can see, the KEGG and BioCarta data there are still not recent versions.

https://david.ncifcrf.gov/content.jsp?file=update.html

http://www.genome.jp/kegg/docs/relnote.html

modified 2.8 years ago • written 2.8 years ago by EagleEye (6.2k)

Actually, I want to use some R packages from Bioconductor, such as gage and paxtoolsr, but I need to take some time to figure out how to use them. I haven't used these Bioconductor tools before. I also found that some experts recommend Enrichr. It is very convenient, but I still want to learn how to do the analysis in R or on Linux.

By the way, thank you for introducing me to GeneSCF. I will try it.

written 2.8 years ago by kmsh410 (40)

If you get paxtoolsr installed you can grab all the gene sets provided by Pathway Commons by running this command:

library(paxtoolsr)
geneSets <- downloadPc2("PathwayCommons.8.All.GSEA.hgnc.gmt.gz", version="8")

This returns the gene sets as a list of vectors. You'll likely want to filter out pathways that have only one or a few genes in them, using code similar to the following:

# Number of genes in each gene set
setSizes <- lengths(geneSets)

# Keep gene sets with more than 3 genes
filteredGeneSets <- geneSets[setSizes > 3]

Pathway Commons tries to reproduce the original datasets as faithfully as possible, which sometimes means including these small pathways.
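Once the gene sets are filtered, a gene list can be tested against them. The sketch below is one simple way to do that in base R: a one-sided hypergeometric (over-representation) test per pathway with Benjamini-Hochberg adjustment. It is not what paxtoolsr or gage do internally. The toy `filteredGeneSets`, `background`, and `myGenes` values and the `enrich` helper are assumptions made up for illustration; in practice the background should be the set of genes that could have appeared in your list.

```r
# Hypothetical toy inputs; replace with real gene sets, gene list, and background.
filteredGeneSets <- list(
  PATHWAY_A = c("G1", "G2", "G3", "G4"),
  PATHWAY_B = c("G5", "G6", "G7", "G8", "G9")
)
background <- paste0("G", 1:20)
myGenes <- c("G1", "G2", "G3", "G10")

# Over-representation test for each gene set (illustrative helper).
enrich <- function(geneList, geneSets, universe) {
  geneList <- intersect(geneList, universe)
  pvals <- sapply(geneSets, function(gs) {
    gs <- intersect(gs, universe)
    k <- length(intersect(geneList, gs))
    # P(X >= k) when drawing length(geneList) genes from the universe
    phyper(k - 1, length(gs), length(universe) - length(gs),
           length(geneList), lower.tail = FALSE)
  })
  data.frame(pathway = names(geneSets),
             p = unname(pvals),
             padj = p.adjust(unname(pvals), method = "BH"))
}

res <- enrich(myGenes, filteredGeneSets, background)
print(res)
```

Running this on a single database at a time keeps the multiple-testing correction meaningful, in line with the advice earlier in the thread.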

If you're having problems installing paxtoolsr (especially because of the Java dependency), these videos may help, which provide instructions for OSX, Linux, and Windows:

modified 2.8 years ago • written 2.8 years ago by cannin (250)

Good luck with your analysis.

written 2.8 years ago by EagleEye (6.2k)