Question

How Do Pathway Databases Compare?

22

14.0 years ago

Shigeta ▴ 470

Is KEGG still the reigning champion? Are there any comparisons with MetaCyc? It looks fairly complete for E. coli, at least.

pathway database subjective • 21k views

Updated 15 months ago by Peter Karp ▴ 30 • written 14.0 years ago by Shigeta ▴ 470

0

I tried using SEED for a pipeline and got frustrated with inconsistencies in different database files: subsystems, subsystems2role, etc.

Reply • 14.0 years ago by Science_Robot ★ 1.1k

0

Thanks for the input. I am being lazy, as there are so many DBs and so many references. I feel like I can start with these answers. Ideally, some combination of accuracy and coverage is best. After 10 years working in bioinformatics, I find the 'subjective' tag applies in far more cases than I had expected.

Reply • 14.0 years ago by Shigeta ▴ 470

score 27 · Answer 1 · 2010-11-09

I have been doing some annotation work for Reactome. Compared with KEGG, Reactome stores a lot more information:

the Gene Ontology IDs for the metabolic process of each reaction, plus the GO term for the localization of each component of a reaction
a text describing what happens in each reaction, plus a reference
the names of the authors of each annotated pathway
the SBML and BioPAX versions of the pathway

The problem with KEGG is, in my opinion, that they provide neither references nor descriptions for the reactions. A pathway may be annotated in a particular way, and if something about a reaction is unclear, there is no way to find the references that were used to justify it, and it is difficult to contact the original authors of the pathway. The entries in KEGG are 'authored by experts in the field', but it is impossible to know who these experts are or why they made certain choices.

Moreover, KEGG has a somewhat artificial distinction between metabolic and protein-protein interaction pathways; in Reactome, you can use the GO IDs to distinguish the types of reactions, without splitting the reactome artificially.

Finally, there are other databases for pathways:

The Nature Pathways database
UniProt pathways is nice but, in my opinion, has a very bad web interface, which makes it difficult to access its contents.
BioCarta which has nice charts and figures
SignaLink about signalling pathways. I can't tell you much about it since I do not know much about signalling pathways, but they have a manual curation process.
The Edinburgh Metabolic Pathways database - you have to register to use it.

score 10 · Answer 2 · 2010-11-09

The creators of MetaCyc have compared it with other pathway databases:

The MetaCyc database paper includes a section, "Comparison of MetaCyc and KEGG"
The MetaCyc user guide also has a section "Comparison of MetaCyc to other Pathway Databases"

I don't know about independent comparisons, but searching PubMed for "metabolic pathway databases" throws up about 80 review articles.

In general, it's difficult to compare different resources objectively. They often have slightly different aims, design philosophies and of course, data access tools and formats. Most people choose one based on "look and feel" and how well it suits their particular project needs.

Giovanni M Dall'Olio · Answer 3 · 2010-11-11

7

14.0 years ago

User 4910 ▴ 70

I am in total agreement with Giovanni. KEGG is definitely losing its edge, for several reasons:

The strict licensing policy of KEGG. In contrast, Reactome is released under a CC license, so it gives you a lot of freedom, especially if you are a commercial user.
People use KEGG because of its maps, but its approach seems to lack interoperability with other tools and formats. In terms of interoperability, Reactome is ahead of the others: it supports BioPAX and SBML, and the latest release includes SBGN.
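
To make the interoperability point concrete, here is a minimal sketch of why a standard exchange format helps: an SBML export is plain XML, so generic tooling can read it. The embedded document is a made-up toy fragment (not a complete, valid model and not real Reactome output), and the helper name `list_reactions` is hypothetical. It assumes the SBML Level 2 Version 4 namespace; a real export may use a different level/version.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the SBML Level 2 Version 4 namespace.
SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Hypothetical toy fragment, for illustration only.
TOY_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfReactions>
      <reaction id="R1" name="hexokinase step"/>
      <reaction id="R2" name="isomerase step"/>
    </listOfReactions>
  </model>
</sbml>"""


def list_reactions(sbml_text):
    """Return (id, name) pairs for every <reaction> element in the document."""
    root = ET.fromstring(sbml_text)
    reaction_tag = "{%s}reaction" % SBML_NS  # namespace-qualified tag name
    return [(r.get("id"), r.get("name")) for r in root.iter(reaction_tag)]


reactions = list_reactions(TOY_SBML)
for reaction_id, reaction_name in reactions:
    print(reaction_id, reaction_name)
```

For real work a dedicated SBML or BioPAX library is probably the better choice; the point is only that an open, documented format can be read without the database's own software.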

As a matter of fact, none of the pathway databases is as comprehensive as it claims; see the following reports:

Pathway analysis software: Annotation errors and solutions

Consistency, comprehensiveness, and compatibility of pathway databases.

Updated 14.0 years ago by Giovanni M Dall'Olio 28k • written 14.0 years ago by User 4910 ▴ 70

0

Interesting articles. I fixed a typo in the second link.

Reply • 14.0 years ago by Giovanni M Dall'Olio 28k

0

I agree about KEGG: very little of the data is open enough.

Reply • 14.0 years ago by Shigeta ▴ 470

score 6 · Answer 4 · 2010-11-09

EcoCyc (and MetaCyc) seems to have the philosophy of 'know as much about a pathway as possible', whereas KEGG seems to have a 'know as many pathways as possible' approach. Typically I would always look in EcoCyc first as my gold standard before going to another database, but perhaps I am biased, as I used to work for a group that curated EcoCyc.

I guess it just depends on the question you are asking, the species and the area you are working on, as well as your definition of 'pathway' (e.g. is a protein-protein interaction part of a pathway?).

For example, BioGRID is excellent for some of the pathways that they focus on, especially if you want to know any possible connections and are willing to look at the evidence for each part of the pathway/interaction. They have, for instance, put a lot of work into Arabidopsis and are currently working on the ubiquitin 'pathway'.

Also, as a previous post mentioned, Reactome is worth looking at: a new version has just been released.

score 6 · Answer 5 · 2012-01-20

6

12.8 years ago

Miranda ▴ 340

I compared five pathway databases that describe the human metabolic network and the differences are quite large. For example, only 510 of the 3858 genes they have combined could be found in all five databases. For further detail see: http://www.biomedcentral.com/1752-0509/5/165.
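
For scale, 510 of 3858 genes is roughly 13% of the combined gene list. A minimal sketch of how such an overlap can be computed is below; the database names and gene lists are toy placeholders (not data from the paper), and `compare_gene_sets` is a hypothetical helper name.

```python
def compare_gene_sets(db_to_genes):
    """Return (genes found in every database, genes found in any database)."""
    gene_sets = [set(genes) for genes in db_to_genes.values()]
    if not gene_sets:
        return set(), set()
    union = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, union


# Hypothetical toy input: three made-up databases with a few gene symbols each.
toy_databases = {
    "db_a": {"HK1", "GCK", "PFKM", "ALDOA"},
    "db_b": {"HK1", "GCK", "PFKM"},
    "db_c": {"HK1", "PFKM", "ENO1"},
}

core, union = compare_gene_sets(toy_databases)
core_fraction = len(core) / len(union)
print(sorted(core), len(union), core_fraction)
```

In practice the hard part is upstream of this step: the gene identifiers have to be mapped to a common namespace before the sets are comparable at all.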

There is no easy answer to which one is "best", as this really depends also on the analyses you are using it for.

1

That's a nice paper. Thanks for posting the link.

Reply • 12.8 years ago by Alex Paciorkowski 3.5k

score 4 · Answer 6 · 2010-11-12

Pathway databases are curated by companies or by different research groups, and these resources have a lot of inconsistencies. I have noticed that different databases list different numbers of genes in the same pathway. Different databases also use modified or database-specific pathway names. So a direct comparison of pathway databases will be difficult.

I would recommend letting your biological question guide you: select appropriate data resources, or take the union or intersection of different resources. When selecting resources, consider the curation strategy and the experimental method used to associate the genes with a pathway.
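
One way to sit between the union and the intersection is to keep the genes reported by at least a chosen number of resources. This is only a sketch under assumptions: the resource names and gene lists are invented, `genes_with_support` is a hypothetical helper, and it presumes identifiers have already been mapped to a common namespace.

```python
from collections import Counter


def genes_with_support(db_to_genes, min_support):
    """Return genes listed by at least `min_support` of the given resources."""
    # set(genes) ensures a gene is counted once per resource.
    counts = Counter(g for genes in db_to_genes.values() for g in set(genes))
    return {gene for gene, n in counts.items() if n >= min_support}


# Hypothetical gene lists for one pathway as annotated by three resources.
pathway_genes = {
    "resource_a": ["HK1", "GCK", "PFKM", "ALDOA"],
    "resource_b": ["HK1", "GCK", "PFKM"],
    "resource_c": ["HK1", "PFKM", "ENO1"],
}

any_resource = genes_with_support(pathway_genes, 1)     # same as the union
majority = genes_with_support(pathway_genes, 2)         # at least two resources
all_resources = genes_with_support(pathway_genes, 3)    # same as the intersection
print(sorted(majority))
```

A threshold of 1 trades accuracy for coverage and a threshold equal to the number of resources does the opposite; which end is appropriate depends on the analysis.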

In addition to the resources listed here, I would recommend taking a look at WikiPathways and Pathway Commons, which provide a unified resource to access different pathway databases.

score 2 · Answer 7 · 2010-11-09

2

14.0 years ago

Larry_Parnell 16k

Look to see what the top research groups in your field are using: check their publications or ask them directly by email, by telephone, or at a conference. In this manner, you will be using the "accepted" database. If you still consider another pathway database superior, then use both and offer a comparison. Doing so will add to the conversation about comparing the two.

1

I strongly disagree with simply going along with what top labs are using. Just because a reputable lab uses some database does not mean one should blindly use it. Although seeing what top labs prefer may be a quick way to find a potentially good source, as scientists we need to be skeptical about what each database actually represents, how it gets its data, how up-to-date it is, how its information is validated, and, in this case, how it defines biological pathways. It is known that this is not uniform across databases, which introduces bias.

Reply • 3.4 years ago by some1 ▴ 10

score 2 · Answer 8 · 2010-11-10

2

14.0 years ago

Suk211 ★ 1.1k

I have never used BRENDA, but I have heard it's more comprehensive than other pathway databases.

score 1 · Answer 9 · 2010-11-11

1

14.0 years ago

Jdnavarro ▴ 410

If you are looking for cancer and immune pathways, you might want to check http://netpath.org. The pathways they have are quite comprehensive and can be downloaded in BioPAX format.

score 0 · Answer 10 · 2012-04-09

I work for ProteinLounge, which has a commercial database of biological pathways (http://www.proteinlounge.com/pathways/). The graphics make pathways easier to visualize than in other databases such as KEGG. As others have said, each pathway database has its own look and feel. However, if someone wants a database that is more visually eye-catching, then the ProteinLounge pathway database is definitely one to look into.

score 0 · Answer 11 · 2023-07-20

0

15 months ago

Peter Karp ▴ 30

The BioCyc team has created a more detailed comparison of BioCyc and KEGG covering both their data content and their informatics tools. The comparison is here:
https://bioinformatics.ai.sri.com/biocyc/kegg-biocyc-comparison.pdf