I'm trying to get lists of genes belonging to KEGG pathways. I had been using MSigDB, but some KEGG pathways aren't in MSigDB, so for those I was getting the gene lists directly from KEGG.
I've realised that for at least one pathway, the gene lists differ a lot between KEGG and MSigDB. Take apoptosis:
There are 87 genes in the MSigDB list and 136 in the KEGG list, and only 57 gene symbols appear in both lists.
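For anyone wanting to reproduce this kind of comparison, here is a minimal sketch in Python, assuming each list is available as plain gene symbols. The function name `compare_gene_lists` and the symbols in the toy lists are placeholders made up for illustration, not the actual pathway members.

```python
def compare_gene_lists(list_a, list_b):
    """Return the symbols shared by two gene lists and those unique to each."""
    # Strip stray whitespace and drop empty entries before comparing.
    set_a = {g.strip() for g in list_a if g.strip()}
    set_b = {g.strip() for g in list_b if g.strip()}
    return {
        "shared": sorted(set_a & set_b),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
    }


# Placeholder symbols only; substitute the real MSigDB and KEGG lists.
msigdb_genes = ["GENE1", "GENE2", "GENE3", "GENE4 "]
kegg_genes = ["GENE2", "GENE4", "GENE5", "GENE6", "GENE7"]

result = compare_gene_lists(msigdb_genes, kegg_genes)
for key, genes in result.items():
    print(key, len(genes), genes)
```

Applied to the counts above, and assuming neither list contains duplicates, 57 shared symbols would leave 30 that are only in the MSigDB list and 79 that are only in the KEGG list.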
I've also tried the msigdbr and gage R packages, and the lists they give generally agree with MSigDB, whether I use gene symbols or Entrez IDs. With so many differences between the lists, the discrepancy seems unlikely to be caused by one or more of them being outdated, and in any case MSigDB was last updated a couple of months ago. It also seems unlikely that three different secondary sources would all agree with each other and all be wrong.
So the question is, which, if any, of these sources should I trust? Any suggestions would be appreciated!