How to filter out non-plant KEGG pathways in KEGG enrichment?
2.1 years ago
Ben Nestor ▴ 10

Hi all,

I'm doing KEGG enrichment for a non-model plant species. I annotated this species' genes using BlastKOALA with the taxonomy group set to Viridiplantae (ID 33090) and the KEGG GENES database set to family_eukaryotes. I then tested a specific gene set for enrichment against all annotated genes using the enrichKEGG function of clusterProfiler (based on this thread).

enriched <- enrichKEGG(gene,organism="ko",keyType = "kegg", universe = hakea_kegg, pAdjustMethod = "BH", minGSSize = 10, pvalueCutoff = 0.05, qvalueCutoff = 0.05)

Among the enriched KEGG terms I see non-plant pathways such as 'Axon regeneration', 'Tuberculosis', and 'Neurotrophin signaling pathway'. I'm not sure why, because I've filtered the genes to keep only those whose top hits in NCBI NR are plant sequences. I'm only interested in plant-related terms, so is there a way to filter these out?

enrichment clusterprofiler KEGG plant • 951 views

Hi, did you figure out this problem? I'm having the same issue with my non-model fungal genes.


Hey, sorry for the late reply, I didn't see this until now!

I found a script that collects all the KEGG pathway IDs associated with plants. I assume it would work for fungi, or whatever group you want to filter for, if you change "Plants" in the grep on the org data frame.

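# In R
library(KEGGREST)

# Table of all KEGG organisms; the phylogeny column is used to pick out plants
org <- data.frame(keggList("organism"))
plants <- org[grep("Plants", org$phylogeny), ]

pathways_tot <- character()
for (i in seq_len(nrow(plants))) {
  # try() skips an organism whose request fails instead of stopping the loop
  try({
    code <- plants[i, 2]  # column 2 holds the organism code
    pathways <- keggLink("pathway", code)
    # Strip everything up to and including the organism code, leaving the pathway number
    pathways <- sub(paste0(".*", code), "", pathways)
    pathways_tot <- unique(c(pathways_tot, pathways))
  })
}
# Prefix with "ko" to match the IDs from enrichKEGG(organism = "ko")
pathways_tot <- paste0("ko", pathways_tot)
# col.names = FALSE avoids a stray "x" header line in the output file
write.table(pathways_tot, "pathways_plant.txt", col.names = FALSE)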

# Messy formatting cleanup in bash: strip quotes and row names, keep only the ko IDs
tr -d '"' < pathways_plant.txt | sed -E 's/.*(ko[0-9]+)/\1/' > pathways_plant.ids

# Then filter the enriched KEGG term file for these
grep -Fw -f pathways_plant.ids enriched.csv > enriched_filt.csv
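
One caveat with the grep filter above: it matches an ID anywhere on a line, and it will likely drop the CSV header row, since the header contains no ko ID. Here is a minimal sketch of an alternative that keeps the header and compares one column exactly. It assumes enriched.csv is comma-separated with a single header row, that the pathway ID sits in the first column by default, and that no field before the ID column contains a comma (the split on commas is naive). The function name filter_enriched and its optional third argument are hypothetical, not part of any tool mentioned in this thread.

```shell
# filter_enriched IDS_FILE CSV_FILE [ID_COLUMN]
# Prints the CSV header plus every row whose ID column exactly matches
# one of the IDs listed (one per line) in IDS_FILE.
# Assumption: naive comma splitting, so fields before the ID column
# must not contain commas. ID_COLUMN defaults to 1.
filter_enriched() {
  awk -F',' -v ids="$1" -v col="${3:-1}" '
    BEGIN {
      # Load the allowed pathway IDs, dropping quotes and carriage returns.
      while ((getline line < ids) > 0) {
        gsub(/[\r"]/, "", line)
        if (line != "") keep[line] = 1
      }
      close(ids)
    }
    # Always pass the header row through unchanged.
    FNR == 1 { print; next }
    {
      # Compare the unquoted ID field against the allowed set.
      id = $col
      gsub(/[\r" ]/, "", id)
      if (id in keep) print
    }
  ' "$2"
}
```

With the files from this thread it would be called as: filter_enriched pathways_plant.ids enriched.csv > enriched_filt.csv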

Hi, can you tell me where this script comes from? Thanks


I'm not sure where the KEGGREST script came from, as I no longer have the original, sorry. The bash formatting and filtering lines are mine, though. I vaguely remember copying it from a KEGGREST guide, but I'm not certain.
