Find the most frequently occurring genes in a list of pathways and genes
4.7 years ago
Kim ▴ 20

Hello everyone

I have a list of pathways and the genes involved in those pathways, as follows (the real list is much longer):

[Image: table of pathways and their genes; not preserved]

I want to see which genes appear most frequently in these pathways. Do you know how to do that in R, or can you recommend any tools?

Thank you very much

pathway most frequent dominant gene • 880 views

Paste your data as text, please.

4.7 years ago
Benn 8.3k

You can use the R packages tidyr and dplyr for this.

# First import into R
table.file <- read.table("your.file.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE)

library(tidyr)

# Get your genes in separate rows
table.genes.sep <- separate_rows(table.file, Submitted.entities.found, sep = ";")

library(dplyr)

# use dplyr to count genes and sort
table.genes.count <- table.genes.sep %>% count(Submitted.entities.found, sort = TRUE)
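
To check the approach before running it on the real file, here is a minimal sketch on a toy table. The `Pathway.name` column and all gene and pathway values are made up for illustration; only `Submitted.entities.found` comes from the code above. The extra `distinct()` step is an assumption about what you want: it makes the result the number of pathways each gene appears in, even if a gene is listed twice within one pathway.

```r
library(tidyr)
library(dplyr)

# Toy stand-in for the real table; Pathway.name and all values are hypothetical
table.file <- data.frame(
  Pathway.name = c("Pathway A", "Pathway B", "Pathway C"),
  Submitted.entities.found = c("TP53;EGFR;KRAS", "EGFR;TP53", "EGFR;BRAF"),
  stringsAsFactors = FALSE
)

table.genes.count <- table.file %>%
  # One row per pathway/gene pair
  separate_rows(Submitted.entities.found, sep = ";") %>%
  # Drop a gene repeated within the same pathway, if any
  distinct(Pathway.name, Submitted.entities.found) %>%
  # Count pathways per gene, most frequent first
  count(Submitted.entities.found, sort = TRUE)

print(table.genes.count)
```

If the separator in your file is followed by a space (for example "; "), adjust `sep` accordingly, otherwise the same gene may be counted under two slightly different names.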
4.7 years ago
zx8754 11k

Something like this, which will display the 10 most frequent gene names:

head(names(sort(table(unlist(strsplit(table.file$Submitted.entities.found, split = ";"))), decreasing = TRUE)), n = 10)
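
A base-R version of the same idea can be tried on a few made-up strings first. This is only a sketch: the `entities` vector and its gene names are hypothetical stand-ins for the `Submitted.entities.found` column, and it keeps the counts alongside the names rather than the names alone.

```r
# Toy stand-in for the real column; all values are hypothetical
entities <- c("TP53;EGFR;KRAS", "EGFR;TP53", "EGFR;BRAF")

# Split each string on ";" and flatten into a single character vector
genes <- unlist(strsplit(entities, split = ";", fixed = TRUE))

# Tabulate, then sort from most to least frequent
gene.counts <- sort(table(genes), decreasing = TRUE)

# Show the top 3 genes together with their counts
print(head(gene.counts, n = 3))
```

Wrap the last call in `names()` if only the gene names are needed.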