Gene Set Enrichment Analysis on Bioconductor?
7.4 years ago
eurioste ▴ 20

I'm just a beginner with gene set enrichment analysis, so if I'm making any mistakes, please let me know.

I'm working with a model for predicting the functional effect of non-coding variants. I simulated all possible SNVs genome-wide. Then I selected the variants that the model scored above a certain threshold and checked them for enrichment of several epigenetic marks, such as histone marks, across several cell lines. My question is:

Are the genes with predicted functional variants intersecting different epigenetic marks the same? Are they the same for a mark among the different cell lines? I'm interested in knowing if the gene sets are similar, if they belong to similar GOs or not, and which ones.

I wish to compare this using the genes that contain the predicted functional variants as my background. Given a certain mark in a certain cell, they may or may not intersect with the mark. They may intersect with the same marks or intersect with different marks.

I'm starting from several lists of Ensembl gene IDs. There is no expression value associated with these, just the presence of the gene in the set.

I tried the CompGO R package. The package works well, but unfortunately it relies on DAVID functional annotations. I found the DAVID web service to be very unreliable and limited, because it only accepts up to 3000 genes per list. I no longer wish to use it.

Does anybody have a suggestion for an easy-to-use R package, as well as the best method for answering my question?

gene R enrichment jaccard • 2.0k views

Adding 10 question marks to your post will not result in a quicker answer ;-)


I just have a small question: how big is your gene set? 3000 unique genes is a huge set to analyze. The number of functional clusters runs into the hundreds, and it is difficult to infer meaning from them.


I am interested in comparing the gene sets of different epigenetic marks to see if they are similar to one another; in other words, whether the genes that have mark A are the same ones that have mark B. Not all of my gene sets are that big, but some marks, like H3K36me3, are quite common.
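
To make that comparison concrete, here is a minimal sketch of a pairwise Jaccard index (shared genes divided by all genes in either set) between gene lists. It is written in Python purely as an illustration; the set names and gene IDs are hypothetical placeholders, not real data, and the same logic could be written in R with `intersect()` and `union()`.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(gene_sets):
    """Return {(name_x, name_y): jaccard} for every unordered pair of sets."""
    return {
        (x, y): jaccard(gene_sets[x], gene_sets[y])
        for x, y in combinations(sorted(gene_sets), 2)
    }


# Hypothetical toy input: one gene set per (mark, cell line) combination.
# The IDs are placeholders standing in for Ensembl gene IDs.
gene_sets = {
    "H3K36me3_cellA": {"ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"},
    "H3K27ac_cellA": {"ENSG_B", "ENSG_C", "ENSG_E"},
    "H3K36me3_cellB": {"ENSG_A", "ENSG_B", "ENSG_C"},
}

scores = pairwise_jaccard(gene_sets)
for (x, y), score in sorted(scores.items()):
    print(f"{x}\t{y}\t{score:.2f}")
```

A value near 1 means two marks hit almost the same genes and a value near 0 means they hardly overlap. This only measures overlap of the gene lists themselves; it says nothing about whether the sets share GO terms, which still needs an enrichment tool.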

7.4 years ago
Guangchuang Yu ★ 2.6k

clusterProfiler is a Bioconductor package that is user-friendly and well documented.

7.4 years ago
EagleEye 7.5k

Try GeneSCF (not a Bioconductor package). It accepts any number of genes (though I have to admit that it gets slower as the number of genes grows) and does not depend on DAVID annotation. Because it works in real time, it is more reliable than most tools.
