Question

Enrichment Analysis in R

2

5.7 years ago

Bayram Sarilmaz ▴ 50

I have two gene sets, A and B. I would like to check which genes in B are enriched in A. As a result of the enrichment analysis, I would like to have a p-value for each gene in B.

Here is a reproducible example; the analysis is performed on gene IDs.

library(GeneOverlap)

# newGeneOverlap() expects plain vectors of gene IDs, not data frames
A = c(100,200,300,400,500,600,700,100,800,900,1000,100,500,100) #Gene IDs in set A
B = c(200,4,900,100,6) #Gene IDs in set B

#check if B geneIDs are enriched in set A, and generate a p-value for the enrichment of each gene
#genome.size is left at the package default here; set it to the size of your gene universe
go.obj <- newGeneOverlap(B,A)
go.obj
go.obj <- testGeneOverlap(go.obj)
print(go.obj)

In the above example I tried the GeneOverlap package, but it didn't give me p-values for every gene in B. Any suggestions for other methods to achieve what I'm aiming for?

r enrichment genetics • 3.2k views

ADD COMMENT • link 5.7 years ago by Bayram Sarilmaz ▴ 50

2

You cannot generate per-gene p-values for enrichment, only a set-level enrichment, i.e. does set B overlap with set A more than expected by chance? Maybe you could elaborate on your question/goal? To me, "I would like to check which genes in B are enriched in A" means "which genes in B are also in A".
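A set-level test of this kind can be sketched in base R with Fisher's exact test. This is only an illustration: `universe_size` is a hypothetical background of 25,000 genes and is not a value from this thread, so the resulting p-value depends entirely on that assumption.

```r
# Gene IDs from the question; duplicates are dropped because only set membership matters
A <- unique(c(100,200,300,400,500,600,700,100,800,900,1000,100,500,100))
B <- unique(c(200,4,900,100,6))

# ASSUMPTION: size of the background gene universe (hypothetical, not from the thread)
universe_size <- 25000

k <- length(intersect(A, B))  # number of genes shared by A and B

# 2x2 contingency table: rows = in B / not in B, columns = in A / not in A
tab <- matrix(c(k,
                length(A) - k,
                length(B) - k,
                universe_size - length(union(A, B))),
              nrow = 2)

# One-sided test: is the overlap larger than expected by chance?
ft <- fisher.test(tab, alternative = "greater")

# The same tail probability written as a hypergeometric test
p_hyper <- phyper(k - 1, length(A), universe_size - length(A), length(B),
                  lower.tail = FALSE)

print(ft$p.value)
print(p_hyper)
```

Both calls give a single p-value for the pair of sets, not one per gene, which is the point made above.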

ADD REPLY • link 5.7 years ago by ejm32 ▴ 450

0

Thanks for explaining this! Then if the p-value from the above GeneOverlap test equals 0, does it mean there is no significant overlap between A and B?

ADD REPLY • link 5.7 years ago by Bayram Sarilmaz ▴ 50

1

You should question P values that are equal to zero, particularly when you're dealing with just 5 elements in your B object and when a visual inspection reveals that only three-fifths of B form a subset of A. Why not just report that, i.e., that 60% of B overlaps with A? Why do you need a P value when human eyes are sufficient? Presumably your actual gene lists are much larger?
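As a minimal sketch of this descriptive approach, base R can list the shared genes and the overlap fraction directly. The variable names `shared` and `frac_overlap` are chosen here for illustration only.

```r
# Gene IDs from the question
A <- c(100,200,300,400,500,600,700,100,800,900,1000,100,500,100)
B <- c(200,4,900,100,6)

# Genes in B that also occur in A
shared <- B[B %in% A]

# Fraction of B that overlaps with A
frac_overlap <- mean(B %in% A)

print(shared)        # 200 900 100
print(frac_overlap)  # 0.6
```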

ADD REPLY • link 5.7 years ago by Kevin Blighe 87k

0

See @Kevin's comment below about the meaningfulness of your p-value. But to answer your direct question: no, a p-value of 0 would mean that the overlap is statistically significant, i.e. there is a 0% chance that you would find this degree of overlap or greater if there were no association between the two lists.
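One reason to be cautious about a reported p-value of exactly 0 is floating-point underflow: for large lists, the hypergeometric tail probability can be smaller than the smallest representable double. The sketch below uses made-up list sizes (hypothetical, not from this thread) to show the effect with base R's `phyper`.

```r
# ASSUMPTION: hypothetical sizes chosen only to illustrate underflow
universe_size <- 25000  # background genes
size_A <- 1000          # genes in A
size_B <- 500           # genes in B
k <- 500                # every gene in B is also in A

# Upper-tail probability P(overlap >= k) on the ordinary scale
p <- phyper(k - 1, size_A, universe_size - size_A, size_B,
            lower.tail = FALSE)

# The same probability on the natural-log scale
log_p <- phyper(k - 1, size_A, universe_size - size_A, size_B,
                lower.tail = FALSE, log.p = TRUE)

print(p)      # expected to underflow to 0 in double precision
print(log_p)  # a finite, very negative number
```

If this is what is happening, the true p-value is extremely small rather than literally zero, and reporting it as below some threshold is a safer way to state it.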

ADD REPLY • link 5.7 years ago by ejm32 ▴ 450