Question: Gene Set Size - When Is It Too Small?
PoGibas wrote, 5.4 years ago:

I have a small subset of genes that share a specific characteristic (e.g., a TFBS in their UTRs). I checked enrichment across the whole set using a permutation test (p value = 0). However, only a small subset of genes have this TFBS, and I don't know whether it is worth analyzing these genes further (e.g., expression, conservation), as the set is very small.


Example

Total number of genes in set = 20000
Number of genes with TFBS = 8
Permutation test p value = 0 (i.e., the whole set of 20000 genes is enriched for this TFBS compared to a genomic background)


Questions

How can I determine whether the set size is statistically valid (8 genes out of 20000)? Is there a test for this in R?
Is it worth analyzing such a small set of genes and trying to show how interesting and important their biology is?

Charles Warden wrote, 5.4 years ago:

Instead of a permutation test, a Fisher exact test or hypergeometric test is more commonly used to calculate gene set enrichment.
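For illustration, the hypergeometric test mentioned above can be sketched in a few lines of plain Python. This is only a minimal sketch: the function name `hypergeom_upper_tail` and all of the counts are hypothetical (the thread does not give a background size or a query list), and the upper tail computed here corresponds to a one-sided Fisher exact test on the matching 2x2 table.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from a
    universe of N genes, K of which carry the feature (e.g., the TFBS)."""
    total = comb(N, n)
    upper = min(K, n)  # the overlap cannot exceed either set
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

# Hypothetical numbers, NOT taken from the thread:
N = 20000  # genes in the background universe
K = 8      # background genes carrying the TFBS
n = 500    # genes in the query list (e.g., differentially expressed)
k = 3      # query genes that carry the TFBS

p_value = hypergeom_upper_tail(k, N, K, n)
print(f"P(overlap >= {k}) = {p_value:.3g}")
```

In R, the same tail should be available as `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`, and `fisher.test` on the 2x2 table with `alternative = "greater"` should agree with it.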

When doing something like GO enrichment (which should follow a similar principle), I don't set a hard cutoff for the number of genes in the original gene set (in GO), but I typically like to see highly significant values (such as p < 1e-5), which should usually involve multiple enriched genes within the differentially expressed gene list (similar to your 20000 gene list, I assume). However, I return the entire list of results with p < 0.05. Sometimes biologists like to know if a single gene is affected (if that single gene is known to be really important).
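The reporting rule described above could be sketched roughly as follows. Everything here is an assumption for illustration: the `EnrichmentResult` type, the `triage` function, the example terms and the "at least 2 genes" reading of "multiple enriched genes" are hypothetical, and only the two p-value thresholds come from the paragraph above.

```python
from typing import NamedTuple

class EnrichmentResult(NamedTuple):
    term: str       # gene set or TF name (hypothetical examples below)
    p_value: float  # enrichment p value for that term
    n_hits: int     # number of query genes falling in the term

def triage(results, report_p=0.05, strong_p=1e-5, min_hits=2):
    """Return (everything below report_p sorted by p value,
    the subset that is highly significant AND backed by several genes)."""
    reported = sorted((r for r in results if r.p_value < report_p),
                      key=lambda r: r.p_value)
    strong = [r for r in reported
              if r.p_value < strong_p and r.n_hits >= min_hits]
    return reported, strong

# Made-up results, for illustration only
results = [
    EnrichmentResult("TF_A", 3e-7, 6),
    EnrichmentResult("TF_B", 2e-6, 1),   # very significant, but a single gene
    EnrichmentResult("TF_C", 0.01, 4),
    EnrichmentResult("TF_D", 0.2, 9),
]

reported, strong = triage(results)
print([r.term for r in reported])
print([r.term for r in strong])
```

Single-gene hits such as the hypothetical TF_B stay in the reported list, so a biologist who cares about that one gene can still find it.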

BTW, you can try using the TRANSFAC enrichment tool in GATHER if you have a list of official gene symbols:

http://gather.genome.duke.edu/

I personally like the upstream regulator function in IPA (based upon literature annotations rather than predicted motifs), but that is commercial software.
