gene set enrichment / GO
7.1 years ago
Zhenyu Zhang ▴ 370

I have a list of 1000 genes (Entrez IDs) that is a subset of about 8000 genes expressed in a tumor tissue, and I want to test it for GO enrichment, TFBS enrichment and so on.

I know there are a variety of online tools available. However, most of them take only the 1000-gene subset and return a p-value or FDR, which is very biased, because my whole gene set (8000 genes) is obviously enriched for certain terms (for example, tumor-related ones) by itself. Could anyone suggest an online tool or R package that takes two gene lists (one for the genes of interest, one for all expressed genes), queries its database, and returns p-values and FDR? Thanks.

GSEA Gene ontology TFBS • 2.6k views
7.1 years ago

You want to set your background to the set of 8000 genes, so that your Fisher exact test, hypergeometric test, etc. compares the proportion of annotated genes in your list against the proportion among the 8000 genes, not among all possible genes. You do not want to run a separate enrichment test on each of the two gene lists. This is important for the reasons that you described above.
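
To make the role of the background concrete, here is a minimal sketch of the underlying calculation, not what any particular tool does internally. It computes a one-sided hypergeometric p-value for one term against a user-supplied background, then applies a Benjamini-Hochberg adjustment across terms. The function names and the toy gene IDs are made up for illustration; in practice the term-to-gene mapping would come from a GO or TFBS annotation source.

```python
from math import comb


def enrichment_pvalue(interest, background, term_genes):
    """One-sided hypergeometric p-value for over-representation of a term.

    Both the gene list of interest and the term's gene set are restricted
    to the background first, so the test only "sees" the expressed genes.
    """
    background = set(background)
    interest = set(interest) & background
    annotated = set(term_genes) & background
    N = len(background)            # background size (e.g. 8000)
    K = len(annotated)             # background genes carrying the term
    n = len(interest)              # genes of interest (e.g. 1000)
    k = len(interest & annotated)  # genes of interest carrying the term
    # P(X >= k) when drawing n genes without replacement from the background
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (hypothetical IDs): 20 background genes, 5 genes of interest
background = range(1, 21)
interest = [1, 2, 3, 4, 5]
terms = {
    "term_A": [1, 2, 3, 4, 30],                # gene 30 is outside the background
    "term_B": [1, 6, 7, 8, 9, 10, 11, 12, 13],
}
pvals = [enrichment_pvalue(interest, background, g) for g in terms.values()]
fdr = benjamini_hochberg(pvals)
for name, p, q in zip(terms, pvals, fdr):
    print(name, p, q)
```

The point of the restriction step is that a term annotated to many genes that are not expressed in the tissue contributes nothing to either side of the comparison.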

As one example, I believe your background is defined as the optional "gene space" parameter in FuncAssociate:

http://llama.mshri.on.ca/funcassociate/

7.1 years ago

There's a Bioconductor view for GO that lists a number of popular tools. Also, you can perform a standard differential expression analysis and then use the DE genes as the "enriched" set and the remainder as the background.

7.1 years ago
Neilfws 49k

I've recently used the R/Bioconductor package gage. It does enrichment using either KEGG or GO terms. Very intuitive, easy to use and integrates nicely with other packages, such as pathview.

7.1 years ago

You can use the DAVID web server (http://david.abcc.ncifcrf.gov/) for functional analysis; it lets you set a background gene set. But, in my experience, when doing functional analysis on lists of more than 200 genes you're always going to get some false-positive hits. Consider filtering your list further.
