GO analysis and the universe set of genes
6.2 years ago
tiago211287 ★ 1.3k

I have started analysing GO enrichment for my gene list with the R package clusterProfiler, which gets its data from DAVID. This raised a question: which set of genes should I use as the universe when testing for enrichment?

All annotated genes?

Only annotated genes expressed in the tissue I am studying? (Mine is mouse heart.)

Only genes present in my GTF file?

Thank you.

GO R • 3.3k views

Enrichment analysis compares two lists, and what those two lists should be depends on the question you are asking. Which question are you interested in?
1 - Does my list contain more genes of type X than expected among the genes expressed in the mouse heart?
2 - Does my list contain more genes of type X than expected among the genes in my GTF file?
Depending on the context (i.e. what the list and the GTF file represent), both are valid questions.
Typically, the background gene list consists of all genes tested in the experiment.
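
To make the effect of the background concrete, here is a minimal sketch of a one-sided hypergeometric enrichment test in plain Python. It is not how clusterProfiler is implemented (there, the background is supplied through the `universe` argument of `enrichGO`); the function name, gene identifiers and set sizes below are made up for illustration.

```python
from math import comb

def enrichment_pvalue(universe, gene_list, term_genes):
    """One-sided hypergeometric p-value P(X >= k) for the observed overlap k."""
    universe = set(universe)
    # Genes outside the universe cannot be counted in either list.
    gene_list = set(gene_list) & universe
    term_genes = set(term_genes) & universe
    N = len(universe)                 # background size
    K = len(term_genes)               # background genes annotated to the term
    n = len(gene_list)                # size of the list of interest
    k = len(gene_list & term_genes)   # annotated genes in the list
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical toy data: 1000 annotated genes, 200 of them expressed in the tissue.
all_annotated = {f"g{i}" for i in range(1000)}
tissue_expressed = {f"g{i}" for i in range(200)}
term_genes = {f"g{i}" for i in range(40)}          # a term whose genes are all expressed
my_list = {f"g{i}" for i in range(8)} | {f"g{i}" for i in range(40, 52)}

p_all = enrichment_pvalue(all_annotated, my_list, term_genes)
p_expressed = enrichment_pvalue(tissue_expressed, my_list, term_genes)
print(p_all, p_expressed)
```

With these toy sets, the same list and the same term give a much smaller p-value against the full annotation than against the expressed-gene background, which is why the choice of universe has to match the question.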

6.2 years ago
Martombo ★ 2.9k
Filtering for genes that are expressed in your samples is a good idea. For example, if you have a neuroblastoma cell line and randomly pick some of the genes expressed in it, you will probably find enrichment for neurological functions when testing against all annotated genes. Using only expressed genes as the background should correct for this bias.
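
A minimal sketch of how such a background could be built, assuming you have a normalised expression value (e.g. CPM or TPM) per gene and sample. The function name, the cutoff of 1.0 and the requirement of two samples are arbitrary placeholders, not recommendations, and the expression values are invented.

```python
def expressed_genes(expression, min_value=1.0, min_samples=2):
    """Return genes whose expression reaches min_value in at least min_samples samples."""
    return {
        gene
        for gene, values in expression.items()
        if sum(v >= min_value for v in values) >= min_samples
    }

# Hypothetical normalised expression values, three samples per gene.
expression = {
    "Myh6": [250.0, 310.0, 275.0],
    "Gata4": [4.0, 6.0, 0.5],
    "Neurod1": [0.0, 0.4, 1.2],
    "Olfr1": [0.0, 0.0, 0.0],
}

background = expressed_genes(expression)
print(sorted(background))
```

The resulting set is what you would pass as the universe (after converting to the identifier type your enrichment tool expects).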
Of course, the question of which threshold to use for calling a gene expressed still remains.
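
One way to get a feel for this is to check how much the background changes across a few candidate cutoffs before settling on one. The sketch below uses mean expression per gene as the criterion, which is only one of several possible choices; the function name, thresholds and values are hypothetical.

```python
def background_sizes(expression, thresholds):
    """Map each threshold to the number of genes whose mean expression reaches it."""
    means = {gene: sum(values) / len(values) for gene, values in expression.items()}
    return {t: sum(m >= t for m in means.values()) for t in thresholds}

# Hypothetical normalised expression values, two samples per gene.
toy = {
    "Myh6": [250.0, 310.0],
    "Actc1": [90.0, 120.0],
    "Gata4": [4.0, 6.0],
    "Neurod1": [0.0, 0.4],
    "Olfr1": [0.0, 0.0],
}

sizes = background_sizes(toy, [0.1, 1.0, 10.0])
print(sizes)
```

If the background size (and the enrichment results) barely move across reasonable cutoffs, the exact threshold matters less; if they change a lot, it is worth reporting the cutoff explicitly.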