Question: What to choose as background genes in GO enrichment analysis
4 days ago, tianshenbio wrote:

I tried both clusterProfiler and goseq for GO enrichment analysis. However, I got 4-5 times more enriched GO terms with goseq than with clusterProfiler, since the p-values calculated by goseq are generally smaller. This is because I used all genes in the genome as background genes, so the background ratio is small and the p-values are smaller as well. Which genes should I use as background genes? Here are some of my options:

  1. All genes in the genome
  2. All genes with GO terms in the genome
  3. Genes detected in my samples
  4. Genes detected in my samples with GO terms
modified 4 days ago by Papyrus • written 4 days ago by tianshenbio
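
To make the effect described in the question concrete, here is a minimal Python sketch of a one-sided hypergeometric (over-representation) test, evaluated against two different backgrounds. All numbers are hypothetical and chosen only for illustration. This is not how clusterProfiler or goseq are implemented internally (goseq, for instance, can additionally correct for gene-length bias); it only shows how the size of the background changes the p-value.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k term genes among n DE genes,
    drawn without replacement from N background genes, K of which carry the term."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical numbers, for illustration only.
de_genes = 200     # size of the DE gene list
de_in_term = 15    # DE genes annotated to the GO term

# Background = every gene in the genome (assumed 30000 genes, 300 with the term).
p_genome = hypergeom_pvalue(de_in_term, de_genes, 300, 30000)

# Background = genes actually tested (assumed 12000 genes, 250 with the term).
p_tested = hypergeom_pvalue(de_in_term, de_genes, 250, 12000)

print(f"whole-genome background: p = {p_genome:.3g}")
print(f"tested-genes background: p = {p_tested:.3g}")
```

With these assumed counts, the term makes up a smaller fraction of the whole-genome background than of the tested-genes background, so the same overlap looks more surprising and the p-value is smaller.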

I personally use the genes that were actually used in the DE analysis; in my case, that is all genes that survive the filterByExpr step of edgeR.

written 4 days ago by ATpoint
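
The idea of using the filtered gene set as the background can be sketched in Python as follows. The filter below is a deliberately simplified stand-in (keep a gene if its counts-per-million reach a threshold in a minimum number of samples); it is not the actual rule applied by edgeR's filterByExpr, and the function name, thresholds, gene names, and counts are all hypothetical.

```python
def expressed_genes(counts, min_cpm=1.0, min_samples=2):
    """Return the set of genes whose CPM is >= min_cpm in at least min_samples samples.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    Simplified stand-in for an expression filter, for illustration only.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts per sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    kept = set()
    for gene, row in counts.items():
        n_pass = sum(
            1 for j, c in enumerate(row)
            if lib_sizes[j] > 0 and c / lib_sizes[j] * 1e6 >= min_cpm
        )
        if n_pass >= min_samples:
            kept.add(gene)
    return kept

# Toy count matrix (hypothetical genes and counts).
counts = {
    "geneA": [500, 600, 550],
    "geneB": [0, 1, 0],
    "geneC": [300, 200, 250],
    "geneD": [0, 0, 0],
}

# The background (universe) is whatever went into the DE test.
universe = expressed_genes(counts)

# The DE list can only contain genes from that universe.
de_genes = {"geneA"}
assert de_genes <= universe
print(sorted(universe))
```

Whatever filter is used in practice, the point is the same: the gene set passed to the DE test is also the set passed as the background to the enrichment tool.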

I would use 3 or 4 depending on your input list: if it contains only genes with GO terms, then 4; if not (which is the correct way, IMHO), then 3.

written 4 days ago by Asaf
4 days ago, Papyrus wrote:

I would use all of the genes that were analyzed in your experiment, i.e. those on which you performed the differential expression analysis (perhaps after filtering out low-expression genes, etc.), because (under the assumption of independence) those are the ones that had a chance of appearing as DEGs.

Regarding filtering out genes that do not map to GO terms, you can control this at the GO enrichment step: the goseq function has a use_genes_without_cat argument for this, and by default (in the most recent version) these genes are ignored in the enrichment testing.

modified 4 days ago • written 4 days ago by Papyrus
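
The difference between keeping and dropping genes without GO annotation (options 3 and 4 in the question) can be illustrated by the four counts that go into an over-representation test. This is a Python sketch with toy data: the function name, the drop_unannotated flag, and the gene and term labels are hypothetical and only mimic the idea behind goseq's use_genes_without_cat; they are not its implementation.

```python
def enrichment_counts(de_genes, universe, gene2terms, term, drop_unannotated=True):
    """Return (k, n, K, N) for one term:
    k = DE genes with the term, n = DE genes in the background,
    K = background genes with the term, N = background size."""
    universe = set(universe)
    if drop_unannotated:
        # Keep only genes that have at least one annotated term.
        universe = {g for g in universe if gene2terms.get(g)}
    de = set(de_genes) & universe
    in_term = {g for g in universe if term in gene2terms.get(g, ())}
    return len(de & in_term), len(de), len(in_term), len(universe)

# Toy data (hypothetical gene and term labels).
universe = {f"g{i}" for i in range(1, 9)}
gene2terms = {
    "g1": {"term_A"},
    "g2": {"term_A"},
    "g3": {"term_B"},
    "g4": {"term_A", "term_B"},
    "g5": set(),  # listed but has no annotation
}
de_genes = {"g1", "g2", "g5", "g6"}

with_drop = enrichment_counts(de_genes, universe, gene2terms, "term_A", drop_unannotated=True)
without_drop = enrichment_counts(de_genes, universe, gene2terms, "term_A", drop_unannotated=False)
print("unannotated genes dropped:", with_drop)
print("unannotated genes kept:   ", without_drop)
```

Dropping unannotated genes shrinks both the DE list and the background, while the number of genes carrying the term stays the same, so the two settings can give different p-values for the same data.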

Thank you for your answer! Very helpful

written 4 days ago by tianshenbio